Reversing 6502 - Part 1: First Steps in Disassembly

After spending time building a 6502 emulator, I found myself wanting to go the other direction: instead of writing code and watching it execute, I wanted to take an existing C64 program and work backwards to understand what it does. Reverse engineering felt like the natural next step, and the 6502 turns out to be a fantastic target for learning disassembly. The instruction set is small, the addressing modes are well-documented, and there's no pipelining or speculative execution to worry about. What you see in memory is what the CPU executes.

In this two-part series, I'll walk through the process of disassembling a Commodore 64 program from a raw binary into readable, annotated assembly. Part 1 covers the foundations: the C64 memory map, how programs are loaded, and the mechanics of linear disassembly. In Part 2, we'll tackle the harder problems: distinguishing code from data, identifying subroutines, and using tools to automate the tedious parts.

Why the C64?

The Commodore 64 is an ideal platform for learning reverse engineering. The entire system is well-documented: the memory map, the I/O chips (VIC-II for graphics, SID for sound, CIA for timers and I/O), and the KERNAL/BASIC ROMs have all been thoroughly mapped out by decades of community effort. There are no operating system abstractions to deal with, no dynamic linking, no virtual memory. A C64 program talks directly to hardware through fixed memory addresses. When I started looking into this, I was struck by how much easier it is to reason about a system where everything has a known, fixed address.

The C64 Memory Map

Before disassembling anything, you need to understand what lives where. The C64's 64 KiB address space is shared between RAM, ROM, and I/O registers, with a banking scheme that lets the CPU switch between them.

The default layout when a program is running:

Address Range	Contents
0x0000–0x00FF	Zero page: fast-access variables, pointers
0x0100–0x01FF	Hardware stack
0x0200–0x03FF	OS workspace (input buffer, screen line pointers)
0x0400–0x07FF	Default screen memory (40×25 characters)
0x0800–0x9FFF	BASIC program area / free RAM
0xA000–0xBFFF	BASIC ROM (banked, RAM underneath)
0xC000–0xCFFF	Free RAM
0xD000–0xD3FF	VIC-II graphics chip registers
0xD400–0xD7FF	SID sound chip registers
0xD800–0xDBFF	Colour RAM
0xDC00–0xDCFF	CIA1: keyboard, joystick, timer
0xDD00–0xDDFF	CIA2: serial bus, NMI, timer
0xE000–0xFFFF	KERNAL ROM (banked, RAM underneath)

Here's what clicked for me when I started studying this: the memory map is the API. There's no system call interface. If a program wants to change the border colour, it writes to 0xD020. If it wants to play a sound, it writes to the SID registers at 0xD400–0xD418. Recognising these addresses in disassembled code immediately tells you what the program is doing, without needing any documentation about the program itself.

The banking scheme is controlled by the processor port at 0x0001. By writing different values, the CPU can swap ROM out and access the RAM underneath. Machine language programs commonly bank out BASIC ROM (and sometimes KERNAL ROM) to reclaim that address space for code or data. If you see writes to 0x0001 early in a program, that's what's happening.
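As a rough illustration of what a write to 0x0001 means, here is a minimal Python sketch that decodes the low three bits of the processor port. The function name is made up for this example, it assumes no cartridge is attached, and it also reports the character ROM that can appear at 0xD000 (not shown in the table above).

```python
def decode_processor_port(value):
    """Describe what the CPU sees after writing `value` to 0x0001.

    Only the low three bits (LORAM, HIRAM, CHAREN) are examined, and
    no cartridge is assumed to be attached.
    """
    loram = bool(value & 0x01)
    hiram = bool(value & 0x02)
    charen = bool(value & 0x04)

    if not (loram or hiram):
        d000 = "RAM"            # both ROM bits clear: RAM everywhere
    elif charen:
        d000 = "I/O"            # VIC-II, SID, colour RAM, CIAs
    else:
        d000 = "character ROM"

    return {
        "A000-BFFF": "BASIC ROM" if (loram and hiram) else "RAM",
        "D000-DFFF": d000,
        "E000-FFFF": "KERNAL ROM" if hiram else "RAM",
    }


for value in (0x37, 0x36, 0x35, 0x34):
    print(f"{value:02X}", decode_processor_port(value))
```

In this model 0x37 is the default layout, 0x36 banks out BASIC only, 0x35 banks out both ROMs while keeping I/O visible, and 0x34 exposes RAM across the whole range.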

PRG File Format

Most C64 programs you'll encounter are in PRG format. It's beautifully simple: the first two bytes are the load address (little-endian), and everything after that is raw data to be loaded at that address.

Offset  Content
------  -------
00-01  Load address (low byte, high byte)
02-...  Program data (loaded starting at the load address)

A PRG file starting with 01 08 loads at 0x0801, which is the start of the BASIC program area. This is extremely common: most programs include a short BASIC stub that auto-runs the machine language portion with a SYS command.
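Splitting a PRG into its load address and payload takes only a few lines. This is a sketch, and `split_prg` is a name invented here for illustration:

```python
def split_prg(prg):
    """Return (load_address, payload) for the raw bytes of a PRG file."""
    if len(prg) < 2:
        raise ValueError("a PRG file needs at least a two-byte load address")
    load_address = int.from_bytes(prg[:2], "little")  # low byte first
    return load_address, prg[2:]


# The first four bytes of a typical PRG: load address, then a BASIC line pointer.
load_address, payload = split_prg(bytes([0x01, 0x08, 0x0B, 0x08]))
print(f"load address: {load_address:04X}, payload bytes: {len(payload)}")
```

For a real file you would pass in the result of reading the file in binary mode.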

A typical BASIC stub looks like this in memory at 0x0801:

0801: 0B 08      ; pointer to next BASIC line (080B)
0803: 0A 00      ; line number 10
0805: 9E         ; BASIC token for SYS
0806: 32 30 36 31 ; ASCII "2061"
080A: 00         ; end of line
080B: 00 00      ; end of program (null pointer)
080D: ...        ; machine language starts here

The BASIC line 10 SYS 2061 tells the BASIC interpreter to jump to address 2061 (0x080D), where the actual machine code begins. When you're disassembling a PRG file, the first thing to do is identify this entry point. The SYS address is your starting point for disassembly.
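Finding the entry point from a stub like the one above can be sketched as follows. This handles only the simplest form (a first line consisting of SYS followed by decimal digits). Stubs that wrap the address in an expression or put other statements first need more work, and `find_sys_target` is a hypothetical helper name.

```python
SYS_TOKEN = 0x9E  # BASIC token for SYS


def find_sys_target(payload):
    """Return the SYS address from the first BASIC line, or None.

    `payload` is the PRG content after the two-byte load address, so
    bytes 0-1 are the next-line pointer and bytes 2-3 the line number.
    """
    if len(payload) < 5 or payload[4] != SYS_TOKEN:
        return None
    i = 5
    while i < len(payload) and payload[i] == 0x20:  # skip optional spaces
        i += 1
    digits = ""
    while i < len(payload) and 0x30 <= payload[i] <= 0x39:  # ASCII '0'-'9'
        digits += chr(payload[i])
        i += 1
    return int(digits) if digits else None


STUB = bytes.fromhex("0B 08 0A 00 9E 32 30 36 31 00 00 00")
entry = find_sys_target(STUB)
print(f"entry point: {entry} = {entry:04X}")
```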

Not all PRGs load at 0x0801. Cartridge images, games that load below the BASIC area, and multi-part loaders can have different load addresses. Always check the first two bytes.

Linear Disassembly

The simplest approach to disassembly is linear: start at the entry point and decode each instruction sequentially, advancing by the instruction's byte length.

The 6502 makes this straightforward because every opcode maps to a fixed instruction length:

Addressing Mode	Bytes	Example
Implied / Accumulator	1	`RTS`, `INX`, `ASL A`
Immediate	2	`LDA #42`
Zero Page	2	`LDA 10`
Zero Page,X / Zero Page,Y	2	`LDA 10,X`
Absolute	3	`LDA 1234`
Absolute,X / Absolute,Y	3	`LDA 1234,X`
(Indirect,X)	2	`LDA (10,X)`
(Indirect),Y	2	`LDA (10),Y`
Relative (branches)	2	`BEQ 05`
Indirect (JMP only)	3	`JMP (FFFC)`
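The table translates directly into a lookup. The sketch below uses mode names chosen for this example and lists only the opcodes behind the examples in the table. A real disassembler needs an entry for every documented opcode.

```python
# Instruction length in bytes for each addressing mode (from the table above).
MODE_LENGTH = {
    "implied": 1, "accumulator": 1,
    "immediate": 2, "zero_page": 2, "zero_page_x": 2, "zero_page_y": 2,
    "indirect_x": 2, "indirect_y": 2, "relative": 2,
    "absolute": 3, "absolute_x": 3, "absolute_y": 3, "indirect": 3,
}

# Partial opcode table: opcode byte -> (mnemonic, addressing mode).
OPCODES = {
    0x60: ("RTS", "implied"),
    0xE8: ("INX", "implied"),
    0x0A: ("ASL", "accumulator"),
    0xA9: ("LDA", "immediate"),
    0xA5: ("LDA", "zero_page"),
    0xB5: ("LDA", "zero_page_x"),
    0xAD: ("LDA", "absolute"),
    0xBD: ("LDA", "absolute_x"),
    0xA1: ("LDA", "indirect_x"),
    0xB1: ("LDA", "indirect_y"),
    0xF0: ("BEQ", "relative"),
    0x6C: ("JMP", "indirect"),
}


def instruction_length(opcode):
    """Return how many bytes the instruction starting with `opcode` occupies."""
    _mnemonic, mode = OPCODES[opcode]
    return MODE_LENGTH[mode]


print(instruction_length(0xA9), instruction_length(0xAD), instruction_length(0x60))
```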

Given an opcode byte, you look up the instruction and addressing mode, read the appropriate number of operand bytes, format the output, and advance. Here's what a linear pass over a simple program looks like:

080D: A9 00       LDA #00        ; load 0 into accumulator
080F: 8D 20 D0    STA D020       ; store to border colour register
0812: 8D 21 D0    STA D021       ; store to background colour register
0815: A2 00       LDX #00        ; X = 0
0817: BD 28 08    LDA 0828,X     ; load byte from message table
081A: F0 06       BEQ 0822       ; if zero (end of string), branch forward
081C: 9D 00 04    STA 0400,X     ; store to screen memory
081F: E8          INX             ; X++
0820: D0 F5       BNE 0817       ; loop back (branch if X != 0)
0822: 4C 22 08    JMP 0822       ; infinite loop (halt)
0825: 00          BRK

Even from this raw disassembly, you can read the program's intent: it sets the border and background to black, then copies a null-terminated string from 0x0828 to screen memory at 0x0400. The JMP 0822 to itself is a common C64 idiom for halting. There's no HALT instruction on the 6502, so an infinite jump-to-self serves the same purpose.
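The listing above can be reproduced with a very small linear disassembler. This is a sketch under stated assumptions: the opcode table covers only the ten opcodes in this program, the mode abbreviations are my own, and unknown bytes are simply emitted as `???` one at a time. It also resolves relative branches, where the target is the address of the next instruction plus the signed offset byte.

```python
# Opcode subset: just enough to decode the listing above.
OPCODES = {
    0x00: ("BRK", "imp"), 0xE8: ("INX", "imp"),
    0xA9: ("LDA", "imm"), 0xA2: ("LDX", "imm"),
    0x8D: ("STA", "abs"), 0x4C: ("JMP", "abs"),
    0xBD: ("LDA", "abx"), 0x9D: ("STA", "abx"),
    0xF0: ("BEQ", "rel"), 0xD0: ("BNE", "rel"),
}
LENGTH = {"imp": 1, "imm": 2, "rel": 2, "abs": 3, "abx": 3}


def disassemble(code, origin):
    """Decode `code` sequentially, assuming its first byte lives at `origin`."""
    lines, pc = [], 0
    while pc < len(code):
        address = origin + pc
        opcode = code[pc]
        if opcode not in OPCODES or pc + LENGTH[OPCODES[opcode][1]] > len(code):
            lines.append(f"{address:04X}: {opcode:02X}        ???")
            pc += 1
            continue
        mnemonic, mode = OPCODES[opcode]
        raw = code[pc:pc + LENGTH[mode]]
        if mode == "imp":
            operand = ""
        elif mode == "imm":
            operand = f"#{raw[1]:02X}"
        elif mode == "rel":
            offset = raw[1] - 0x100 if raw[1] >= 0x80 else raw[1]  # sign-extend
            operand = f"{(address + 2 + offset) & 0xFFFF:04X}"
        else:  # abs / abx: 16-bit little-endian operand
            operand = f"{raw[1] | (raw[2] << 8):04X}"
            if mode == "abx":
                operand += ",X"
        hex_bytes = raw.hex(" ").upper()
        lines.append(f"{address:04X}: {hex_bytes:<9} {mnemonic} {operand}".rstrip())
        pc += len(raw)
    return lines


PROGRAM = bytes.fromhex(
    "A9 00 8D 20 D0 8D 21 D0 A2 00 BD 28 08 F0 06 9D 00 04 E8 D0 F5 4C 22 08 00"
)
for line in disassemble(PROGRAM, 0x080D):
    print(line)
```

The two branches are a handy check: F0 06 at 0x081A resolves to 0x0822, and D0 F5 at 0x0820 resolves backwards to 0x0817 because 0xF5 is -11 as a signed byte.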

Recognising Hardware Interaction

This is where the C64's fixed memory map becomes incredibly useful. When you see certain addresses in STA or LDA instructions, you immediately know what the program is doing:

Address	Register	Meaning
0xD020	VIC-II	Border colour
0xD021	VIC-II	Background colour
0xD011	VIC-II	Screen control (scroll, mode, screen on/off)
0xD016	VIC-II	Screen control (scroll, multicolour mode)
0xD018	VIC-II	Memory pointers (character set, screen memory location)
0xD015	VIC-II	Sprite enable register
0xD400–0xD418	SID	Sound registers (frequency, waveform, ADSR, filter, volume)
0xDC00	CIA1	Keyboard column / joystick port 2
0xDC01	CIA1	Keyboard row / joystick port 1
0xDD00	CIA2	VIC bank selection, serial bus
0x0400–0x07E7	RAM	Screen character memory (default)
0xD800–0xDBE7	Colour RAM	Colour for each screen character
0x0001	CPU	Processor port (ROM/RAM banking)
0x0314–0x0315	RAM	IRQ vector (low/high byte)

When I started annotating disassembled code with these register names, the programs became dramatically more readable. A sequence like:

LDA #01
STA D015
LDA #A0
STA D000
LDA #A0
STA D001

goes from opaque hex to obvious: enable sprite 0, set its X position to 160, set its Y position to 160. The hardware register table is essentially a Rosetta Stone for C64 disassembly.
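Annotation like this is easy to automate for operands written as bare four-digit hex, as in the listings here. In the sketch below, the symbol names are labels I picked for illustration, not an official symbol set, and the regular expression assumes uppercase hex with no `$` prefix.

```python
import re

# Address -> symbolic name. The names are this sketch's own labels.
SYMBOLS = {
    0xD000: "SPRITE0_X",
    0xD001: "SPRITE0_Y",
    0xD015: "SPRITE_ENABLE",
    0xD020: "VIC_BORDER_COLOUR",
    0xD021: "VIC_BACKGROUND_COLOUR",
    0xFFD2: "KERNAL_CHROUT",
}


def annotate(line):
    """Replace any known four-digit hex address in `line` with its name."""
    def swap(match):
        return SYMBOLS.get(int(match.group(0), 16), match.group(0))
    return re.sub(r"\b[0-9A-F]{4}\b", swap, line)


for line in ["LDA #01", "STA D015", "LDA #A0", "STA D000", "LDA #A0", "STA D001"]:
    print(annotate(line))
```

Addresses that are not in the table, and two-digit immediates such as `#A0`, pass through unchanged.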

KERNAL and BASIC ROM Calls

The C64's KERNAL ROM provides a jump table of standard routines at fixed addresses. When you see JSR to one of these addresses, you can immediately name the subroutine:

Address	Name	Purpose
0xFFD2	CHROUT	Output a character to the current output device
0xFFE4	GETIN	Get a character from the keyboard buffer
0xFFE1	STOP	Check if the STOP key is pressed
0xFFC0	OPEN	Open a logical file
0xFFC3	CLOSE	Close a logical file
0xFFC6	CHKIN	Set input channel
0xFFC9	CHKOUT	Set output channel
0xFFCF	CHRIN	Input a character from the current input device
0xFFE7	CLALL	Close all files
0xFF81	CINT	Initialise screen editor
0xFF84	IOINIT	Initialise I/O devices
0xFF87	RAMTAS	Initialise RAM, set tape buffer pointer

A JSR FFD2 is the C64 equivalent of putchar(). Seeing a loop that loads bytes from a table and calls FFD2 for each one tells you it's printing a string. These KERNAL calls are one of the fastest ways to understand what a program is doing at a high level.
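A quick way to get that high-level picture is to scan the raw bytes for JSR (opcode 20) followed by one of the jump table addresses. This sketch is only a heuristic: it ignores instruction boundaries, so the same three bytes inside a data table would be reported as well. `find_kernal_calls` is a name invented for this example.

```python
# Jump table entries from the table above.
KERNAL = {
    0xFFD2: "CHROUT", 0xFFE4: "GETIN", 0xFFE1: "STOP", 0xFFC0: "OPEN",
    0xFFC3: "CLOSE", 0xFFC6: "CHKIN", 0xFFC9: "CHKOUT", 0xFFCF: "CHRIN",
    0xFFE7: "CLALL", 0xFF81: "CINT", 0xFF84: "IOINIT", 0xFF87: "RAMTAS",
}
JSR = 0x20


def find_kernal_calls(code, origin):
    """Return (address, routine name) for every JSR to a known KERNAL entry."""
    hits = []
    for i in range(len(code) - 2):
        if code[i] != JSR:
            continue
        target = code[i + 1] | (code[i + 2] << 8)  # little-endian operand
        if target in KERNAL:
            hits.append((origin + i, KERNAL[target]))
    return hits


# LDA #41 / JSR FFD2 / RTS, assumed to be loaded at 0xC000.
SAMPLE = bytes.fromhex("A9 41 20 D2 FF 60")
for address, name in find_kernal_calls(SAMPLE, 0xC000):
    print(f"{address:04X}: JSR {name}")
```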

The Limits of Linear Disassembly

Linear disassembly works well for simple, straight-line code, but it falls apart quickly with real programs. The fundamental problem is that it can't distinguish code from data.

Consider this:

0830: 20 40 08    JSR 0840       ; call subroutine
0833: 48 65 6C 6C ; "Hell"        ← this is DATA, not code
0837: 6F 00       ; "o\0"
0839: 60          RTS

A linear disassembler starting at 0x0830 will correctly decode the JSR, but then it hits 0x0833 and tries to decode 48 as an instruction (PHA), 65 6C as ADC 6C, and so on. The output is nonsense because those bytes are a string, not instructions. The subroutine at 0x0840 is designed to read the return address from the stack, use it as a pointer to the string, advance past the string, and push the new return address back. This is a common C64 trick for inline string data.
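The pointer arithmetic behind the inline-string trick can be modelled in a few lines. This is a sketch of what such a subroutine computes, not a disassembly of any particular routine. It relies on the fact that JSR pushes the address of its own last byte (0x0832 here) and that RTS adds one to whatever it pulls. The text is decoded as ASCII for simplicity.

```python
def read_inline_string(memory, base, pushed):
    """Model the subroutine's view of an inline string.

    memory -- bytes assumed to be loaded starting at address `base`
    pushed -- return address left on the stack by JSR (its last byte)
    Returns (text, resume_address), where resume_address is the first
    instruction after the zero terminator.
    """
    start = pushed + 1                               # first byte after the JSR
    end = memory.index(0x00, start - base) + base    # address of the terminator
    text = memory[start - base:end - base].decode("ascii")
    return text, end + 1


MEMORY = bytes.fromhex("20 40 08 48 65 6C 6C 6F 00 60")  # the example at 0x0830
text, resume = read_inline_string(MEMORY, 0x0830, 0x0832)
print(text, f"{resume:04X}")
```

On real hardware the routine would push 0x0838 (the resume address minus one) so that the final RTS lands on the RTS at 0x0839.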

Data tables embedded in code are everywhere in C64 programs: sprite data, character sets, lookup tables for multiplication, sine wave tables for scrolling effects, level maps. A linear disassembler will blindly decode all of it as instructions, producing garbage output that obscures the real code.

The other problem is branches and jumps. When a BEQ or JMP redirects execution, the bytes immediately after the branch might be data, dead code, or the target of a different branch. Linear disassembly has no way to know.

In Part 2, we'll tackle these problems head-on with control flow analysis to follow the actual execution paths, heuristics for identifying data regions, and tools that automate much of this work.

Getting Started: A Practical Workflow

If you want to try this yourself, here's the workflow I've settled on:

1. Examine the PRG header. Read the first two bytes to determine the load address. If it's 0x0801, look for a BASIC stub and find the SYS address.
2. Identify the entry point. The SYS address is where machine code execution begins. This is your starting address for disassembly.
3. Do a linear pass. Decode instructions sequentially from the entry point. Don't worry about accuracy yet; this gives you a rough map of the code.
4. Annotate hardware addresses. Replace raw addresses with register names (0xD020 → VIC_BORDER_COLOUR, 0xFFD2 → KERNAL_CHROUT). This makes the code dramatically more readable.
5. Mark obvious data. Sequences of printable ASCII bytes are strings. Blocks of bytes after the last RTS or JMP that don't decode to sensible instructions are likely data tables.
6. Follow subroutine calls. Each JSR target is a new entry point. Disassemble those routines separately.

This manual process is slow, but it builds intuition. Once you understand what you're looking for, the tools in Part 2 will make much more sense.
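The "mark obvious data" step lends itself to a simple heuristic: look for runs of printable bytes. The sketch below assumes plain ASCII and an arbitrary minimum length of four. Real C64 text is often PETSCII or screen codes, so the byte range may need adjusting, and `find_printable_runs` is a hypothetical helper name.

```python
def find_printable_runs(data, origin, min_length=4):
    """Return (address, text) for each run of printable ASCII bytes."""
    runs, start = [], None
    # A trailing zero byte acts as a sentinel so a run at the end is flushed.
    for i, byte in enumerate(data + b"\x00"):
        if 0x20 <= byte <= 0x7E:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                runs.append((origin + start, data[start:i].decode("ascii")))
            start = None
    return runs


# The inline-string example from earlier, assumed to be loaded at 0x0830.
SAMPLE = bytes.fromhex("20 40 08 48 65 6C 6C 6F 00 60")
for address, text in find_printable_runs(SAMPLE, 0x0830):
    print(f"{address:04X}: {text!r}")
```

Short runs are filtered out because many opcodes (20, 48, 60 and so on) are also printable characters. Here the 20 40 of the JSR is ignored while the string at 0x0833 is reported.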

What's Next

In Part 2, we'll move beyond linear disassembly into control flow analysis, following branches and jumps to build a map of which bytes are actually executed as code. We'll cover techniques for identifying data tables, recognising common C64 programming patterns (raster interrupts, self-modifying code, multiplexed sprites), and the tools that make serious disassembly practical.