Reversing 6502 - Part 1: First Steps in Disassembly
After spending time building a 6502 emulator, I found myself wanting to go the other direction: instead of writing code and watching it execute, I wanted to take an existing C64 program and work backwards to understand what it does. Reverse engineering felt like the natural next step, and the 6502 turns out to be a fantastic target for learning disassembly. The instruction set is small, the addressing modes are well documented, and there's no pipelining or speculative execution to worry about. What you see in memory is what the CPU executes.
In this two-part series, I'll walk through the process of disassembling a Commodore 64 program from a raw binary into readable, annotated assembly. Part 1 covers the foundations: the C64 memory map, how programs are loaded, and the mechanics of linear disassembly. In Part 2, we'll tackle the harder problems: distinguishing code from data, identifying subroutines, and using tools to automate the tedious parts.
Why the C64?
The Commodore 64 is an ideal platform for learning reverse engineering. The entire system is well documented: the memory map, the I/O chips (VIC-II for graphics, SID for sound, CIA for timers and I/O), and the KERNAL/BASIC ROMs have all been thoroughly mapped out by decades of community effort. There are no operating system abstractions to deal with, no dynamic linking, no virtual memory. A C64 program talks directly to hardware through fixed memory addresses. When I started looking into this, I was struck by how much easier it is to reason about a system where everything has a known, fixed address.
The C64 Memory Map
Before disassembling anything, you need to understand what lives where. The C64's 64 KiB address space is shared between RAM, ROM, and I/O registers, with a banking scheme that lets the CPU switch between them.
The default layout when a program is running:
| Address Range | Contents |
|---|---|
| 0x0000–0x00FF | Zero page: fast-access variables, pointers |
| 0x0100–0x01FF | Hardware stack |
| 0x0200–0x03FF | OS workspace (input buffer, screen line pointers) |
| 0x0400–0x07FF | Default screen memory (40×25 characters in 0x0400–0x07E7; sprite pointers in 0x07F8–0x07FF) |
| 0x0800–0x9FFF | BASIC program area / free RAM |
| 0xA000–0xBFFF | BASIC ROM (banked, RAM underneath) |
| 0xC000–0xCFFF | Free RAM |
| 0xD000–0xD3FF | VIC-II graphics chip registers |
| 0xD400–0xD7FF | SID sound chip registers |
| 0xD800–0xDBFF | Colour RAM |
| 0xDC00–0xDCFF | CIA1: keyboard, joystick, timer |
| 0xDD00–0xDDFF | CIA2: serial bus, NMI, timer |
| 0xE000–0xFFFF | KERNAL ROM (banked, RAM underneath) |
Here's what clicked for me when I started studying this: the memory map is the API. There's no system call interface. If a program wants to change the border colour, it writes to 0xD020. If it wants to play a sound, it writes to the SID registers at 0xD400–0xD418. Recognising these addresses in disassembled code immediately tells you what the program is doing, without needing any documentation about the program itself.
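To make the "memory map is the API" idea concrete, here is a minimal Python sketch that turns the table above into a lookup. The names REGIONS and describe_address are my own, for illustration only, and the table describes the default banking configuration only.

```python
# Regions from the default memory map above: (start, end inclusive, label).
REGIONS = [
    (0x0000, 0x00FF, "zero page"),
    (0x0100, 0x01FF, "hardware stack"),
    (0x0200, 0x03FF, "OS workspace"),
    (0x0400, 0x07FF, "default screen memory"),
    (0x0800, 0x9FFF, "BASIC program area / free RAM"),
    (0xA000, 0xBFFF, "BASIC ROM"),
    (0xC000, 0xCFFF, "free RAM"),
    (0xD000, 0xD3FF, "VIC-II registers"),
    (0xD400, 0xD7FF, "SID registers"),
    (0xD800, 0xDBFF, "colour RAM"),
    (0xDC00, 0xDCFF, "CIA1"),
    (0xDD00, 0xDDFF, "CIA2"),
    (0xE000, 0xFFFF, "KERNAL ROM"),
]


def describe_address(addr: int) -> str:
    """Return the region label for a 16-bit address, per the table above."""
    for start, end, label in REGIONS:
        if start <= addr <= end:
            return label
    # 0xDE00-0xDFFF is not listed in the table above.
    return "not listed"


print(describe_address(0xD020))  # VIC-II registers
print(describe_address(0x0400))  # default screen memory
```

A lookup like this is enough to label the operand of any absolute-mode instruction with the part of the machine it touches.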
The banking scheme is controlled by the processor port at 0x0001. By writing different values, the CPU can swap ROM out and access the RAM underneath. Machine language programs commonly bank out BASIC ROM (and sometimes KERNAL ROM) to reclaim that address space for code or data. If you see writes to 0x0001 early in a program, that's what's happening.
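As a rough aid for reading those writes, the sketch below decodes the low three bits of the value stored at 0x0001. It assumes no cartridge is attached (cartridge control lines also affect banking), and the function name decode_processor_port is hypothetical. The bit names LORAM, HIRAM and CHAREN are the conventional ones and are not used elsewhere in this article.

```python
def decode_processor_port(value: int) -> dict:
    """Map a value written to 0x0001 to what the CPU sees in each bankable area.

    Assumes no cartridge is attached.
    """
    loram = bool(value & 0x01)   # bit 0
    hiram = bool(value & 0x02)   # bit 1
    charen = bool(value & 0x04)  # bit 2

    # BASIC ROM is only visible when both LORAM and HIRAM are set.
    basic_area = "BASIC ROM" if (loram and hiram) else "RAM"
    # KERNAL ROM is visible whenever HIRAM is set.
    kernal_area = "KERNAL ROM" if hiram else "RAM"
    # 0xD000-0xDFFF is RAM when both LORAM and HIRAM are clear;
    # otherwise CHAREN selects I/O registers (1) or character ROM (0).
    if not (loram or hiram):
        io_area = "RAM"
    elif charen:
        io_area = "I/O"
    else:
        io_area = "character ROM"

    return {"A000-BFFF": basic_area, "D000-DFFF": io_area, "E000-FFFF": kernal_area}


print(decode_processor_port(0x37))  # default: BASIC ROM, I/O, KERNAL ROM
print(decode_processor_port(0x36))  # BASIC banked out, KERNAL and I/O kept
```

So a sequence such as LDA #36 / STA 01 near the entry point can be read as "bank out BASIC, keep the KERNAL and I/O".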
PRG File Format
Most C64 programs you'll encounter are in PRG format. It's beautifully simple: the first two bytes are the load address (little-endian), and everything after that is raw data to be loaded at that address.
Offset Content
------ -------
00-01 Load address (low byte, high byte)
02-... Program data (loaded starting at the load address)
A PRG file starting with 01 08 loads at 0x0801, which is the start of the BASIC program area. This is extremely common: most programs include a short BASIC stub that auto-runs the machine language portion with a SYS command.
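Splitting a PRG into its load address and payload takes only a few lines. The following is a minimal sketch; parse_prg is a name I made up for illustration, and the sample bytes are the start of the BASIC stub discussed in this section rather than a real file.

```python
import struct


def parse_prg(data: bytes) -> tuple[int, bytes]:
    """Split raw PRG bytes into (load_address, payload)."""
    if len(data) < 2:
        raise ValueError("a PRG needs at least a 2-byte load address")
    # "<H" = little-endian unsigned 16-bit: low byte first, then high byte.
    (load_address,) = struct.unpack_from("<H", data, 0)
    return load_address, data[2:]


# Hypothetical sample: load address 0x0801 followed by four payload bytes.
sample = bytes([0x01, 0x08, 0x0B, 0x08, 0x0A, 0x00])
load_address, payload = parse_prg(sample)
print(f"load address: 0x{load_address:04X}, payload: {len(payload)} bytes")
```

For a file on disk, the same function can be fed the result of reading the whole file in binary mode.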
A typical BASIC stub looks like this in memory at 0x0801:
0801: 0B 08 ; pointer to next BASIC line (080B)
0803: 0A 00 ; line number 10
0805: 9E ; BASIC token for SYS
0806: 32 30 36 31 ; ASCII "2061"
080A: 00 ; end of line
080B: 00 00 ; end of program (null pointer)
080D: ... ; machine language starts here
The BASIC line 10 SYS 2061 tells the BASIC interpreter to jump to address 2061 (0x080D), where the actual machine code begins. When you're disassembling a PRG file, the first thing to do is identify this entry point. The SYS address is your starting point for disassembly.
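Finding the SYS address can be automated for the simple case shown above. This is a sketch under some assumptions: it only inspects the first BASIC line, and it only understands a plain decimal number after the SYS token (stubs that use expressions or extra statements would need more work). The names find_sys_address and STUB are mine.

```python
SYS_TOKEN = 0x9E  # BASIC token for SYS


def find_sys_address(payload: bytes):
    """Return the decimal SYS target on the first BASIC line, or None.

    `payload` is the PRG data after the two-byte load address.
    """
    # Bytes 0-1: pointer to the next line, bytes 2-3: line number.
    # The tokenised line text starts at offset 4 and ends at a zero byte.
    end = payload.find(b"\x00", 4)
    if end == -1:
        return None
    line = payload[4:end]
    token_pos = line.find(bytes([SYS_TOKEN]))
    if token_pos == -1:
        return None
    digits = ""
    for byte in line[token_pos + 1:]:
        if byte == 0x20 and not digits:
            continue  # skip spaces between SYS and the number
        if 0x30 <= byte <= 0x39:
            digits += chr(byte)  # ASCII digit
        else:
            break
    return int(digits) if digits else None


# The stub from the listing above, without the load address.
STUB = bytes([0x0B, 0x08, 0x0A, 0x00, 0x9E, 0x32, 0x30, 0x36, 0x31,
              0x00, 0x00, 0x00])
entry = find_sys_address(STUB)
print(f"SYS {entry} = 0x{entry:04X}")  # SYS 2061 = 0x080D
```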
Not all PRGs load at 0x0801. Cartridge images, games that load below the BASIC area, and multi-part loaders can have different load addresses. Always check the first two bytes.
Linear Disassembly
The simplest approach to disassembly is linear: start at the entry point and decode each instruction sequentially, advancing by the instruction's byte length.
The 6502 makes this straightforward because every opcode maps to a fixed instruction length:
| Addressing Mode | Bytes | Example |
|---|---|---|
| Implied / Accumulator | 1 | RTS, INX, ASL A |
| Immediate | 2 | LDA #42 |
| Zero Page | 2 | LDA 10 |
| Zero Page,X / Zero Page,Y | 2 | LDA 10,X |
| Absolute | 3 | LDA 1234 |
| Absolute,X / Absolute,Y | 3 | LDA 1234,X |
| (Indirect,X) | 2 | LDA (10,X) |
| (Indirect),Y | 2 | LDA (10),Y |
| Relative (branches) | 2 | BEQ 05 |
| Indirect (JMP only) | 3 | JMP (FFFC) |
Given an opcode byte, you look up the instruction and addressing mode, read the appropriate number of operand bytes, format the output, and advance. Here's what a linear pass over a simple program looks like:
080D: A9 00 LDA #00 ; load 0 into accumulator
080F: 8D 20 D0 STA D020 ; store to border colour register
0812: 8D 21 D0 STA D021 ; store to background colour register
0815: A2 00 LDX #00 ; X = 0
0817: BD 28 08 LDA 0828,X ; load byte from message table
081A: F0 06 BEQ 0822 ; if zero (end of string), branch forward
081C: 9D 00 04 STA 0400,X ; store to screen memory
081F: E8 INX ; X++
0820: D0 F5 BNE 0817 ; loop back (branch if X != 0)
0822: 4C 22 08 JMP 0822 ; infinite loop (halt)
0825: 00 BRK
Even from this raw disassembly, you can read the program's intent: it sets the border and background to black, then copies a null-terminated string from 0x0828 to screen memory at 0x0400. The JMP 0822 to itself is a common C64 idiom for halting. There's no HALT instruction on the 6502, so an infinite jump-to-self serves the same purpose.
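The decode loop described above can be sketched in a few dozen lines of Python. This is a minimal illustration, not a complete disassembler: the OPCODES table holds only the opcodes that appear in this article's listings, and the names OPCODES, LENGTHS and disassemble are my own. Relative branches are printed as absolute targets, computed from the address of the next instruction plus the signed offset.

```python
# Partial opcode table: opcode -> (mnemonic, addressing mode).
OPCODES = {
    0x00: ("BRK", "imp"), 0x20: ("JSR", "abs"), 0x4C: ("JMP", "abs"),
    0x60: ("RTS", "imp"), 0x8D: ("STA", "abs"), 0x9D: ("STA", "abx"),
    0xA2: ("LDX", "imm"), 0xA9: ("LDA", "imm"), 0xBD: ("LDA", "abx"),
    0xD0: ("BNE", "rel"), 0xE8: ("INX", "imp"), 0xF0: ("BEQ", "rel"),
}
# Instruction length in bytes for each addressing mode used above.
LENGTHS = {"imp": 1, "imm": 2, "rel": 2, "abs": 3, "abx": 3}


def disassemble(code: bytes, origin: int) -> list[str]:
    """Linearly decode `code`, assuming its first byte lives at `origin`."""
    lines = []
    pc = 0
    while pc < len(code):
        addr = origin + pc
        mnemonic, mode = OPCODES.get(code[pc], ("???", "imp"))
        if pc + LENGTHS[mode] > len(code):
            mnemonic, mode = "???", "imp"  # operand would run past the end
        raw = code[pc:pc + LENGTHS[mode]]
        if mode == "imm":
            operand = f" #{raw[1]:02X}"
        elif mode in ("abs", "abx"):
            word = raw[1] | (raw[2] << 8)  # little-endian operand
            operand = f" {word:04X}" + (",X" if mode == "abx" else "")
        elif mode == "rel":
            offset = raw[1] - 0x100 if raw[1] >= 0x80 else raw[1]
            operand = f" {(addr + 2 + offset) & 0xFFFF:04X}"
        else:
            operand = ""
        hex_bytes = " ".join(f"{b:02X}" for b in raw)
        lines.append(f"{addr:04X}: {hex_bytes:<8} {mnemonic}{operand}")
        pc += len(raw)
    return lines


# The example program from the listing above.
PROGRAM = bytes.fromhex(
    "A900 8D20D0 8D21D0 A200 BD2808 F006 9D0004 E8 D0F5 4C2208 00"
)
for line in disassemble(PROGRAM, 0x080D):
    print(line)
```

Unknown opcodes are emitted as one-byte ??? entries so the loop always advances, which is also how the pass ends up decoding data as if it were code.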
Recognising Hardware Interaction
This is where the C64's fixed memory map becomes incredibly useful. When you see certain addresses in STA or LDA instructions, you immediately know what the program is doing:
| Address | Chip | Meaning |
|---|---|---|
| 0xD020 | VIC-II | Border colour |
| 0xD021 | VIC-II | Background colour |
| 0xD011 | VIC-II | Screen control (scroll, mode, screen on/off) |
| 0xD016 | VIC-II | Screen control (scroll, multicolour mode) |
| 0xD018 | VIC-II | Memory pointers (character set, screen memory location) |
| 0xD015 | VIC-II | Sprite enable register |
| 0xD400–0xD418 | SID | Sound registers (frequency, waveform, ADSR, filter, volume) |
| 0xDC00 | CIA1 | Keyboard column / joystick port 2 |
| 0xDC01 | CIA1 | Keyboard row / joystick port 1 |
| 0xDD00 | CIA2 | VIC bank selection, serial bus |
| 0x0400–0x07E7 | - | Screen character memory (default) |
| 0xD800–0xDBE7 | - | Colour RAM for each screen character |
| 0x0001 | CPU | Processor port (ROM/RAM banking) |
| 0x0314–0x0315 | - | IRQ vector (low/high byte) |
When I started annotating disassembled code with these register names, the programs became dramatically more readable. A sequence like:
LDA #01
STA D015
LDA #A0
STA D000
LDA #A0
STA D001
goes from opaque hex to obvious: enable sprite 0, set its X position to 160, set its Y position to 160. The hardware register table is essentially a Rosetta Stone for C64 disassembly.
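That annotation step is easy to mechanise. Here is a minimal sketch that swaps four-digit hex operands for symbolic names. The label names are my own choices and not an official naming scheme, the table covers only the registers used in the example, and the function expects instruction text without an address column.

```python
import re

# Hypothetical labels for a few VIC-II registers from the table above.
HARDWARE_NAMES = {
    0xD000: "VIC_SPRITE0_X",
    0xD001: "VIC_SPRITE0_Y",
    0xD015: "VIC_SPRITE_ENABLE",
    0xD020: "VIC_BORDER_COLOUR",
    0xD021: "VIC_BACKGROUND_COLOUR",
}


def annotate(instruction: str) -> str:
    """Replace known four-digit hex operands with register names."""
    def substitute(match):
        address = int(match.group(0), 16)
        return HARDWARE_NAMES.get(address, match.group(0))

    # Exactly four hex digits bounded by non-word characters, so two-digit
    # immediates such as #A0 are left alone.
    return re.sub(r"\b[0-9A-F]{4}\b", substitute, instruction)


for text in ["LDA #01", "STA D015", "LDA #A0", "STA D000", "LDA #A0", "STA D001"]:
    print(annotate(text))
```

Addresses missing from the table pass through unchanged, so the table can grow as you meet new registers.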
KERNAL and BASIC ROM Calls
The C64's KERNAL ROM provides a jump table of standard routines at fixed addresses. When you see JSR to one of these addresses, you can immediately name the subroutine:
| Address | Name | Purpose |
|---|---|---|
| 0xFFD2 | CHROUT | Output a character to the current output device |
| 0xFFE4 | GETIN | Get a character from the keyboard buffer |
| 0xFFE1 | STOP | Check if the STOP key is pressed |
| 0xFFC0 | OPEN | Open a logical file |
| 0xFFC3 | CLOSE | Close a logical file |
| 0xFFC6 | CHKIN | Set input channel |
| 0xFFC9 | CHKOUT | Set output channel |
| 0xFFCF | CHRIN | Input a character from the current input device |
| 0xFFE7 | CLALL | Close all files |
| 0xFF81 | CINT | Initialise screen editor |
| 0xFF84 | IOINIT | Initialise I/O devices |
| 0xFF87 | RAMTAS | Initialise RAM, set tape buffer pointer |
A JSR FFD2 is the C64 equivalent of putchar(). Seeing a loop that loads bytes from a table and calls FFD2 for each one tells you it's printing a string. These KERNAL calls are one of the fastest ways to understand what a program is doing at a high level.
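A quick way to spot these calls before doing any real disassembly is to scan for the JSR opcode (0x20) followed by a known KERNAL address. The sketch below does that. It is a byte-pattern scan rather than a true decode, so it can report false matches inside data or operands. The function name find_kernal_calls and the sample routine at 0xC000 are made up for illustration.

```python
# A few entries from the KERNAL jump table above.
KERNAL_ROUTINES = {
    0xFFD2: "CHROUT",
    0xFFE4: "GETIN",
    0xFFE1: "STOP",
    0xFFCF: "CHRIN",
}

JSR_OPCODE = 0x20


def find_kernal_calls(code: bytes, origin: int) -> list[tuple[int, str]]:
    """Return (address, routine name) for each JSR to a known KERNAL entry."""
    calls = []
    for i in range(len(code) - 2):
        if code[i] != JSR_OPCODE:
            continue
        target = code[i + 1] | (code[i + 2] << 8)  # little-endian operand
        name = KERNAL_ROUTINES.get(target)
        if name is not None:
            calls.append((origin + i, name))
    return calls


# Hypothetical print loop: load bytes from a table and JSR FFD2 for each one.
PRINT_LOOP = bytes.fromhex("A200 BD20C0 F006 20D2FF E8 D0F5 60")
for address, name in find_kernal_calls(PRINT_LOOP, 0xC000):
    print(f"{address:04X}: JSR {name}")  # C007: JSR CHROUT
```

Note that the 0x20 byte inside the LDA operand at 0xC003 is also examined, but the two bytes after it do not form a known KERNAL address, so it is ignored.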
The Limits of Linear Disassembly
Linear disassembly works well for simple, straight-line code, but it falls apart quickly with real programs. The fundamental problem is that it can't distinguish code from data.
Consider this:
0830: 20 40 08 JSR 0840 ; call subroutine
0833: 48 65 6C 6C ; "Hell" ← this is DATA, not code
0837: 6F 00 ; "o\0"
0839: 60 RTS
A linear disassembler starting at 0x0830 will correctly decode the JSR, but then it hits 0x0833 and tries to decode 48 as an instruction (PHA), 65 6C as ADC 6C, and so on. The output is nonsense because those bytes are a string, not instructions. The subroutine at 0x0840 is designed to read the return address from the stack, use it as a pointer to the string, advance past the string, and push the new return address back, a common C64 trick for inline string data.
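Once you recognise the inline-string trick, you can teach your tooling to skip the data the same way the subroutine does. The following is a minimal sketch that models the effect of the routine in Python rather than emulating its 6502 code. It assumes the string is zero-terminated and ASCII-compatible, and read_inline_string is a hypothetical name.

```python
def read_inline_string(memory: bytes, base: int, jsr_address: int):
    """Return (text, resume_address) for a string stored right after a JSR.

    `memory` holds the bytes starting at address `base`.
    """
    # JSR is three bytes long, so the inline data starts right after it.
    start = jsr_address + 3 - base
    end = memory.index(b"\x00", start)  # position of the terminator
    text = memory[start:end].decode("ascii")
    # Execution resumes at the first byte after the terminator. On real
    # hardware the routine pushes that address minus one, because RTS adds
    # one to the address it pulls from the stack.
    resume_address = base + end + 1
    return text, resume_address


# The bytes from the listing above, starting at 0x0830.
MEMORY = bytes.fromhex("204008 48656C6C 6F00 60")
text, resume = read_inline_string(MEMORY, 0x0830, 0x0830)
print(text, f"{resume:04X}")  # Hello 0839
```

A disassembler that knows 0x0840 behaves this way can mark 0x0833–0x0838 as data and continue decoding at 0x0839.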
Data tables embedded in code are everywhere in C64 programs: sprite data, character sets, lookup tables for multiplication, sine wave tables for scrolling effects, level maps. A linear disassembler will blindly decode all of it as instructions, producing garbage output that obscures the real code.
The other problem is branches and jumps. When a BEQ or JMP redirects execution, the bytes immediately after the branch might be data, dead code, or the target of a different branch. Linear disassembly has no way to know.
In Part 2, we'll tackle these problems head-on, using control flow analysis to follow the actual execution paths, heuristics for identifying data regions, and tools that automate much of this work.
Getting Started: A Practical Workflow
If you want to try this yourself, here's the workflow I've settled on:
- Examine the PRG header. Read the first two bytes to determine the load address. If it's 0x0801, look for a BASIC stub and find the SYS address.
- Identify the entry point. The SYS address is where machine code execution begins. This is your starting address for disassembly.
- Do a linear pass. Decode instructions sequentially from the entry point. Don't worry about accuracy yet; this gives you a rough map of the code.
- Annotate hardware addresses. Replace raw addresses with register names (0xD020 → VIC_BORDER_COLOUR, 0xFFD2 → KERNAL_CHROUT). This makes the code dramatically more readable.
- Mark obvious data. Sequences of printable ASCII bytes are likely strings. Blocks of bytes after the last RTS or JMP that don't decode to sensible instructions are likely data tables.
- Follow subroutine calls. Each JSR target is a new entry point. Disassemble those routines separately.
This manual process is slow, but it builds intuition. Once you understand what you're looking for, the tools in Part 2 will make much more sense.
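The "mark obvious data" step is the easiest one to automate early. Here is a minimal sketch that reports runs of printable ASCII bytes. The name find_ascii_runs and the default minimum length of 4 are my own choices. It only checks the ASCII range, so text stored as C64 screen codes would need a different test.

```python
def find_ascii_runs(data: bytes, origin: int, min_length: int = 4):
    """Return (address, text) for each run of printable ASCII bytes."""
    runs = []
    start = None
    # A trailing zero byte acts as a sentinel that closes any open run.
    for i, byte in enumerate(data + b"\x00"):
        if 0x20 <= byte <= 0x7E:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                runs.append((origin + start, data[start:i].decode("ascii")))
            start = None
    return runs


# The inline-string example from earlier, loaded at 0x0830.
SAMPLE = bytes.fromhex("204008 48656C6C 6F00 60")
for address, text in find_ascii_runs(SAMPLE, 0x0830):
    print(f"{address:04X}: {text!r}")  # 0833: 'Hello'
```

The minimum length matters: many opcodes fall in the printable range (0x20 is both JSR and a space), so very short runs are mostly noise.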
What's Next
In Part 2, we'll move beyond linear disassembly into control flow analysis, following branches and jumps to build a map of which bytes are actually executed as code. We'll cover techniques for identifying data tables, recognising common C64 programming patterns (raster interrupts, self-modifying code, multiplexed sprites), and the tools that make serious disassembly practical.