Table of Contents
TinyC for the SD-8516
Introduction
- February 2026.
A self-hosted C compiler has been my goal for quite some time. I did BASIC and Forth first. BASIC was useful, Forth – not so much (from a practical standpoint). Enter SD/CC – Stellar Dynamics C compiler. I just got sick of not having one so I wrote one. The key that helped me start was hearing something about the accumulator being used to return values – like the TOS in Forth. This was something I had been thinking about for months. And so, I wrote a parser that required int main(void) { return 0; } and emitted LDA #0 and RET. The debugger was simply printing the return value.
March 2026
At the end of March I was able to get something set up where it could compile:
int main(void) { return 0; }
The problem was that isn't very much. Weeks went by and I couldn't figure it out. It was supposed to be just like the BASIC expression parser. A recursive descent parser. So the first thing I added was additive and multiplicative expressions. Then suddenly it started working. I honestly don't know what changed because it wasn't working then suddenly it was and I couldn't remember changing anything in the code.
The key takeaway here is the concept of TOKENS and an EMITTER. The parser can consume TOKENS. This is a more abstract notion than keyword string matching in Stellar BASIC. More automated. The emitter is a bit like in forth but more granular. I have 'emitter' functions and they act like lego blocks. I have to call them in order and they will dump bytes to the stream. This part was taken from the javascript assembler.
cc_parse_expr:
; Parse left operand
CALL @cc_parse_term
; Check error
LDAL [@CC_ERROR]
CMP AL, #0
JNZ @cc_pe_done
cc_pe_loop:
; Check for '+' or '-'
LDAL [@CC_TOKEN_TYPE]
CMP AL, TOK_PLUS
JZ @cc_pe_add
CMP AL, TOK_MINUS
JZ @cc_pe_sub
This works for + and -, we can return 3+5 now. Next I added negative numbers. I had to add a check for TOK_MINUS first and then emit a SUB A, B to get it to negate.
Integers
- April 14th, 2026.
I've been trying to get a symbol table up and to define int; but it isn't working. I think there is an off by one bug on the stack. Storing variables on the stack is something I don't fully understand or trust yet.
What is really biting me in the ass is the system I came up with to pair 16-bit registers into 24-bit pointers. It's a charmingly retro, period-authentic thing. But it makes coding a bit hard when you need multiple pointers. I have a 'secret weapon' – the ability to add more registers any time I want, but, I want to avoid going down that route if possible. Having so many registers already is supposed to be good enough.
So what's a symbol table?
in BASIC we have 26 letters for integer variables. It's like that. It's more like that than the NVAR variable system. But the point is, it's for ints. So it's a lot like Stellar BASIC's “int letters”. It maps names to ints. The C source says a, and the compiler needs to know: what is a? Where does it live? What type is it? This is the problem LVAR/DVAR tried to solve and mainly failed. But, they led to this. This is like “S-VAR”. But for C not BASIC. In an interpreter like BASIC the mapping exists at runtime. Every time execution hits LET A = 5, the interpreter searches NVAR for “A” and stores the value 5 there. Every time it encounters PRINT A, it searches NVAR again and reads 5 back. The name-to-value lookup happens over and over, on every execution of every line.
A compiler eliminates that. During compilation, it resolves “a” to a stack location once, emits a machine instruction with the location hardcoded, and never thinks about the name “a” again. Like a label in the assembler. The compiled program doesn't contain the string “a” anywhere — just LDA [GLZ-2]. The symbol table is the structure that makes this one-time resolution possible.
Each entry is a 12-byte record laid out like this:
- Offset 0-7: Name (null-terminated, padded to 8 bytes)
- Offset 8: Type (0=int)
- Offset 9: Kind (0=local)
- Offset 10-11: Value (16-bit, the stack frame offset)
The table lives at CC_SYMTAB and grows upward. CC_SYM_COUNT tracks how many entries exist.
Compiling ''int a = 5;''
CC_FRAME_OFFSETgoes from 0 to 2 (each int is 2 bytes)cc_sym_addis called with name=“a”, type=0, kind=0, value=2- It computes the slot address:
CC_SYMTAB + (count * 12) = $00F040 - Writes “a\0\0\0\0\0\0\0” (name padded to 8), then 0x00 (type), 0x00 (kind), 0x02 0x00 (value = 2, little-endian)
- Increments
CC_SYM_COUNTto 1
Next for int b = 3;
CC_FRAME_OFFSETgoes from 2 to 4- Slot address: $00F040 + (1 * 12) = $00F04C
- Writes “b\0…” with value=4
When return a; needs the variable:
- .
cc_sym_finddoes a linear scan from $00F040 - . Compares each record's name against
CC_TOK_BUFusing strcmp - . Finds “a” at record 0, returns a pointer (usu. BLY or LLZ) pointing to that record
- . The caller reads the value field at BLY+10, gets 2
- . Emits
LDA [GLZ-2]— load from frame pointer minus 2
When a new .compile starts:
cc_sym_clear sets CC_SYM_COUNT to 0. Like calling NVAR_CLEAR on RUN in BASIC. The old data is still in memory, but invisible since nothing scans past count.
The table is purely compile-time. The compiled program has no idea it exists. Variable names are resolved to numeric stack offsets during compilation and baked into the emitted instructions.
Using the Stack for Ints
What I have been mulling over for the last couple months is using the stack to hold local variables. It's supposed to be something called a stack frame. I hear about it, again, while I was implementing Forth. It was some kind of 'crazy thing C did,' or something. But it stuck in my mind. This was the key breakthrough. There were a lot of disparate concepets floating around but I finally nailed it down. it was a symbol table that stored handles (like a handle table in D-VAR system) but it was stored on the stack.
Frankly I don't know who the marketing genius is who thought all this up. I think if we could go back in time things would have worked differently. This is some real alien technology stuff IMO. SO let it revolve around in your head. Why on the stack?
Why Stack Frames?
The stack is used because local variables need to appear when a function is entered and disappear when it returns, and the stack naturally provides that lifetime.
When int main(void) { int a = 5; int b = 3; return a + b; } is compiled, the emitted code does this:
PUSH GLZ ; save old frame pointer (3 bytes on stack) MOV GLZ, SP ; GLZ = current stack top — this is our reference point SUB SP, #2 ; reserve 2 bytes for 'a' — SP moves down SUB SP, #2 ; reserve 2 bytes for 'b' — SP moves down again
The stack now looks like this (growing downward):
┌───────────────────────┐
GLZ-2 │ return info │
├───────────────────────┤
GLZ points │ saved GLZ (3 bytes) │
├───────────────────────┤
GLZ-2 │ a (2 bytes) │
├───────────────────────┤
GLZ-4 │ b (2 bytes) │
├───────────────────────┤
SP points │ (free space) │
└───────────────────────┘
a is always at GLZ-2. b is always at GLZ-4. These offsets are fixed at compile time and baked into the instructions. STA [GLZ-2] stores into a. LDA [GLZ-4] reads from b.
But why male models?
(But why on the stack?) You could assign a to address $1000 and b to $1002. For a single function that's called once, it would work. The problem is recursion. Consider:
int factorial(int n) {
if (n == 0) return 1;
return n * factorial(n - 1);
}
If n lives at a fixed address, the second call to factorial overwrites the first call's n. When the inner call returns and the outer call tries to read n, the value is gone. Every call needs its own private copy of n. So you see, the real problem that is being solved here is memory management. The stack is a single source of 'pre-allocated memory' that we can use without bumping into other things, without a lot of complex mechanics. We could, in fact, define a heap area (ex. NVAR) but isn't that just… a stack? Boom, big reveal. They're all stacks, deep down. Even a linked list (INSQUE). It's just an unsorted stack with pointers to traverse the nodes in order.
The stack (the PUSH/POP stack) solves this automatically. Each call to factorial executes PUSH GLZ / MOV GLZ, SP / SUB SP, #2, which carves out a new region of stack below the previous call's region. Five nested calls means five separate copies of n at five different stack addresses, each accessible through its own saved GLZ value. When a call returns, MOV SP, GLZ / POP GLZ releases that region and restores the previous frame pointer. The caller's variables reappear at their original locations.
Working through if-else
To get a new language feature like if-else, we have to follow the same pattern as in BASIC.
- Add it to the lexer (match keyword).
- Parse the statement (done in the executor in BASIC, but a separate step here).
- Emitter – This is the big diff, since we don't do it 'now' but we have to do it 'later' (at runtime).
The plan is the exact same as assembly (labels); back-patching, technically a two-pass, but done in the moment so we say it's one pass.
- Emit the condition (cc_parse_expr — result in A)
- Emit CMP A, #0
- Emit JZ with a placeholder address — save CC_HERE before writing it
- Parse the then-block (emits its code)
- Go back and patch the JZ address to point here (past the block)
for if-else:
- Emit condition (same)
- Emit CMP A, #0 (same)
- JZ placeholder (same)
- Parse then-block (same)
- Emit JMP placeholder (skip over else)
- Patch placeholder to here
- Parse else-block
- Patch placeholder to here
Documenting the process
Just like adding a keyword in BASIC.
1. Add the tokens.
.equ TOK_IF 17
.equ TOK_ELSE 18
2. Add the strings.
cc_kw_if:
.bytes "if", 0
cc_kw_else:
.bytes "else", 0
3. Add the keyword checks in the keyword checks section.
; Check "if"
LDELM CC_TOK_BUF
LDFLD @cc_kw_if
LDAH $02
INT $12
JZ @cc_nt_kw_if
; Check "else"
LDELM CC_TOK_BUF
LDFLD @cc_kw_else
LDAH $02
INT $12
JZ @cc_nt_kw_else
And the handlers,
cc_nt_kw_if:
LDAL TOK_IF
STAL [@CC_TOKEN_TYPE]
POP I
RET
cc_nt_kw_else:
LDAL TOK_ELSE
STAL [@CC_TOKEN_TYPE]
POP I
RET
We have to add it to the parser:
cc_parse_stmt:
LDAL [@CC_TOKEN_TYPE]
CMP AL, TOK_RETURN
JZ @cc_parse_return
CMP AL, TOK_INT
JZ @cc_parse_vardecl
CMP AL, TOK_IDENT
JZ @cc_parse_assign
CMP AL, TOK_IF ; Added here at the end
JZ @cc_parse_if
; Unknown statement
LDAL #1
STAL [@CC_ERROR]
LDELM @cc_str_expect_stmt
LDAH $18
INT $10
RET
Finally here is the if parser. Note three important changes from BASIC.
- Instead of manually checking for things like ) the system uses a TOKEN table (TOK_LPAREN, etc). This generalized approach, I think, adds to thinkability and expandability.
- You don't see int05_skip_space everywhere. Instead we have a cc_skip_ws which is included in cc_next_token. This isn't as 'inlined' as basic, but it doesn't need to be since it's a compiler. Keeping things like skip_space out of the way aids code understandability.
- It's an emitter not an interpreter. Instead of doing it “now” we need to have a plan to do it 'later'.
; --- cc_parse_if ---
; 'if' '(' expr ')' block ('else' block)?
;
; Emitted code shape (if only):
; <expr> ← result in A
; CMP_IMM A, #0 ← test condition
; JZ patch1 ← skip then-block if false
; <then-block>
; patch1: ← JZ lands here
;
; Emitted code shape (if/else):
; <expr>
; CMP_IMM A, #0
; JZ patch1 ← skip to else if false
; <then-block>
; JMP patch2 ← skip over else
; patch1: ← else starts here
; <else-block>
; patch2: ← both paths rejoin here
;
cc_parse_if:
; Consume 'if'
CALL @cc_next_token
; Expect '('
LDAL TOK_LPAREN
CALL @cc_expect
LDAL [@CC_ERROR]
CMP AL, #0
JNZ @cc_pi_done
; Parse condition expression (result in A)
CALL @cc_parse_expr
LDAL [@CC_ERROR]
CMP AL, #0
JNZ @cc_pi_done
; Expect ')'
LDAL TOK_RPAREN
CALL @cc_expect
LDAL [@CC_ERROR]
CMP AL, #0
JNZ @cc_pi_done
; Emit: CMP_IMM A, #0 (opcode=91, reg=0, imm16=0,0)
LDAL #91
CALL @cc_emit_byte
LDAL #0 ; REG_A
CALL @cc_emit_byte
LDAL #0 ; imm16 lo = 0
CALL @cc_emit_byte
LDAL #0 ; imm16 hi = 0
CALL @cc_emit_byte
; Emit: JZ <placeholder> — save address of the operand for patching
LDAL #101 ; JZ opcode
CALL @cc_emit_byte
; Save CC_HERE — this is where the 3-byte address will go
LDBLX [@CC_HERE]
PUSH BLX ; *** patch1 address saved on stack ***
; Emit 3 placeholder bytes
LDAL #0
CALL @cc_emit_byte
CALL @cc_emit_byte
CALL @cc_emit_byte
; Parse then-block
CALL @cc_parse_block
LDAL [@CC_ERROR]
CMP AL, #0
JNZ @cc_pi_bail1
; Check for 'else'
LDAL [@CC_TOKEN_TYPE]
CMP AL, TOK_ELSE
JZ @cc_pi_else
; No else = patch1 = CC_HERE (jump past then-block)
POP BLX ; BLX = patch1 location
CALL @cc_backpatch
JMP @cc_pi_done
cc_pi_else:
; Consume 'else'
CALL @cc_next_token
; Emit: JMP <placeholder> at end of then-block
LDAL #100 ; JMP opcode
CALL @cc_emit_byte
LDBLY [@CC_HERE]
PUSH BLY ; *** patch2 address saved on stack ***
LDAL #0
CALL @cc_emit_byte
CALL @cc_emit_byte
CALL @cc_emit_byte
; We need patch1 now. Swap:
POP BLY ; BLY = patch2
POP BLX ; BLX = patch1
PUSH BLY ; re-push patch2
CALL @cc_backpatch ; patch1 = current CC_HERE
; Parse else-block
CALL @cc_parse_block
LDAL [@CC_ERROR]
CMP AL, #0
JNZ @cc_pi_bail1
; Patch2 = CC_HERE (JMP lands past else-block)
POP BLX ; BLX = patch2 location
CALL @cc_backpatch
cc_pi_done:
RET
cc_pi_bail1:
POP BLX ; discard patch address
RET
For a backpatch helper (in case we need this elsewhere) we just copy HERE:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; backpatch helper
cc_backpatch:
PUSH ELM
LDELM [@CC_HERE]
STELM [FLD]
POP ELM
RET
Why not inline the backpatch helper?
Point of order. This is small enough to inline. And just like in the CPU, maybe I will end up inlining all the fetch_byte() calls. But this is a compiler – we are not shooting for speed here, we are skating on the thin ice of a codebase which is bound to balloon in terms of complexity. Documentation and small, clean functions are king here. So, the point is being made by separating this out into a function.
while is easy
Once we have backpatching and the expression machinery, every new control flow construct is just a new arrangement of jumps. for would be the same. The mantra is “All loops are possible with goto” (See: |more loops from the SD-8516 User's Guide).
After while, the critical path to self-hosting is:
- Functions and function calls
- Pointers / char*
- Array indexing
Functions
Ahh the old y = x. F(x) = x^2.
Opening up the discussion of functions opens up the discussion of forward referencing. But, I'm not going to worry about it because, remember, this C is only so we can build a self-hosting compiler and iterate on it. This version will be thrown away as soon as it can compile something better – in C.
more to come
Storing functions on the symbol table
The symbol table is 16 bits and pointers here are 16 bits. We can hack this because we can assume everything will fit in the same bank. Later in the next version this won't matter, we will just make it 32 bits.
more to come
Pointers, char *
My take on pointers is that they will be ints on the symbol table. Like functions and ints. All locations on the table. Same with CHAR types? but char needs to be 1 byte, I think. But char* is just a pointer we look at like a char. more to come
Since we're on a 16 bit symbol table, a pointer is a 16-bit address (same size as int). Same storage, same stack slot. The difference is how we treat it. We treat this as a pointer to within the same bank. So for this TinyC all pointers are within the same bank as the code.
Three things needed:
&xaddress-of operator*pdereference read
The pointer value is in A. Build a 24-bit pointer in a scratch register to load a value. Ex. TLK.
*p = exprdereference write
Parse the expression (value in A), get the pointer into B, store through (ex. TLK).
For char * the difference is it's width at load. char * is 8 bits, int * is 16 here.
Strings
We should be able to do strings with char*. Char * was the next thing after int *.
It was the usual. add TOK_CHAR. cc_kw_char. etc.
The usual pointer clobber bugs.
Globals
Can fake these with ints I think. more to come
Array indexing
Decomposes into pointers IIRC. more to come
