SD-8516 PPU
This is a short reference to the XY-2000 PPU.
Introduction: PPIXEL (plot_pixel)
| Method | Pixels per Second |
|---|---|
| BASIC PIXEL command | 2,100 |
| INT 18h plot pixel | 26,000 |
| INT 18h plot pixel (no bounds check) | 30,000 |
| PPU via INT 0x18 | 90,000 |
| PPU via INT 0x03 (direct) | 220,000 |
| PPIXEL (PPU via opcode) | 460,000 |
Each figure is the number of pixels per second achieved while clearing the screen in a tight loop. This is essentially the fastest practical use of plot pixel: read X, Y and C data in a loop and plot it. If we unroll the loop 20 times, both versions shed the loop overhead, but the OP.PPIXEL version stays about 2x faster than the INT 3 version. This makes sense, since every call to INT 0x03 requires an LDA $0101 to select plot_pixel, while the PPIXEL opcode gives this for free. PPIXEL is therefore always at least twice as fast as INT 0x03 plot_pixel. It is actually a bit faster than that, since it does not cross the host-guest bridge inside the opcode call.
- However, if all you are doing is loading XYC data and calling INT $03, then theoretically their performance will be within 5%-10% of each other. INT touches memory more than PPIXEL does, but this amounts to an edge case (31 vs 30 active sprites) that is not worth the engineering headache.
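The ratios discussed above can be checked directly against the table. The snippet below is only a small sketch: the figures are copied from the table, while the `RATES` dictionary and the `speedup` helper are illustrative names, not part of the SD-8516 toolchain.

```python
# Throughput figures copied from the table above (pixels per second).
RATES = {
    "BASIC PIXEL command": 2_100,
    "INT 18h plot pixel": 26_000,
    "INT 18h plot pixel (no bounds check)": 30_000,
    "PPU via INT 0x18": 90_000,
    "PPU via INT 0x03 (direct)": 220_000,
    "PPIXEL (PPU via opcode)": 460_000,
}


def speedup(faster: str, slower: str) -> float:
    """Return how many times faster one measured method is than another."""
    return RATES[faster] / RATES[slower]


if __name__ == "__main__":
    baseline = "BASIC PIXEL command"
    for name in RATES:
        ratio = speedup(name, baseline)
        print(f"{name:40s} {ratio:7.1f}x vs BASIC")

    opcode_vs_int = speedup("PPIXEL (PPU via opcode)", "PPU via INT 0x03 (direct)")
    print(f"PPIXEL vs INT 0x03: {opcode_vs_int:.2f}x")
```

On the measured numbers, PPIXEL comes out at roughly 2.1x the INT 0x03 path, which matches the "about 2x" figure quoted above.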
460,000 pixels/second is about 7,667 pixels per frame at 60 fps. That is almost exactly thirty 16×16 sprites (7,680 pixels). The suggested use is to MEMCOPY a pre-drawn background into the framebuffer and then draw sprites with PPIXEL (if you want to use PPIXEL for sprites). This way you don't have to draw the whole screen, and you don't have to draw the sprite background. It is hard to say how many sprites you will actually be able to draw in a frame, because each sprite has more or less empty background space.
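The per-frame budget is simple arithmetic, sketched below. The function names are illustrative, and the sprite count assumes every pixel of a 16×16 sprite is plotted, which is a worst case since transparent pixels can be skipped.

```python
def pixels_per_frame(pixels_per_second: float, fps: int = 60) -> float:
    """Pixel budget available in one frame at the given frame rate."""
    return pixels_per_second / fps


def sprites_per_frame(pixels_per_second: float, fps: int = 60,
                      width: int = 16, height: int = 16) -> float:
    """Number of fully opaque width x height sprites that fit in one frame."""
    return pixels_per_frame(pixels_per_second, fps) / (width * height)


if __name__ == "__main__":
    for label, rate in [("PPIXEL opcode", 460_000), ("INT 0x03", 220_000)]:
        budget = pixels_per_frame(rate)
        sprites = sprites_per_frame(rate)
        print(f"{label}: {budget:.0f} pixels/frame, {sprites:.2f} sprites/frame")
```

Sprites with a lot of transparent space cost fewer plotted pixels, so the real count per frame can be higher than this estimate.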
This initial test shows that a PPU construct has value: it can serve as a drop-in boost to the code in INT 18h, it should initially be implemented as a subcall of INT $03 (AH=$01), and the practical value of moving to a dedicated opcode is about 95% code density and 5% improved speed.
Code Replacement
PPIXEL can be called via INT 0x03:
LDA $0100 ; AH = 1 (PPU dispatch), AL = 00 (plot_pixel)
INT $03 ; plot_pixel(X, Y, C)
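The value loaded into A combines the dispatch group and the function number. The sketch below assumes A is a 16-bit register whose high byte is AH and low byte is AL, as the comment on the LDA line implies; `pack_a` is a hypothetical helper for illustration only.

```python
def pack_a(ah: int, al: int) -> int:
    """Combine AH (high byte) and AL (low byte) into a 16-bit A value.

    Assumption: A is 16 bits wide with AH in bits 15..8 and AL in bits 7..0.
    """
    return ((ah & 0xFF) << 8) | (al & 0xFF)


if __name__ == "__main__":
    # AH = 1 (PPU dispatch), AL = 0 (plot_pixel), as in the call above.
    print(f"${pack_a(0x01, 0x00):04X}")
```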
PPU plot_pixel replaces the old plot pixel function in the graphics library (function #1 in INT 18h). This is some of the oldest code in the entire system, likely carried over from the SD-8510/VC-2:
- [I:J] addressing mode
- Manual bank 2 register load (LDI #2)
- Register pressure (PUSH X, PUSH C)
- Overuses LDA at the start; LDB only shows up near the end
- Only uses A,B,X,Y,I,J,K,T
This was one of the first routines ever written for the original VC-2, as a test to draw characters to a terminal screen. At the time I was constructing the CPU emulator itself; while writing this function I added the LDB, PAB and UAB opcodes. That's a good reason to keep this code around: it serves as the design document for the framebuffer and the renderer itself. In other words, “it works.” It's the source of truth for what it takes to plot a pixel in the framebuffer. The accelerated versions should get their own entry inside INT 0x18 rather than serve as drop-in replacements. That is probably the best way forward.
; ============================================================================
; AH=01h - Plot MODE 3 Pixel (4bpp packed nibbles)
; ============================================================================
; Input: X = x coordinate (mode dependent)
; Y = y coordinate (mode dependent)
; C = color (0-15)
; Output: CF = 0 on success, 1 if out of bounds
; ============================================================================
int18_plot_pixel_4bpp:
PUSHA
; Bounds check using SCREEN_WIDTH / SCREEN_HEIGHT
LDA [@SCREEN_WIDTH]
CMP X, A
JC @plot4_error
LDA [@SCREEN_HEIGHT]
CMP Y, A
JC @plot4_error
PUSH X ; save x
PUSH C ; save color
LDA [@SCREEN_WIDTH]
SHR A ; A = stride (width / 2)
MOV B, A ; B = stride
MOV A, Y ; A = y
MUL A, B ; A = y * stride
POP C ; restore color
POP B ; restore x into B
PUSH B ; save x for nibble check
MOV T, B
SHR T
ADD A, T ; A = byte offset
MOV J, A
LDI #2
POP A ; A = x
LDTL $01
AND AL, TL
CMP AL, $00
JNZ @plot4_odd
plot4_even:
LDBL [I:J]
MOV AL, BL
UAB ; AL = odd pixel, BL = even pixel
MOV BL, CL ; BL = new color
PAB ; AL = (new color << 4) | odd pixel
MOV BL, AL
STBL [I:J]
POPA
CLC
RET
plot4_odd:
LDBL [I:J]
MOV AL, BL
UAB ; AL = odd pixel, BL = even pixel
MOV AL, CL ; AL = new color
PAB ; AL = (even pixel << 4) | new color
MOV BL, AL
STBL [I:J]
POPA
CLC
RET
plot4_error:
POPA
SEC
RET
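For readers who do not want to trace the register shuffling, the routine above can be modelled in a few lines of Python. This is a sketch of the addressing logic only, under stated assumptions: the framebuffer is a plain `bytearray` rather than bank 2 memory, the bounds check treats `JC` after `CMP` as "coordinate is greater than or equal to the limit", the 320×200 screen size in the usage section is hypothetical, and the `& 0x0F` mask on the color is an addition that the assembly does not perform.

```python
def plot_pixel_4bpp(fb: bytearray, width: int, height: int,
                    x: int, y: int, color: int) -> bool:
    """Model of the 4bpp packed-nibble plot.

    Returns True on success and False when (x, y) is out of bounds,
    mirroring CF = 0 / CF = 1 in the assembly routine.
    """
    if not (0 <= x < width and 0 <= y < height):
        return False

    stride = width // 2            # two pixels per byte (SHR A)
    offset = y * stride + x // 2   # y * stride + (x >> 1)
    old = fb[offset]
    color &= 0x0F                  # assumption: keep only the low nibble

    if x & 1:
        # Odd x: keep the even pixel (high nibble), replace the low nibble.
        fb[offset] = (old & 0xF0) | color
    else:
        # Even x: replace the high nibble, keep the odd pixel (low nibble).
        fb[offset] = (color << 4) | (old & 0x0F)
    return True


if __name__ == "__main__":
    WIDTH, HEIGHT = 320, 200       # hypothetical screen size
    framebuffer = bytearray(WIDTH * HEIGHT // 2)
    plot_pixel_4bpp(framebuffer, WIDTH, HEIGHT, 0, 0, 0xA)
    plot_pixel_4bpp(framebuffer, WIDTH, HEIGHT, 1, 0, 0x5)
    print(hex(framebuffer[0]))
```

The even/odd split corresponds to the `plot4_even` and `plot4_odd` branches: UAB separates the two nibbles, one of them is overwritten with the new color, and PAB packs them back into a single byte.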
