sd:one_instruction_per_cycle
Differences
This shows you the differences between two versions of the page.
| sd:one_instruction_per_cycle [2026/03/11 07:48] – created appledog | sd:one_instruction_per_cycle [2026/03/12 05:24] (current) – appledog | ||
|---|---|---|---|
| Line 81: | Line 81: | ||
| * Remember which operations set flags so you don't waste time on a CMP. | * Remember which operations set flags so you don't waste time on a CMP. | ||
| + | |||
| + | == Oh help, I'm running out of registers! | ||
| + | It is easy to get confused over register pairing and run out of registers for your convention. In this case there' | ||
| + | |||
| + | LDA $500 ; loads $500 into A. | ||
| + | |||
| + | The above compiles to: 00 00 05 00. That's four bytes that need to be read in. Let's say, then, that the cost of this instruction is loading four bytes from WASM memory. That is the essential cost; the CPU fetch-execute cycle itself has a fixed cost, so we can ignore it or add some constant. | ||
| + | |||
| + | Now consider: | ||
| + | |||
| + | MOV A, B | ||
| + | |||
| + | This costs three bytes to fetch and execute: one byte for the opcode and one for each register. What about loading from memory? | ||
| + | |||
| + | LDA [$000000] ; This is a 5-byte instruction. | ||
| | | ||
| + | At five bytes, it's 30-40% slower, on average, than LDA $500. But for scratch values that are infrequently used and not in a hot path, this frees up registers. In other words, feel free to use memory! | ||
| + | Now, if you consider the number of operations and bytes a simple push/pop would use (to juggle a register), maybe it's better to keep certain kinds of pointers or scratch values in memory rather than in a register? It would need to be done with consideration, | ||
| | | ||
sd/one_instruction_per_cycle.txt · Last modified: by appledog
