sd:about
Differences
This shows you the differences between two versions of the page.
| Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
| sd:about [2026/03/20 11:37] – appledog | sd:about [2026/03/20 12:15] (current) – removed appledog | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | //= About | ||
| - | |||
| - | In 1981, the C64 was not trying to be a C64, it was just trying to be a computer. Today, the C64 is trying to be a C64; it's a simulation -- an emulator. The C64 Ultimate is a simulator running on an AMD Xilinx Artix-7 FPGA. It's an official Commodore product, and it does a //very// good job of emulating a C64. If no one said anything, no one would know the difference. | ||
| - | |||
| - | The SD-8516 is also an emulator. An emulator for a computer that never existed. It's not trying to be a 6502, a Z80 or a 68000. It's just trying to be a computer. | ||
| - | |||
| - | //" | ||
| - | |||
| - | == 1975: MOS-6502 | ||
| - | The MOS Technology 6502, released in 1975, is arguably the most influential 8-bit microprocessor ever created. Designed by Chuck Peddle and a small ex-Motorola team, it was deliberately engineered to be dirt-cheap (volume price ~$25 vs. $179 for the Intel 8080) while still delivering excellent performance. This radical cost breakthrough, | ||
| - | |||
| - | Core specs: True 8-bit CPU with 16-bit addressing (64 KB), only three general-purpose registers (A, X, Y), a dedicated 256-byte stack page, and exceptionally powerful zero-page indexed addressing modes that effectively gave it “extra registers.” Typical clocks were 1–2 MHz **//*(up to 4 MHz in later binned parts),//** with a tiny transistor count (~3,500) and superb interrupt handling. | ||
| - | |||
| - | The SD-8516 is the modern spiritual successor built specifically for the 6502/C64 vibe but taken to the next level. It’s a clean-slate 16-bit load/store architecture (16 × 16-bit general-purpose registers, 24-bit addressing) running at a locked 4 MHz -- exactly 4x a stock Commodore 64 (2x a C128). While not a direct silicon descendant of the 6502, it deliberately captures the same “hackable, | ||
| - | |||
| - | === 1975: DEC PDP-11/70 | ||
| - | The DEC PDP-11/70 (1975) was a high-performance 16-bit minicomputer with a ~5 MHz (300ns-333ns) microcycle CPU, roughly 3x–4x faster than a PDP-11/34A. It utilized a 2 KB cache, a fast 32-bit memory path supporting up to 4 MB of memory, and dedicated Massbus controllers, | ||
| - | |||
| - | The SD-8516 at 4 MHz equivalent, with its register-rich architecture and eight-cycle MUL, lands squarely in early-80s minicomputer territory. A PDP-11/70 cost $350,000 in 1975. Today, the SD-8516 matches it as a single-board microcomputer. That's the same revolution the 68000 represented; | ||
| - | |||
| - | === 1976: Z80 | ||
| - | The Zilog Z80, launched in 1976 by Federico Faggin (one of the Intel 4004/8080 designers), was designed as a superset of the Intel 8080 with full binary compatibility plus a huge list of new instructions, | ||
| - | |||
| - | It dominated European and business microcomputing and remains one of the longest-lived 8-bit designs. | ||
| - | |||
| - | These CPUs, as well as the SD-8516, perfectly illustrate "keep it simple, make it fast and cheap, and let programmers have fun". The Z80 alone produced wildly different yet equally legendary platforms, from business machines to gaming systems. The SD-8516 carries that same joyful spirit into the future for a new generation of retro hackers. | ||
| - | |||
| - | === 1978: Intel 8086 | ||
| - | The Intel 8086, introduced in 1978, was designed as a stopgap. Intel' | ||
| - | |||
| - | The 8086 runs at 5-10 MHz with a 16-bit data bus, a 20-bit segmented address space (1 MB, accessed through four segment registers that shift and add to produce physical addresses), and four general-purpose 16-bit registers (AX, BX, CX, DX) that can be split into eight 8-bit halves. It also has four segment registers, a stack pointer, base pointer, and two index registers. The instruction set is notoriously irregular; certain operations can only be performed with specific registers (CX for loops, AX for multiply, DX:AX for 32-bit results). The segmented memory model, where every address is computed as segment x 16 + offset, was the defining headache of PC programming for over a decade. Hardware multiply exists but is painfully slow at 118-133 cycles for a 16×16 operation. The 8086 at 4.77 MHz achieved approximately 300 Dhrystones — roughly 0.17 DMIPS. | ||
| - | |||
| - | The 8088 variant in the IBM PC was even slower, with an 8-bit external bus that halved memory bandwidth. Most benchmark comparisons of the era showed the 8 MHz 68000 outperforming the 4.77 MHz 8088 by a factor of 5-7x on real workloads. The 8086/8088 won the market not on technical merit but on IBM's brand, the open architecture that enabled clones, and the CP/M software migration path through MS-DOS. | ||
| - | |||
| - | At nearly identical clock speeds, the SD-8516 delivers more than 5x the processing power of the 8086. The reasons are architectural, | ||
| - | |||
| - | The multiply gap is the widest of any comparison. The 8086's MUL takes 118-133 cycles -- roughly 25 to 28 microseconds at 4.77 MHz. The SD-8516 can complete the same operation in under eight cycles at 4 MHz, or under 2 microseconds. That is better than a 12:1 advantage per multiply. For workloads like prime number generation, encryption, or any game physics involving scaling or rotation, this single difference transforms what is computationally feasible in real time. | ||
| - | |||
| - | Forth on the 8086 was notoriously awkward. The segment registers interfere with the dual-stack model, the register specialization prevents clean assignment of dedicated stack pointers, and the 16-bit cell size limits numeric range. Estimates place 8086 Forth at roughly 15,000 words/sec at 4.77 MHz. The SD-8516' | ||
| - | |||
| - | In the broader context: the IBM PC succeeded despite the 8086, not because of it. The SD-8516 represents the road not taken; a clean, flat-addressed, | ||
| - | |||
| - | === 1979: Motorola 68000 | ||
| - | The 68000 was the chip that brought minicomputer power to the desktop. Introduced by Motorola in 1979, it was a quantum leap over every 8-bit processor on the market. With a 32-bit internal architecture (though externally 16-bit), eight 32-bit data registers, seven 32-bit address registers, and a clean orthogonal instruction set, the 68000 was widely considered the finest microprocessor architecture of its generation. It powered the original Apple Macintosh, the Commodore Amiga, the Atari ST, the Sega Genesis, and nearly every Unix workstation of the early 1980s. | ||
| - | |||
| - | The original 68000 ran at 4 MHz with a 24-bit address bus (16 MB), hardware 16×16=32 multiply (MULU/MULS, 38-70 cycles), and hardware divide (DIVU/ | ||
| - | |||
| - | The SD-8516 and the 68000 share a remarkable number of architectural features: 24-bit addressing, dual stack pointers, pre-decrement/ | ||
| - | The SD-8516 matches the 68000' | ||
| - | |||
| - | The SD-8516' | ||
| - | |||
| - | In terms of Forth performance, | ||
| - | |||
| - | == The Secret Sauce: A Fast MUL | ||
| - | The SD-8516' | ||
| - | |||
| - | The // | ||
| - | |||
| - | But 3 to 4 cycles is nothing compared to one. //According to the lore,// Stellar Dynamics bet the bank and licensed a 128 KB mask ROM die from an unknown Japanese semiconductor partner. Hitachi, Toshiba, Sharp and many others were all doing custom ROM for game cartridges in that era. This ROM, bonded directly into the SD-8516 package, contains a complete 8 bit to 16 bit multiplication table. The MUL microcode concatenates the two 8-bit operands as a 16-bit ROM address and reads the 16-bit result in a single bus cycle. For partial products it then sums them with the internal ALU. A total of no more than 4 ROM lookups and 3 additions. Pipelined, an 8x8 mul into 16 bits could be done in //one CPU cycle,// although 16 bit and above could take up to 8 cycles. And thus, the SD-8516' | ||
| - | |||
| - | Table lookup MUL was not new, but it was typically impractical due to cost and die area. Stellar Dynamics' | ||
| - | |||
| - | Its DIV uses a similar but smaller ROM (a reciprocal table, 512 bytes) combined with a Newton-Raphson iteration step. This uses two ROM lookups and one multiply to give a quotient in the same cycle as MUL. This technique was eventually used in the AMD Am29000 and later in the Pentium' | ||
| - | |||
| - | == 1981: The SD-8516 | ||
| - | Origins: Stellar Dynamics Epsilon Containment Facility (1980-1982). | ||
| - | |||
| - | In the summer of 1980, three engineers at Stellar Dynamics were given authorization to create the Epsilon Containment facility alongside Black Mesa and Aperture Science, after the G-Man had given each a random sample from another world. The ECF received a sample of 80s protoculture known as " | ||
| - | |||
| - | Dr. Issac Korr and Dr. Vance Halberg headed the project, reporting to a Dr. Magnusberg. | ||
| - | |||
| - | The 68000 was a general-purpose marvel. Clean, orthogonal, elegant; but it made compromises for breadth. Its 16-bit external bus was a cost concession. Its 38-to-70-cycle multiply was a die area concession. Its instruction decoder, designed to handle every conceivable addressing mode with equal grace, occupied silicon that could have been spent on making common operations faster. | ||
| - | |||
| - | Korr wanted to build a chip that was unapologetically opinionated. Where the 68000 asked "what might a programmer need?", | ||
| - | |||
| - | === Design Philosophy: The Controversial Choices (1981) | ||
| - | The SD-8516 design began in earnest in January 1981 and was driven by three principles that were, at the time, considered somewhere between unorthodox and reckless. | ||
| - | |||
| - | __**First: registers are cheaper than memory cycles.**__\\ While Intel was spending transistors on segment arithmetic and Zilog was adding prefix bytes to squeeze more instructions from an aging architecture, | ||
| - | |||
| - | The composite register system emerged from Dr. Issac Korr's observation that most programs use pointers and data simultaneously, | ||
| - | |||
| - | __**Second: Two stack pointers, not one.**__\\ Every major programming paradigm (Forth, C, Pascal, even structured BASIC) maintains at least two stacks: one for data, one for return addresses. Every existing processor forced programmers to share a single hardware stack pointer between both, or to dedicate a general-purpose register as a second stack pointer and manage it manually. The SD-8516 made the dual-stack model architectural, | ||
| - | |||
| - | __**Third, and most controversially: | ||
| - | |||
| - | //The Multiplication Table: Stellar Dynamics' | ||
| - | |||
| - | In 1980, an 8-bit multiply took 130 cycles on a 6502, 200 cycles on a Z80, and 70 cycles on a 68000. Even the fastest implementation — the 6809's dedicated 11-cycle MUL — could only handle 8×8 products. The SD-8516 was designed to multiply 32-bit values in a single clock cycle. | ||
| - | |||
| - | The secret was a 128 KB mask ROM die, bonded directly into the processor package, containing a complete 8x8 to 16-bit multiplication lookup table. To multiply two 8-bit values, the microcode simply concatenated them into a 16-bit ROM address and read the 16-bit result in one bus cycle. For 16-bit and 32-bit multiplies, the operation was decomposed into four 8×8 partial products using the ROM and summed by the ALU. The entire sequence was pipelined to complete in a single processor cycle. | ||
| - | |||
| - | A contributing factor was the unrolled microcode used in its design. While considered sloppy by some, at three times the transistor count of a 68000 (and almost 10x that of an 8086) the design choice paid off with the ability to overclock from 4 MHz to 16 MHz and above. The chip was over-engineered to a fault; some users reported LNG-cooled chips running at over 300 MHz. | ||
| - | |||
| - | The 128 KB ROM coupled with a larger die size was, in 1981, outrageously expensive. By 1978 the 6502 cost $4 or $5 to make and sold for $25. The 8086 cost $10 to $20 to make; launched at $87, by the early 80s its price had fallen to under $20. The 68000 had a similar story: launched for over $400, it was later bought in bulk by Steve Jobs, who famously negotiated a price of around $15 per unit. | ||
| - | |||
| - | The straw that broke the camel' | ||
| - | |||
| - | Dr. Issac Korr, who designed the multiply unit, later reflected: " | ||
| - | |||
| - | He was right. The multiply ROM was the chip's signature feature and its single greatest competitive advantage. But it was also the reason the chip nearly died at birth. | ||
| - | |||
| - | === Critical Success, Commercial Failure | ||
| - | The SD-8516, formally announced in March 1982 and available in sample quantities by September, offered the following specifications: | ||
| - | |||
| - | * **Architecture: | ||
| - | * **Registers: | ||
| - | * **Address space:** 24-bit flat addressing (4 banks × 64 KB = 256 KB) | ||
| - | * **Data path:** 16-bit internal, 32-bit operations via register pairing | ||
| - | * **Clock speed:** 4 MHz (initial), later 8 MHz and 16 MHz | ||
| - | * **Multiply: | ||
| - | * **Divide:** 8x8-bit, 2 cycles (512-byte reciprocal ROM + Newton-Raphson) | ||
| - | * **Stack pointers:** 2 dedicated (Stack and Data Stack) | ||
| - | * **Interrupts: | ||
| - | * **Performance: | ||
| - | * **Transistor count:** ~168,000 (including multiply ROM) | ||
| - | * **Process: | ||
| - | * **Package: | ||
| - | * **Price:** $189 (initial, single unit, 1982) | ||
| - | |||
| - | === A Savior from Japan | ||
| - | The SD-8516 launched into a market that did not want it. | ||
| - | |||
| - | At $189 per unit, it cost six times more than a Z80, four times as much as a 68000, and more than double an 8086. The 128 KB multiply ROM, the chip's greatest technical achievement, | ||
| - | |||
| - | In the business and banking world, its overstated power was largely seen as unneeded at the time. System designers who needed fast multiplication were already using the 68000 with acceptable results. Those who needed a cheap CPU bought the Z80 or 6502. The SD-8516 fell into the gap between single-user and enterprise-class time-sharing and never found its niche. | ||
| - | |||
| - | IBM had quickly standardized the industry around the 8088 architecture in response. The Macintosh was about to standardize creative professionals on the 68000. Stellar Dynamics had no ecosystem, no software library, and no Fortune 500 patron. The VC-3, their reference personal computer design, was technically impressive -- a complete system with banked memory, SID-inspired sound synthesis, and multiple video modes -- but it competed against machines backed by companies with million-dollar marketing budgets and thousand-man sales teams. | ||
| - | |||
| - | By mid-1983, Stellar Dynamics had sold fewer than 5,000 units total. Their Japanese and Taiwanese investors were losing patience. Dr. Magnussberg reportedly mortgaged his house to make payroll in November 1983. | ||
| - | |||
| - | The turning point was when Dr. " | ||
| - | |||
| - | But Shimajiro' | ||
| - | |||
| - | |||
| - | === The Arcade Pivot (1984-1987) | ||
| - | What saved Stellar Dynamics was not the personal computer market but the one market where single-cycle multiply was not a luxury; it was a necessity. | ||
| - | |||
| - | Arcade game hardware in 1984 was undergoing a transformation. Sprite scaling, rotation, and pseudo-3D effects required real-time multiplication at rates that brought the 68000 to its knees. A single sprite rotation required dozens of multiplies per frame. At 60 frames per second, the 68000' | ||
| - | |||
| - | But the SD-8516 had solved this problem before it became an issue, and sales began to pick up. Shimajiro' | ||
| - | |||
| - | Namiko was the first major licensee, integrating the SD-8516 into a custom arcade board in late 1985. Conam Co. followed in 1986. By 1987, the SD-8516 was present in over a dozen arcade platforms, and Stellar Dynamics had quietly become profitable. The chip's price dropped to under $50 as volumes increased. | ||
| - | |||
| - | === The Cult Following (1987-1989) | ||
| - | Arcade success attracted a different kind of attention. University computer science departments, | ||
| - | |||
| - | A community formed. It was small, intense, and disproportionately influential. Forth programmers -- already accustomed to being a cult -- adopted the SD-8516 as their ideal machine. The dual stack pointers, the pre-decrement/ | ||
| - | |||
| - | The BBS scene, still thriving in the early 1990s, began to adopt the VC-3 as a cult platform. Its KERNAL ROM, inspired by Commodore' | ||
| - | |||
| - | === The Business Surprise (1989-1992) | ||
| - | The SD-8516' | ||
| - | |||
| - | By 1989, the IBM PC ecosystem was drowning in its own legacy. DOS applications were bumping against the 640 KB conventional memory barrier. The 80286' | ||
| - | |||
| - | These terminals ran lean. The SD-8516' | ||
| - | |||
| - | Stellar Dynamics licensed the architecture to several different Japanese and Taiwanese manufacturers. By 1991, SD-8516-based terminals were outselling 80486-class PCs in several Asian markets. A version with expanded banking (16 banks × 64 KB = 1 MB) addressed the memory ceiling on the base model, and a line of business peripherals -- dot matrix printers, serial networking cards, and external floppy drives -- rounded out the ecosystem. | ||
| - | |||
| - | The installed base, combining arcade hardware, university systems, BBS machines, and business terminals, reached an estimated 2.4 million units by 1992. Stellar Dynamics, the company that nearly died in 1983, was valued at... //one **million** dollars.// | ||
| - | |||
| - | === The Silver Age (1990-1995) | ||
| - | What happened next was unprecedented and, in retrospect, slightly insane. | ||
| - | |||
| - | The SD-8516' | ||
| - | |||
| - | Programmers who had grown up on Commodore 64s and Apple IIs -- the generation that learned to code by poking bytes and timing raster interrupts -- found in the SD-8516 a machine that respected their skills. You could understand the entire system. You could hold the memory map in your head. You could write to the framebuffer directly and hear the results from the sound chip immediately. The VC-3, with its 320×200 graphics modes, SID-inspired 8-voice synthesizer, | ||
| - | |||
| - | The demoscene adopted it. The indie game scene adopted it. A small but vocal contingent of programmers argued, not entirely without merit, that the SD-8516 represented the last generation of computers that a single person could fully comprehend. | ||
| - | |||
| - | This philosophy -- that a computer should be knowable, that complexity has costs beyond performance -- delayed the adoption of 3D acceleration in the SD-8516 ecosystem by nearly a decade. While the PC world raced toward texture-mapped polygons, dedicated GPU pipelines, and hardware T&L units, SD-8516 developers perfected the art of software rendering. Mode 7-style floor effects, raycasting engines, and Bresenham line drawing techniques that PC developers abandoned the moment they had GPU hardware were refined to extraordinary levels on the SD-8516. | ||
| - | |||
| - | The results were, by any objective measure, technically inferior to what a Pentium with a Voodoo card could produce. They were also, by a measure that resists quantification, | ||
| - | |||
| - | === The Sunset Era (1995-present) | ||
| - | The SD-8516 did not survive the 2000s as a commercial platform. It couldn' | ||
| - | |||
| - | The cult never died. It just went online. | ||
| - | |||
| - | Emulators appeared in the early 2000s. Browser-based implementations followed, running the SD-8516 in WebAssembly at speeds the original hardware could only dream of. The community, small but persistent, continued to write FORTH interpreters, | ||
| - | |||
| - | The SD-8516' | ||
| - | |||
| - | Every few years, someone on a retro-computing forum asks: "What if IBM had chosen the SD-8516 instead of the 8088?" The question is unanswerable and irresistible. The SD-8516 was never going to win. It was too expensive, too opinionated, | ||
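
The lookup-table multiply described under "The Secret Sauce: A Fast MUL" can be sketched in a few lines. This is only an illustrative model, assuming the ROM is addressed by concatenating the two 8-bit operands (first operand in the high byte) and that a 16x16 multiply is built from four 8x8 lookups and three additions, as the text states. The names `MUL_ROM`, `rom_mul8`, `mul16`, and `ROM_BYTES` are hypothetical and do not come from the source; the model says nothing about cycle timing.

```python
# Hypothetical software model of the table-lookup multiplier.
# Names and operand ordering are assumptions for illustration only.

def build_mul_rom():
    """Build the 65,536-entry table: address = (a << 8) | b, value = a * b."""
    return [(addr >> 8) * (addr & 0xFF) for addr in range(1 << 16)]

MUL_ROM = build_mul_rom()

# Each entry holds a 16-bit product (at most 255 * 255 = 65025),
# so the table occupies two bytes per entry.
ROM_BYTES = len(MUL_ROM) * 2

def rom_mul8(a, b):
    """8x8 -> 16-bit multiply as a single table read."""
    return MUL_ROM[((a & 0xFF) << 8) | (b & 0xFF)]

def mul16(x, y):
    """16x16 -> 32-bit multiply from four ROM lookups and three additions."""
    xh, xl = (x >> 8) & 0xFF, x & 0xFF
    yh, yl = (y >> 8) & 0xFF, y & 0xFF
    ll = rom_mul8(xl, yl)   # low  * low, weight 2^0
    lh = rom_mul8(xl, yh)   # low  * high, weight 2^8
    hl = rom_mul8(xh, yl)   # high * low, weight 2^8
    hh = rom_mul8(xh, yh)   # high * high, weight 2^16
    return ll + (lh << 8) + (hl << 8) + (hh << 16)
```

With two bytes per entry, the table in this sketch comes to 65,536 × 2 = 131,072 bytes, which matches the 128 KB ROM size given in the text. Calling `mul16(1234, 5678)` returns the same value as `1234 * 5678`, since the four partial products are simply the schoolbook expansion of the two 16-bit operands into bytes.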
sd/about.1774006670.txt.gz · Last modified: by appledog
