Tuesday, September 15, 2015

8 bit machine

I have been making an 8-bit emulator for a non-existent machine. The specs of the machine are:

Processor:        8-bit AVR
Instruction set:  Arduino compatible
Program memory:   128k (0000-FFFF x 16-bit)
RAM:              64k (0000-FFFF x 8-bit)
Display:          480x360 / 240x180

There is a 512x392 24-bit FrameBuffer that exists as a necessity for rendering onto a canvas. The CPU does not have direct access to the buffer and cannot read from it at all.

If I were writing an emulator that was closer to the 8-bit machines of years past, there would be no framebuffer. In those days pixels were all about timing. You coloured the pixel as the video beam went past. Video chips (or sometimes the CPU itself) had to construct the image line by line. Few systems had enough memory to hold a completed grid of pixels ready to go.

The transience of the display was also an advantage for those old machines. By changing the video hardware parameters during the screen update you could achieve a variety of effects. Display hardware that could show a limited number of sprites could have more appear on screen than there actually were, because you could shift them around after the video beam had passed so that the beam would see them a second time further down the screen. You could bend and colour things on the screen using similar so-called raster effects. What was an advantage for the old is, however, a disadvantage for the new. Emulators that strive for accurate emulation of old video hardware must put in a great deal of extra work to measure the timing and to generate an image from an emulated video beam on an emulated video display. Usually emulators have to do this to fill a FrameBuffer image that will be placed on-screen by the operating system as a single bulk operation.

I decided to make the FrameBuffer a defined part of this machine in order to reduce the workload of the emulator. It means the raster effects of old video displays will not be possible, but in return the machine gains some advantages because it can construct a frame at a time instead of a series of sequential pixels.

The FrameBuffer that the emulator uses is 802,816 bytes (512 x 392 pixels at 4 bytes each). In its raw form it would be cumbersome and slow for an 8-bit processor to write to. The CPU interface to the FrameBuffer aims to allow as much as possible for as little data and processing as possible.

There are 16 I/O ports associated with the video output (currently based at I/O port 0x20; this may change).

+ 0x00  pixelData_displayStart_L
+ 0x01  pixelData_displayStart_H   16-bit address
+ 0x02  colorData_displayStart_L
+ 0x03  colorData_displayStart_H   16-bit address
+ 0x04  pixelData_increment
+ 0x05  colorData_increment
+ 0x06  pixelData_lineIncrement    value << 3 to make 11 bits
+ 0x07  colorData_lineIncrement    value << 3 to make 11 bits
+ 0x08  displayShift XY            4 bits X, 4 bits Y
+ 0x09  serialPixel_address_L      FrameBuffer pixel number
+ 0x0A  serialPixel_address_M
+ 0x0B  serialPixel_address_H      24-bit address
+ 0x0C  serialPixel_set            pixel=palette[v], serialPixelAddress+=1
+ 0x0D  serialPixel_mul            pixel.rgb=(pixel.rgb*palette[v].rgb) >> 8 (does not advance address)
+ 0x0E  serialPixel_add            pixel.rgb+=palette[v].rgb, serialPixelAddress+=1 (clamped add)
+ 0x0F  drawFrameBuffer            0=lowRes, 1=hires, 0x10=Mode 0
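
As a rough illustration of how a program might use these ports, here is a small Python sketch that lists the port writes needed to set one pixel through the serial pixel interface. The helper name plot_writes is made up for this example, and it assumes the FrameBuffer pixel number is simply y * 512 + x (row-major order), which the register list does not actually state.

```python
VIDEO_BASE = 0x20  # current base port from the post; may change

# Offsets taken from the register list above.
SERIAL_ADDR_L = 0x09
SERIAL_ADDR_M = 0x0A
SERIAL_ADDR_H = 0x0B
SERIAL_SET = 0x0C

FB_WIDTH = 512  # FrameBuffer width in pixels


def plot_writes(x, y, colour):
    """Return the (port, value) writes that would set one pixel.

    Assumption: the pixel number is y * FB_WIDTH + x (row-major).
    """
    n = y * FB_WIDTH + x
    return [
        (VIDEO_BASE + SERIAL_ADDR_L, n & 0xFF),          # low byte
        (VIDEO_BASE + SERIAL_ADDR_M, (n >> 8) & 0xFF),   # middle byte
        (VIDEO_BASE + SERIAL_ADDR_H, (n >> 16) & 0xFF),  # high byte
        (VIDEO_BASE + SERIAL_SET, colour & 0xFF),        # palette index
    ]
```

Because serialPixel_set advances the address by itself, a run of adjacent pixels only needs the three address writes once, followed by one write per pixel.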

The CPU controls when the virtual framebuffer is transferred to the screen. Writing a zero to port drawFrameBuffer will show the lowres portion of the framebuffer (pixel doubled). Writing a one will show a 480x360 portion of the framebuffer. In the framebuffer-to-screen transfer, the top-left corner of the source can be offset by the displayShift register: 4 bits of X and 4 bits of Y. This allows for 16 pixels of horizontal and vertical 'hardware' scrolling.
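
The transfer described above could be sketched roughly like this in Python. The function name visible_window is invented for the example, and the nibble order of displayShift (X in the high nibble, Y in the low nibble) is an assumption, since only "4 bits X, 4 bits Y" is specified.

```python
FB_W, FB_H = 512, 392  # FrameBuffer size in pixels


def visible_window(shift_reg, hires):
    """Return (x0, y0, width, height, scale) of the source rectangle.

    Assumption: the high nibble of displayShift is X, the low nibble is Y.
    """
    sx = (shift_reg >> 4) & 0x0F
    sy = shift_reg & 0x0F
    if hires:
        w, h, scale = 480, 360, 1
    else:
        w, h, scale = 240, 180, 2  # low-res is pixel doubled on screen
    # Even the largest shift (15, 15) keeps the window inside the buffer.
    assert sx + w <= FB_W and sy + h <= FB_H
    return sx, sy, w, h, scale
```

The 512x392 buffer is 32 pixels larger than the 480x360 display in each direction, so a shift of up to 15 pixels never reads outside it.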

Writing values of 0x10 or higher will fill the framebuffer with data from CPU RAM using different display modes. So far I have only defined one mode: a 1.777 bit per pixel mode (9 pixels per 2 bytes). As on many old 8-bit systems, colour data and pixel data can be treated differently, and the starting address for reading either can be adjusted. Adding an individual increment and a line increment for each allows the structure of the data to be arranged, and optionally interleaved, to assist the task at hand.
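
To make the increment registers a little more concrete, here is a Python sketch of one way the read addresses could be generated. It is a guess at the behaviour rather than the emulator's actual code: the name cell_addresses, the cols and rows parameters, and the 16-bit wrap-around are all assumptions.

```python
def cell_addresses(start, increment, line_increment, cols, rows):
    """List the RAM address read for every cell, row by row.

    Assumptions: each line starts line_increment << 3 bytes after the
    previous line's start, and addresses wrap within 64k of RAM.
    """
    line_step = line_increment << 3  # 8-bit register value becomes 11 bits
    lines = []
    for row in range(rows):
        addr = (start + row * line_step) & 0xFFFF
        line = []
        for _ in range(cols):
            line.append(addr)
            addr = (addr + increment) & 0xFFFF
        lines.append(line)
    return lines


# Interleaved layout: pixel byte then colour byte for each cell.
pixel_addrs = cell_addresses(0x1000, 2, 40, cols=3, rows=2)
colour_addrs = cell_addresses(0x1001, 2, 40, cols=3, rows=2)
```

With both increments set to 2 and the colour start one byte after the pixel start, each cell's two bytes sit next to each other in RAM. Setting the increments to 1 and using two separate start addresses would give two independent planes instead.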

To show you my thinking on this design, I'll first go into a quick overview of how things used to be done. Display modes on old 8-bit computers varied a great deal. A large part of their design was structured around how to get the most expressive images from the fewest bytes. It would have been less work for video hardware to read a single pixel, output it to the screen, and repeat the process for every pixel on-screen. Unfortunately, the cost of the RAM would have been prohibitive, and the CPUs of the day would have been far too slow to move that amount of memory around anyway.

The simplest alternative was the bit-per-pixel display, where memory was loaded into a register and then shifted out one bit at a time to make pixels. But that only got you on or off pixels. You didn't have colour. There were many ways to add colour for just a little more data. The ZX Spectrum took the extremely simple approach of sharing the same colour value across 64 pixels. An 8x8 block of pixels used 8 bytes for the pixels and an additional byte for the colour (averaging 1.125 bits per pixel). It got you colour, but it also got you attribute clash.

Other systems fared better than the Spectrum, largely due to the presence of sprites. Sprites were rendered onto the display independently of the main screen data, so they did not have to worry about attributes. They conserved memory through the simple fact that they were not very large. The C-64 and some other systems let you double the pixels of your sprites, essentially allowing them to cover more space at zero cost in storing or fetching memory. Sprites enabled consoles to go even further. They eschewed even the bit-per-pixel approach and embraced tiled displays. This extended the concept of colour attributes back to the pixels themselves. You could only change the screen data a cell at a time. This makes it very hard to do independently moving objects, but that task was almost completely handed over to sprites. The data savings gained by reducing the flexibility were spent on adding more colours.

The tl;dr of all that is this: if you can make a better-looking image at the cost of being able to manipulate it, that's not so bad if you can draw on top of it afterwards.

For Mode-0 in my emulator I use two bytes per 3x3 cell. One byte is colour data: colour A gets 4 bits and colour B gets 4 bits. A and B are colours in the first 16 entries of the palette. The other byte is the pixel data. The top-left pixel is always colour A; the other 8 pixels in the cell are A or B depending on whether the corresponding bit in the pixel data is a zero or a one. All pixels are individually modifiable. Flipping a bit in the pixel data changes a single pixel for 8 of the 9. The top-left is a special case: it can be toggled by swapping A and B and flipping all the pixel data bits. This gives a 1.77 bit per pixel display with the limitation that only 2 colours may appear in any 3x3 cell. It uses slightly more data than the ZX Spectrum approach but has significantly less potential for attribute clash. It has the additional cost of being rather awkward to address. Which is why we draw things on top...
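
Here is a small Python sketch of how a Mode-0 cell might be decoded, including the top-left trick. The function names are invented for the example, and two details are assumptions: that colour A lives in the high nibble of the colour byte, and that bit 0 of the pixel byte maps to the second pixel with the rest following in row-major order.

```python
def decode_cell(colour_byte, pixel_byte):
    """Expand one 3x3 cell into rows of palette indices (0-15).

    Assumptions: colour A is the high nibble, and bit 0 of pixel_byte
    is the pixel just right of the top-left, continuing row by row.
    """
    a = (colour_byte >> 4) & 0x0F
    b = colour_byte & 0x0F
    cell = [a]  # the top-left pixel is always colour A
    for bit in range(8):
        cell.append(b if (pixel_byte >> bit) & 1 else a)
    return [cell[0:3], cell[3:6], cell[6:9]]


def toggle_top_left(colour_byte, pixel_byte):
    """Swap A and B and flip every pixel bit; only the top-left changes."""
    swapped = ((colour_byte << 4) | (colour_byte >> 4)) & 0xFF
    return swapped, pixel_byte ^ 0xFF


before = decode_cell(0x12, 0b00000101)
after = decode_cell(*toggle_top_left(0x12, 0b00000101))
```

In this example before is [[1, 2, 1], [2, 1, 1], [1, 1, 1]] and after is [[2, 2, 1], [2, 1, 1], [1, 1, 1]]: the swap and the bit flip cancel out for the 8 ordinary pixels, leaving only the top-left changed.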
Instead of implementing sprites as a structured display hardware sprite system, the existence of a FrameBuffer allows the CPU to provide a sprite-like layer in a more direct manner. The display interface has an address for writing to pixels directly. In addition to being able to set individual pixels in this manner, it also allows a limited amount of blending. Writing a byte of data to the serialPixel_set, serialPixel_mul, or serialPixel_add port triggers a set, multiply or add using a palette colour. To draw a sprite where some pixels are transparent, you can set the non-transparent pixels and add 0 for the transparent pixels (to advance the pixel write address). If you consider palette colours as being premultiplied alpha, you can use a combination of multiply and add to render variable levels of transparency. You can theoretically write the entire screen with this process, but the CPU cost would be substantial. Using it as a sprite-style solution means you can gain flexibility at the few places where you most need it.
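
The three serial pixel operations can be modelled in a few lines of Python. This is only a sketch based on the register descriptions: the class name SerialPixelPort is invented, and DEMO_PALETTE is a tiny made-up palette for illustration, not the machine's real 256-colour one (it assumes entry 0 is black so that adding it changes nothing).

```python
# Hypothetical 4-entry palette, for illustration only.
DEMO_PALETTE = [(0, 0, 0), (255, 0, 0), (128, 128, 128), (100, 0, 0)]


class SerialPixelPort:
    def __init__(self, palette, size):
        self.palette = palette
        self.pixels = [(0, 0, 0)] * size
        self.address = 0

    def set(self, v):
        # pixel = palette[v], then advance
        self.pixels[self.address] = self.palette[v]
        self.address += 1

    def mul(self, v):
        # pixel.rgb = (pixel.rgb * palette[v].rgb) >> 8, no advance
        p, c = self.pixels[self.address], self.palette[v]
        self.pixels[self.address] = tuple((x * y) >> 8 for x, y in zip(p, c))

    def add(self, v):
        # clamped add, then advance
        p, c = self.pixels[self.address], self.palette[v]
        self.pixels[self.address] = tuple(min(255, x + y) for x, y in zip(p, c))
        self.address += 1


port = SerialPixelPort(DEMO_PALETTE, 4)
port.set(1)   # opaque red pixel
port.add(0)   # transparent pixel: add black, just advances
port.address = 0
port.mul(2)   # darken what is already there (about half)...
port.add(3)   # ...then add a premultiplied colour on the same pixel
```

Because mul leaves the address alone and add advances it, a mul followed by an add lands on the same pixel, which is what makes the premultiplied alpha trick a two-write operation per pixel.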

Pixel values written to the set, mul, and add ports are taken from a fixed 256-colour palette. The first 16 colours are from Arne's 16-colour palette. The rest of the palette contains a greyscale and a 6x6x6 colour cube where the gradients of the red, green and blue components have been shifted to more equally match brightness (to avoid a preponderance of dark blue etc.).

I'll probably add a few more varied screen modes soon. Next up should be mouse, keyboard and time though. Then I can make some games in it.

This is the emulator as it stands right now. 

It still lacks a few instructions, and undoubtedly has minor bugs in some of the implemented ones, but it is running my sample program OK.