Friday, April 13, 2018

Giving the Kwak8 256 colours.

I am working on a 8-bit fantasy console called the Kwak-8. Inspired by the Pico-8 it provides a platform for people who want a more low-level coding experience

For the most part the Kwak-8 can be thought of as a 16 colour machine. The display generation modes and the blitter are both based around a fixed 16 colour palette. You can achieve more colours using the serial pixel writing port. Writing a byte to the port uses a colour from an 8 bit palette. This gives you more colours but at the cost of having to use the CPU to build up the image pixel by pixel. It is something to be used sparingly and/or cleverly.

Since the 8 bit palette will be not be changeable. I wanted to pick the best 256 colour palette I could find. As a general purpose palette it has to offer a bit of everything. I could find surprisingly few palettes in this category.

First up, let's look at the VGA palette.
The VGA pallete colours. Mouse-over the 3D view to see Lab space.

The reason we start here is pretty much to dismiss it out of hand. As general purpose palettes go it's laughably bad. From a technical standpoint it has given far too many entries to cover hues, and not enough for brightness and saturation. From a non-technical standpoint, just look at it! It's just hideous. The Base 16 colours, a greyscale then three slow trips around the colour wheel for Vivid, Murky and Murkier.

Let's not use the VGA palette.

Next up is the most obvious, naive thing to try. Make the largest colour cube you can fit into 256 entries. 6x6x6 comes to 216. So you can have a colour cube with six shades of red, green, and blue.

That's actually a pretty good range, for any arbitrary colour there will be something at least in the ballpark. This seems to be why there aren't many different 256 colour general purpose palettes. This is fairly good and consequently many general purpose palettes incorporate these 216 colours. Web Safe colours use exactly these values

Beyond this point, things got much harder to find. I encountered a nice and logical enhancement at This suggests a palette with the Windows fixed system colours, a grayscale and a gamma adjusted colour cube with the corners trimmed off. You don't lose colours trimming the corners off the colour cube because those colours already exist within the fixed Windows entries.

The other interesting candidate was also a gamma adjusted colour cube, but one with a clever twist. The palette can be generated from the following javascript

 function gammaRamp(count,gamma=1.5) {
  for (let i=0; i<count; i++) {
   result.push(Math.round( Math.pow(1/(count-1)*i,1/gamma)*255));
  return result;

 function make8LevelCube(gamma=1.5) {
  let  levels = gammaRamp(8,gamma);
  let result = [];
  for (let i = 0; i<256;i++) {
   let gi = (i>>2) & 7;
   let bi = (i>>5) & 7;
   let ri = (i<<1) & 7 |  (gi&1) ^ (bi&1) ;
   result.push( [levels[ri],levels[gi],levels[bi]]);
  return result;

This makes a 8x8x8 colour cube where the least significant bit of the red intensity is the XOR of the least significant bits of the other two primaries. You get 8 primary intensity levels but at the cost of having to add just a hint of another primary colour to the mix.

It is certainly a very orderly looking palette. It's a bit hard to see that there are 8 levels of red in this arrangement, but each set of 4 reds is slightly offset from it's neigbours.

Just by looking at the palette tables, it is very hard to assess how well they represent the full range of colours. To act as a more useful guide I converted them into cylinders of hue, saturation and luminance. Each slice of the cylinder has the same saturation. I matched each point with the closest colour in the palette using both, RGB distance and the CIEDE2000 colour difference.


If you drag the saturation slider up and down you can see the relative strength and weaknesses of the palettes. The 8x8x8 colour cube actually does a very good job in general, but when you get down to grays, the lack of a dedicated grayscale shows up. The colour shift of the bit twiddling limits the number of pure grays causing it to be worse than even the 6x6x6 colour cube. The weak point of the Gamma adjusted system palette is when things are slightly off gray, low saturation colours are much better represented by the 8x8x8 palette.

If you wanted to display colour video on a 256 colour display, the 8x8x8 would probably be a very good pick, Pure gray seldom occurs naturally and it has plenty of colours in the general neigbourhood of grey.

One thing I do notice about the palettes is they all seem to have colours that are nearly indistinguishable in some areas and yet neighbouring colours in other areas have quite dramatic changes. This is to be expected with colour cubes, human eyes do not see all hues equally. The MacAdam Ellipse demonstrates the problem nicely

One person's equal perception Ellipses (scaled 10 times larger than measured)

So our ideal palette probably has fewer greens and more purples, but it looks like it's going to be a bigger job than just shuffling around a few colours. So how does one generate an evenly spaced palette in the wibbly wobbly space of human perception?

I'm a fan of evolving solutions. Ever since I saw the original Evolving Mona Lisa post, I have been trying variations of this technique to problems. I did my own variant of the Image evolving where I evaluated it as a compression technique, with better results than I imagined. More recently I tried a version using a shader and an odd sort of distance field. It did pretty well.

It takes a long time, but as long as you have a way to measure the quality of a result you can make random changes and keep the changes that make things better. The theoretical ideal general purpose palette that we are after is one where any arbitrary colour we choose is represented by a colour in the palette which is as similar(to a person) as possible. The CIEDE2000 colour difference function provides a good basis for judging how good such a palette is.

I won't bore you with all of the details of the many things I tried. I shuffled colours around a lot and tried a huge variety of fitness functions. Ultimately a mix of fitness functions at different stages of evolution helped. I suspect I could keep on tweaking the code for years to get it better, but I had to stop somewhere.

Once I had a set of colours to work with, They were essentially randomly ordered. I wanted to put them into some form of useful order. Observing the earlier palettes, I noted it is actually quite hard to find the colour you are looking for when it is surrounded by colours that are significantly different. I developed a plan of attack. Assuming a prospective pixel artist was looking at a 16x16 table of colours, minimize the distance between neighbouring colours. Once again that sounds suspiciously like a fitness function, so evolution solves the problem without having to think too hard.

The actual fitness function I ended up using focused only on the palette entries 32...255. The first 32 colours are the Arne16 palette and a greyscale. I left it running for a fairly long time and the end result came out rather nice.

I think a prospective artist would be able to get a feel for the structure of the palette and know where to look for the colour they were wanting to use.

Now, let's take a look at the palettes and see how it compares with the others.

First up, greyscale;

ColourCube 8, redSwizzled
Websafe palette
Gamma adjusted System Palette

It is no surprise that the palettes with dedicated grey ranges perform better here. On my display, the Evolved palette is more balanced with The Gamma Adjusted System biasing too strongly towards the lighter colours. I'm not sure how others will see it. This is about the only time the VGA palette is even remotely comparable.

Once we add a little bit of colour the story changes considerably.

ColourCube 8, redSwizzled
Websafe palette
Gamma adjusted System Palette

At this level the Evolved palette seems to be the winner, The redSwizzled colour cube and the Gamma adjusted system palette don't do too badly though, At this point we'll drop the VGA palette, and the websafe palette so we just have the three to look at.

ColourCube 8, redSwizzled
Gamma adjusted System Palette

At this level it becomes more subjective, I can't really call which is better. I can note differences The Evolved palette has fewer colours around the cyan to green and green to yellow parts of the wheel. On the other hand it has more oranges and browns in the yellow to red range.

The redSwizzled palette has a fairly detailed range from magenta to black and purple to black. The Evolved palette seems to better represent the darker colours compared to the other two.

ColourCube 8, redSwizzled
Gamma adjusted System Palette

At full saturation The story is similar. The first two are better around the magenta area. The evolved is weaker on green hues but fades off to black a bit more smoothly.

Ultimately the evolved palette does more or less what I asked of it. It resulted in a palette with colours more evenly distributed. If I were to manually tweak the palette I might add another hue to the yellow/green area, but I'm not sure where I would take the colour from. There are certainly many other little improvements that could be made, but for now this will have to do me.

I am open to receiving suggestions and updates to this palette. I won't need to freeze the palette of the Kwak-8 system until someone actually releases a program that relies on it.

Next up for my Kwak-8 development is an in-browser assembler, so that people can dive in and make little bare-(emulated)-metal Kwak-8 programs without having to install a development environment. The Assembler already works, I just need to tidy up the environment a bit and add some examples. Stay tuned.

Finally, here's a view of all of the palettes so you can compare them at your leisure.


Thursday, December 14, 2017

Designing Audio Registers for the Kwak-8

When I turned my mind towards adding Audio to the Kwak-8 fantasy console, I took a look at some of the hardware capabilities of old 8 bit machines. The 8-bit computer most Accaimed for its Audio capabilities was the Commodore 64, with its SID chip. I didn't want to make an exact copy of the SID for a number of reasons, I wanted the Kwak-8 to have its own unique sound, and ideally not too much difficulty in playing a variety of sound effects.

In the end I decided to have 8 bytes to describe the complete sound of a single voice output. I gave the Kwak-8 eight voices sharing the same output ports with a voice select register. So, having decided to have 8 bytes for each voice, the question then becomes how do I allocate those 64 bits to features.

The easy bits

  • Frequency - 16 bits
  • This is an important one so gets a lot of bits. I went for the same scaling as an NTSC SID at register_value*0.0596Hz

  • Volume - 8 bits
  • 256 levels should be plenty for fading in and out of sound effects

The WaveForm

Four commonly used simple waveforms turn up a lot in sound synthesis, Sine, Square, Triangle and SawTooth. I decided to provide the first three of these

Square Wave
Sine Wave
Triangle Wave
I also wanted to provide some intermediate levels bewteen these "primary" sounds, allocating 4 bits gave me a range of 16 values where I placed Square at 0000, Sine at 1000 and Triangle at 1111.

initially I did a straight linear interpolation between thw wave outputs.

This worked fine on the Sine to Triangle transition but had a couple of problems on the Square wave. The first issue was fairly obvious in hindsight. The square wave has more area under the curve. The wave displaces more air, It's louder because it literally has more volume. That's an easy fix, just scale it down to be about the same as a sine wave.

Here's how it looks with scaling.

That's better, but when I listened to the generated sound, it didn't seem to be a very even interpolation. If you think about an interpolated value you can imagine it being a comprimise. A square wave isn't very compromising, it's either up or down. You can see in the interpolation that straight edges turn up pretty much as soon as any amount of the square wave is added.

To fix this I needed something that transitions to a square wave by increasing the slope instead of adding sharp transitions of increaing size. A sigmoid function looked like a good candidate. using a function of y = 1 / ( 1 + Math.exp(x * slopeFactor)) gives a controllable slope.

Wrangling a couple of these curves shifted and scaled makes for something that approximates a sine curve at one end and a square wave at the other.

Finally, replacing the original square wave with this new function gives us

Tweaking The Wave

To make a wider variety of sounds (wikipedia tells me I'm talking about timbre here, I really have no idea what I'm doing). I provided a way to adjust the shape of the wave. Applying a kind of gamma curve to the time domain can make the waveform front heavy or back heavy.

Looking at the wave, I noticed that it takes quite a sharp turn when it starts a new phase. I'm not entirely sure if that's a bad thing, but I decided to try a few things to make the wave look more aesthetically pleasing and see how they sound. There isn't really a right or wrong here, as long as it sounds ok.

I tried applying a function that weighted the tips of the gamma curve towards linear. gamma curve linear. This produced a rather unexpected (to me anyway) result of making the curve bend back down in places, given this is in the time domain that means it ends up playing part of the waveform backwards. Nevertheless it does sound interesting and stands a chance of staying in the mix.

I wanted to try a sigmoid wave modifier, but the sigmoid I had been using wasn't really a good pick. I wanted something that would transition from 0 to 1 and be configurable to bend in each direction. After much googling I came across a gem of a blogpost Normalized tunable sigmoid functions which was exactly the sort of thing I was looking for. That other sigmoid function I used above? Rubbish! I have a new shiny one now.

If you take away just one thing from this post, making a function like this is a good choice.

      * Genrate a sigmoid function for a given weight
      * param (number) weight - in the range -1 to +1 controls the steepness of the bend, 
      *                         weight=0 is linear. 
      *                         less than zero pulls the center towards vertical
      *                         greater than zero pulls the center towards horizontal
      function makeSigmoidFunction (weight=0.5, low=0, high=1) {
        let range = high-low;
        if (weight==0) return a=>a*range+low;
        let w = 0.4999/weight-0.5; //just a tad below |1| at the ends
        let mid = (high+low)/2;
        let xlate = a=>Math.min(1,Math.max(-1,(a*2-1)));   
        return a => (xlate(a) * w / (w - Math.abs(xlate(a)) +1) ) / 2 * range + mid;
Applying the sigmoid to the time domain makes soemthing that looks interesting, Listening to the changes in the sound, it seems like not too bad as a pick, I'm not entirely sure whether or not a symetrical waveform tilted to the left sounds different to one tilted to the right. If it turns out the top bit is redundant I might appropriate it for something else at a later date.

After all that we've managed to allocate annother entire byte for defining the general shape of the wave
  • Wave Type - 4 bits
  • Square, sine and triangle with interpolations

  • Wave Shift - 4 Bits
  • Apply sigmoid to time domain

That makes 256 different basic wave sounds. One of the motivations of breaking the waveform into two parameters like this was to allow small changes in each parameter to be small changes in the waveform itself. That should enable people to home in to the sound that they are after, rather than get lost in a 'hunting for the right font' style scenario.

Pitch bend

This part was much easier to decide upon. The bend value is just a sine wave with phase and frequency control. The result of the bend output changes the frequency of the playing voice. A low frequency bend can cause a subtle change in pitch and usually will only cover a part of the sine wave. The phase lets you choose whether to start by increasing in pitch or lowering in pitch. At high frequencies it causes a siren and higher still a vibrato.
4 seconds

So that's another two bytes:
  • bend phase - 3 bits
  • indicating the position in the sine wave to start the bend

  • bend Frequency - 5 bits
  • setting the frequency to Math.pos((value+1)/33,2)*30; ranging from a minimum frequency of 0.02Hz to a maximum of 28Hz

  • bend amplitude - 8 bits
  • How much bend to apply. Zero is no bend at all. This gets a full 8 bits so that you can fade the bend effect in and out.

The Envelope (and a little noise)

I decided against the classic Attack/Decay/Sustain/Release envelope which is more designed for musical keyboards. The Sustain level assumes an active participant holding the note, If a CPU has to keep paying attention to the note to hold it, it can dynamically decide on any envelope by adjusting the voice volume directly.

For an easier fire-and-forget model I went for a simple Attack/Hold/Release envelope. Three durations are specified for fade-in, hold, fade-out. Instead of a Decay phase I elected to use the remaining bits to apply variable levels of noise to the voice.

Thus the last two bytes of the voice registers are:
  • Attack - 4 bits
  • Duration of fade up to full volume. (Value / 8)² Seconds. Max: 3.5 seconds

  • Release - 4 bits
  • Duration of fade out after hold duration ends. (Value / 8)² Seconds. Max: 3.5 seconds

  • Hold - 4 bits
  • How long to hold the sound. (Value / 8)² Seconds. Max: 3.5 seconds

  • Noise - 4 bits
  • random noise mixed into the sample 0=no noise. 15=all noise

I placed the Attack and Release values into the last of the 8 byte values and made it a trigger on write port so that a new Attack/Hold/Release cycle occurs when writing to the port. For playing a sound effect a coder can write all 8 bytes in order and the sound should just play.

Putting it together

Finally came the task of placing it into the Kwak-8 emulator. This was a surprisingly easy task. During the design phase I created a very simple tester page. Have a play around with it if you like This let me see and hear the result as I tweaked things. By the time I needed to add it to the emulator I had a working template. The most awkward part was having to convert arrow functions to long form. The Emulator is written in haxe, arrow functions are coming to Haxe ( 4.0.0 preview builds have them ), they just aren't in the release I'm using right now.

And finally, finally. I had to make a program for the Kwak-8 that actually played sound.

The Emulator is on It has an Audio Test program where you can adjust the voice registers in-emulator, and play on a little on-screen keyboard.

Go and have a play with it, It's kinda fun.

Monday, December 11, 2017

Emulator has a Name and Programming languages - The Kwak-8

My fantasy console emulator has a name, a github project, and a webpage.

Respectively those are

I would have liked to bumped the clock speed to 16Mhz to be more familiar to Arduino coders, but it would be hard to push the JavaScript emulator to those speeds. At 8Mhz it runs nicely on my Arm (rk3288) Chromebook. The awesome blitter should more than compensate. I haven't added a clock cycle cost to blits so to the VM they are instant. Obviously the emulator takes time but as long as people don't get silly it shouldn't slow down the emulator. You just know someone will get silly though.

The emulator has successfully run programs built ASM, C and FreePascal. So it's getting into the realms of usable.

I have added a NES style tiled graphics mode which allows x and y flips, so it very much behaves like the blitter only with a big indexed batch. Still weighing up whether or not to have the bitter do Transpose X/Y. Effectively a 90 degree rotate.

Other Kwak-8 bits in the works.

  • An assembler, written in JavaScript, so people will be able to do some low level coding from within the browser.
  • Audio Registers. Design is complete. 8 voices, which is a little extravagant, but sounding suitably 8-bit bleepy. 8 bytes of parameters encodes frequency,envelope,wave, noise, bend etc.
  • The Extended Palette. In serial Pixel mode you can display 256 colours (more if you are clever ;-). I have been working on making a nice balanced palette
More soon.

Sunday, May 28, 2017

Making a (really simple) game to Learn rust.

So there was a a post on hacker news On making a game in Rust.    In the comments I mentioned it would be nice to have a simple game example to enable people to learn Rust by extending the example.

Rather than sit and hope that such an example turns up I thought I should have a go myself.   The problem with that is, of course, that I don't have a very good grasp on Rust myself.   I wanted the example to give myself a base to learn from.   Nevertheless I'm diving in and we'll see how it ends up.
Bored already? Just download the example and dive in.

In the responses to my comment someone pointed me to  rust_minifb
which does look like a good place to start.

I got my ChromeBook with Crouton and ran rustup to get me up to date and made a tiny program using rust_minfb.   It didn't work.  A somewhat inauspicious start. 

My code, as little as there was of it, appeared to be fine.   The error was in building rust_minifb.  

"can't find crate x11_dl"  cargo wailed.

But wait! Here it is

The detail in which the devil was concealing itself was the altitude of a single character.

rust_minifb wanted x11_dl
cargo had  x11-dl (note the dash instead of underscore

Had I found a bug in rust_minifb?  I asked on #rust-beginners.   A helpful resident tried

git clone
cd rust_minifb
cargo run --example noise

and reported that the noise example worked fine for them. 

I tried the same procedure on my ChromeBook and was rewarded with the familiar "can't find crate x11_dl"

At this stage I did what any reasonable person in this position would do.  I moved to my desktop machine.  cargo run --example noise worked, as did my tiny test program.  I could put pixels onscreen. Hooray
rust_minifb also gives us some input. The Window struct provides methods such as get_mouse_pos and is_key_down. We can read input and we can put something on screen, that's two of the main requirements for a game. We still need a sense of time. Turns out Rust makes this bit quite easy.

    let (ticker_tx,ticker_rx) = channel();

    thread::spawn(move|| {
        loop {
That sets up a thread that that repeatedly sleeps for a bit and sends a tick. There are more accurate ways to do this if we want to be in sync with the video frame. but for our purposes, this is good enough. It stops things from running too fast and using all of the CPU. Most importantly, it's nice and simple.
So now we have most of the operating system necessities done. We can get our bunch of u32RGB pixels onscreen, We only need to make those pixels contain what we want to see. That's something we can do purely from within Rust without having to figure out how to talk to the operating system or other APIs. Hardware accelerated rendering would need that, but we're going to just use good-old-fashioned CPU power. The idea is to learn Rust first. Once you have that under your belt you can go and fight in the API wars safe in the knowledge that your Rust is perfect and any further problems surely reside in the OpenGL driver.

What we need are some drawing commands. This is something Rusts traits are good for. We can have a trait that contains drawing functions and then use them on anything that implements the trait. For starters lets have

trait PixelCanvas {
  fn clear(&mut self );  
  fn plot_pixel(&mut self, x:i32, y:i32, color:u32);
  fn draw_line(&mut self, x1:i32, y1:i32, x2: i32, y2:i32, color:u32);
  fn draw_horizontal_line(&mut self, x1:i32, x2:i32, y:i32,color: u32);
  fn draw_rectangle(&mut self, left:i32, top:i32, width:i32, height:i32, color: u32);
  fn fill_rectangle(&mut self, left:i32, top:i32, width:i32, height:i32 ,color: u32);
  fn draw_circle(&mut self, x:i32, y:i32, radius:u32,color: u32);
  fn fill_circle(&mut self, x:i32, y:i32, radius:u32,color: u32);
That lets us clear the screen and put a few things on it. These are very much the sort of thing that you'd see in any programming language. Some languages have the self parameter implied. The only distinctly Rust-ish thing about the trait is the &mut self. &mut self means "self refers to a thing that this function might change". That's rather understandable. my_picture.draw_line(10,10,20,20,mycolor) should be expected to change my_picture, that's the entire point of a line drawing function.

Traits can also have default implementations. If you include a function body in the trait definition, it will use that if the trait implementer does not provide an implementation of its own. For the PixelCanvas trait I wrote the most simple naive implementations of drawing functions that use features of the trait itself. So draw_line splits the recursively splits the line in 2 until it is left with a pixel length line then uses plot_pixel, fill_rectangle and fill_circle draw themselves using horizontal_line which in turn uses plot_pixel. Ultimately that means any implementer wanting to use the drawing functions need only implement plot_pixel.

That may not sound like an efficient solution, and it really isn't, but It will work. If you drew a 1000x1000 filled rectangle it would generate a thousand draw_horizontal_line calls which would each in turn call a thousand plot_pixel calls, This would result in a million pixels begin individually clipped and drawn. It's really easy to improve this situation. Any PixelCanvas implementer that provides it's own implementation of draw_horizontal_line would be able to eliminate most of the inefficiency. If you need it to run really fast you can implement every function in the trait with versions targeted at your specific needs. The beauty of this approach is it supplies functionality first while permitting performance later.

As an aside, You may think of pixels as a bunch of little squares. You may vociferously disagree with this notion. Here's a PDF written by someone who doesn't like little squares very much. Really, it all depends on what you are doing.

Pixels can be thought of as either an approximation of an ideal image or it can be the image itself. If you want pixel art, bitmapped fonts, lines that are only one pixel thick and you want to individually address pixels. Then it makes sense to think of the pixels as individually numbered little squares. pixel(0,0) is in the top left corner. For low resolution displays, every pixel is important, you want to be able to directly control them. If you want to turn individual pixels on and off, this is the mode for you.

The other way to look at it is pixels as point samples of a continuous image. This is the world of paths, strokes, fills and filters. The position of the top left pixel becomes a matter of (sometimes fierce) debate (0,0)? (0.5,0.5)? (-0.5,-0.5)?!?. The PDF linked above gives you a run down of the options. In this point of view a line is a mathematical line segment, without a specified width it would be invisible. Giving it a width makes it a rectangle (or perhaps something more exotic if you are using fancy line caps) The pixels themselves get set to an approximation of the shape. Anti-aliasing is usual here, but nice crisp single pixel lines are not, Accuracy in position takes priority so when a line falls between pixels, those pixels share some of the load each and you get a little two pixel smudge. If your pixels are small enough, and you are doing the right blending, and your display is calibrated correctly you might not even notice the difference.

The tiny set of drawing function I provide here are using the little squares approach, for the simple reason that it is much easier to write.

For our first interactive graphical program, We're going to make a program that I always start with when I teach kids programming. The JavaScript version that I use is;

print("Draw with the arrow keys");

var cx=320;
var cy=240;

function move() {
  // the arrow keys have key codes 37,38,39 and 40
  if (keyIsDown(38)) {    cy-=1;  }
  if (keyIsDown(40)) {    cy+=1;  }
  if (keyIsDown(37)) {    cx-=1;  }
  if (keyIsDown(39)) {    cx+=1;  }

I provide a few global functions in the environment that provide the I/O and timing. We have build up a Rust program that provides a similar level of a base, so we should be able to replicate this program. So starting with the rust_minifb example on github. We make a few changes. The TinyDraw project have be downloaded here. Or just view the source files
Here's what the guts of the program looks like.

    let (ticker_tx,ticker_rx) = channel();

    thread::spawn(move|| {
        loop {

    while window.is_open() && !window.is_key_down(Key::Escape) {

        if window.is_key_down(Key::Down) { cy+=1; }
        if window.is_key_down(Key::Up) { cy-=1; }
        if window.is_key_down(Key::Left) { cx-=1; }
        if window.is_key_down(Key::Right) { cx+=1; }

        if window.is_key_down(Key::Space) { frame.clear(); }


        frame.render_to_window(&mut window);
The Loop waits on the timer channel, checks the keys, draws a circle, then puts it on-screen. The frame object is a simple wrapper to the buffer from the rust_minifb example. frame is a FrameBuffer which knows how to put itself onto a window. More importantly, in we implement the PixelCanvas trait for FrameBuffer.

When I use this program as a teaching example. The first step I take after the kids have stopped drawing pictures (and the inevitable WASD conversion that kids these days apparently absolutely require) is to add a clear screen function. That is included in the program above at no extra charge. Clear screen actually has an ulterior motive. Any kid that is playing with drawing where they can clear the screen by pressing space will soon figure out that they can hold space down while drawing. Their eyes light up immediately, a circle is moving around the screen controlled by their key-presses. The drawing program has suddenly metamorphosed into the seed of a game.

And that's where I should probably stop for now. We clearly have not made a game. but there is enough there to make something that could legitimately be called a game. While the drawing functions are not spectacular, they give you a starting point. People wanting to learn Rust from a game perspective can use the seed as a starting point with all of the growth metaphors that may apply to such an endeavour.

Of course I'm going to keep working on this. My next stage is managing a collection of game entities. I don't know how to do that yet, So I'll have to learn a bit more Rust and get back to you later.

Tuesday, May 24, 2016

Evolved images using a Shader. Peliminary results.

In the post here I laid out the plan.  Since then I have made a simple WebGL implementation of an evolver and left it running on a few machines for a long time.

(if you don't want to read about it just go here to try it)

What I did

To render the entire image in one go I used a shader that renders 64 triangle pairs in one pass.    You pass it a 8x64 RGBA texture containing the triangle data and a pixel position and it gives you the image colour at that point.  Much of the texture is padded with zeros.  Fields that are smaller than a byte get placed into the most significant bits of a byte field each.

The in-texture structure I ended up with was

  a,b,c, colour, options, pad,pad,pad

Each texel is 4 bytes. so the shader can extract 2 points for each of the a, b, and c values to use as the two triangles.

The Shader

precision mediump float;

varying vec2 here;
uniform sampler2D data;

float dot2(in vec2 a) {
    return dot(a,a);

float smin( float a, float b, float k )
    float h = clamp( 0.5+0.5*(b-a)/k, 0.0, 1.0 );
    return mix( b, a, h ) - k*h*(1.0-h);

float tri_d(in vec2 p, in vec2 a, in vec2 b, in vec2 c) {
    vec2 ba = b-a;
    vec2 cb = c-b;
    vec2 ac = a-c;
    vec2 pa = p-a;
    vec2 pb = p-b;
    vec2 pc = p-c;
    float edge_ab = dot (vec2(-ba.y,ba.x),pa);
    float edge_bc = dot (vec2(-cb.y,cb.x),pb);
    float edge_ca = dot (vec2(-ac.y,ac.x),pc);
    float v= sign(edge_ab) + sign(edge_bc) + sign(edge_ca);
    return (v <2.0)         
        ? sqrt(min (min(
        dot2(cb*clamp(dot(cb,pb)/dot2(cb),0.0,1.0)-pb) ),
          dot2(ac*clamp(dot(ac,pc)/dot2(ac),0.0,1.0)-pc) ) )         
     : 0.0;


float shape_alpha(in vec2 p,
                  in vec2 a, in vec2 b, in vec2 c,
                  in vec2 d, in vec2 e, in vec2 f,
                  in int combineMethod, in float blur) {
    float tri_1 = tri_d(p, a,b,c);
    float tri_2 = tri_d(p, d,e,f);
    float result = 0.0;
    if ( combineMethod == 0 ) {
        result = min(tri_1,tri_2);
    if ( combineMethod == 1 ) {
        result = smin(tri_1,tri_2,0.05);

    if ( combineMethod == 2 ) {
        result = max(blur-tri_1,tri_2);
    if ( combineMethod == 3 ) {
        result = max(tri_1,tri_2);
    return max(0.0,1.0-smoothstep(0.00,(blur*blur)/2.0,result));    

//reads the data texture at point p. 
//The texture is assumed to be an 8px by 64px rgba image
vec4 peek(in vec2 p) {
  return texture2D(data,(p+0.5)/vec2(8.0,64.0));

void main() {
    float dataWidth = 4.0;
    vec4 fragColor = vec4(0.5,0.5,0.5,1.0);
    for (float poly = 0.0; poly < 64.0; poly+=1.0) {
      vec2 a_pos = vec2(0.0,poly);
      vec2 b_pos = vec2(1.0,poly);
      vec2 c_pos = vec2(2.0,poly);
      vec2 color_pos = vec2(3.0,poly);
      vec2 opts_pos = vec2(4.0,poly);  

      vec2 a1 = peek(a_pos).xy;
      vec2 a2 = peek(a_pos).zw;
      vec2 b1 = peek(b_pos).xy;
      vec2 b2 = peek(b_pos).zw;
      vec2 c1 = peek(c_pos).xy;
      vec2 c2 = peek(c_pos).zw;
      vec4 color = peek(color_pos);
      vec4 opts = peek(opts_pos);
      float blur = floor(opts.x*32.0)/32.0;   
      int combineOp = int(opts.y*256.0);
      if (combineOp >3) combineOp =1;

      color.a *= shape_alpha(here, a1,b1,c1, a2,b2,c2, combineOp, blur);

      fragColor.rgb = mix(fragColor.rgb,color.rgb, color.a);
    gl_FragColor = fragColor;  


Comparing the image and evolving were implemented in JavaScript with most pixel operations using typed arrays.    In practice,  that seems to work quite well.  It's fast.  Like any ray-marching demo on ShaderToy it is difficult to comprehend the sheer scale of work that goes into an entire image when every pixel is the result of a significantly complex calculation.

So the big question.  How did it do?

8192 bits of data
(0.125 bits per pixel)

That's where I ended up with the classic mona.png.  That's a Mean Square error of 236 (PSNR 24.4)   As a first attempt that seems pretty good.   This took about a month to evolve several hundred million generations.


Evolution copes well with bugs.  As long as the fitness function is accurate, evolution finds a way through.   The impact of bugs comes in the speed of evolution.  Fixing a few mutation bugs increased the speed of improvement a great deal.

The smoothed min shape operation allows for shapes with concave curves, but some curves in images were clearly being approximated by multiple  straight edges.   The capacity for convex curves might free up  shapes for use elsewhere

It keeps getting better.  In general progress gets slower and slower, but it never completely stops.  It's hard to determine where the limit is.  A really good mutation may provide a sudden burst of improvement.  Every time I have thought to myself that the MSE would not drop below a certain point I turned out to be wrong.   At various times I was convinced that the mona image would not drop below a MSE value of 300, or 250.

Mutation of the falloff value doesn't have as useful results as changing a Gaussian blur distance.   I believe this is because it has two immediate effects.  It changes both the overall size of the shape and the fuzziness of the edges.  When Gaussian blur was used to generate an edge gradient, changing the blur value changed the gradient without changing the overall position. It should be possible to  calculate a mutation for distance field rendering  that has similar effects.  If the gradient width is increasing by x, shrink the shape by x/2.

Triangle outlines seemed to cause unpleasant artefacts when two of the sides were needed and the third side was redundant.   Over time it seemed to be able to shuffle things around so that the extra lines were not as apparent but there seems to be scope for triangle outlines to be rendered with one edge removed.

Constructing the starting data set a gene at a time seems to help.  I start with a single gene (two triangles) and evolve until it goes 10,000 generations without a successful mutation before adding the next.   Sometimes the new gene does not find a fit quickly enough that they don't add much improvement to the overall image.  It's probably worth doing a run of a few thousand generations where only the new gene is mutated to let it settle into a place where it does more good.

To avoid stagnation in local minima, once the data contained the maximum number of genes, it would shake things up  with a large mutation if it  stagnated(10,000 generations without a successful mutation).   As the MSE got better, this would often result in a stagnation point that was worse than the previous best.  That meant it could get into a situation where it was taking one step forward and two steps back.   To resolve this I stored the best ever data and upon a stagnation it would try a large mutation of the best ever data.  If the new data stagnates at a worse level it reverts back to the best ever and tries from there.

What do the parts of an evolved image look like?

By rendering the entire image minus a single gene and measuring the difference in error, you can get an idea of how much each gene contributes to the whole.

For the Mona Image above the contribution graph looks like this

Most of the first few polygons have a very high significance.  They make the early gains of getting the broad areas of colour.

If we look at the fields generated for each gene

The easiest thing to spot is that gene significance is fairly correlated with area coverage.  That makes sense,  changing more pixels makes for more change overall.    It's also quite interesting to see that it is very hard to tell what image is generated from the combined genes.  In previous runs I placed a cap on the Alpha value of each gene, forcing areas to require multiple genes.  This made for more interesting interactions between genes, but slows things down somewhat.

Things to try Later

More gene expressiveness

While sticking with the model of turning three points into a shape.  A few more options are available.    

All of these should be possible to implement in a 2d distance field shader.  Of course then there is the issue of where to get the bits to store the form of the shape.  Currently filled/outline is chosen by the order of the points.  If the points go clockwise ->use fill, Or anticlockwise -> use outline.    Harvesting a few bits from somewhere else might be in order.

Theoretically a gene could consist of any 16 byte bundle of data that can be turned into an RGBA field by a shader.  If you were to reserve a byte for the technique then you could potentially have 256 different methods to process 15 bytes into a field.

Data Logging.

There are a number of things that could be tracked that might yield insights as to what mutations are useful.   There are various possible mutations,  colour, small point movement, large point movement, blur, and plain byte stomping.   Easiest to implement would be a frequency of success for each type, but for larger gains, analysis to find which mutations lead to future favourable mutations would be interesting.

Code Overhaul

I started with a Uint8Array for the texture that the shader used.  The rest of the code grew as functions that modified that Uint8Array.   It is of course, now a hideous mess.    Making an object/method interface would help but it would still have to be ultimately backed by a Uint8Array to avoid any unintentional state existing in the objects.  Evolution will exploit such errors.

Progress recording

Since the data for each image is only 1k,  It would be easy enough to keep a record of the evolution to play back.  I doubt this would yield any useful information, but it would look cool.

I want to try it

Of course you do.  This stuff is fun.  Watching things evolve is more hypnotising than watching the blocks reorganise during a disk defrag.

Go here Leave it in an open browser window for a while and see what you get.  If you are impatient you can manually add genes to build the population up to 64.  Or just preload it with a set of polygons and see if they morph to your target image.

This version has substantially better mutations than those used to generate the 1 month Image.

Here's a 2 day old image from the current version.

Mean Square Error = 201