(if you don't want to read about it just go here to try it)
What I did
To render the entire image in one go I used a shader that renders 64 triangle pairs in one pass. You pass it an 8x64 RGBA texture containing the triangle data and a pixel position, and it gives you the image colour at that point. Much of the texture is padded with zeros, and fields smaller than a byte are placed into the most significant bits of a byte each. The in-texture structure I ended up with was
a, b, c, colour, options, pad, pad, pad
Each texel is 4 bytes, so the shader can extract two points from each of the a, b, and c values to use for the two triangles.
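For concreteness, here's a minimal sketch of how a gene might be packed into that layout. The gene object and its field names are hypothetical, not from my actual code; the decode steps in the shader below are what pin down the bit placement.

const GENES = 64;
const TEXELS_PER_GENE = 8;                       // one 8-texel row per gene
const data = new Uint8Array(GENES * TEXELS_PER_GENE * 4);

function writeGene(row, g) {                     // g is a hypothetical gene object
  const o = row * TEXELS_PER_GENE * 4;
  // texels 0..2: the a, b and c points of both triangles (bytes map to 0..1)
  data.set([g.a1x, g.a1y, g.a2x, g.a2y], o);
  data.set([g.b1x, g.b1y, g.b2x, g.b2y], o + 4);
  data.set([g.c1x, g.c1y, g.c2x, g.c2y], o + 8);
  // texel 3: RGBA colour
  data.set([g.r, g.g, g.b, g.a], o + 12);
  // texel 4: options. The 5-bit blur sits in the top bits of its byte,
  // matching the floor(opts.x * 32.0) / 32.0 decode in the shader;
  // the combine op (0..3) is read back as int(opts.y * 256.0)
  data[o + 16] = g.blur5 << 3;
  data[o + 17] = g.combineOp;
  // texels 5..7 stay zero (padding)
}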
The Shader
precision mediump float;

varying vec2 here;       // the position this fragment is shading
uniform sampler2D data;  // the 8x64 RGBA gene texture

float dot2(in vec2 a) {
    return dot(a, a);
}

// polynomial smooth minimum: blends two distances into one rounded join
float smin(float a, float b, float k) {
    float h = clamp(0.5 + 0.5 * (b - a) / k, 0.0, 1.0);
    return mix(b, a, h) - k * h * (1.0 - h);
}

// distance field of a triangle: 0 on the "inside", distance to the
// nearest edge elsewhere. The winding of a, b, c decides which side
// counts as inside, which is what selects fill vs outline.
float tri_d(in vec2 p, in vec2 a, in vec2 b, in vec2 c) {
    vec2 ba = b - a;
    vec2 cb = c - b;
    vec2 ac = a - c;
    vec2 pa = p - a;
    vec2 pb = p - b;
    vec2 pc = p - c;
    // which side of each edge the point falls on
    float edge_ab = dot(vec2(-ba.y, ba.x), pa);
    float edge_bc = dot(vec2(-cb.y, cb.x), pb);
    float edge_ca = dot(vec2(-ac.y, ac.x), pc);
    float v = sign(edge_ab) + sign(edge_bc) + sign(edge_ca);
    return (v < 2.0)
        ? sqrt(min(min(
            dot2(ba * clamp(dot(ba, pa) / dot2(ba), 0.0, 1.0) - pa),
            dot2(cb * clamp(dot(cb, pb) / dot2(cb), 0.0, 1.0) - pb)),
            dot2(ac * clamp(dot(ac, pc) / dot2(ac), 0.0, 1.0) - pc)))
        : 0.0;
}

// combines the two triangle fields and converts distance to coverage
float shape_alpha(in vec2 p,
                  in vec2 a, in vec2 b, in vec2 c,
                  in vec2 d, in vec2 e, in vec2 f,
                  in int combineMethod, in float blur) {
    float tri_1 = tri_d(p, a, b, c);
    float tri_2 = tri_d(p, d, e, f);
    float result = 0.0;
    if (combineMethod == 0) {
        result = min(tri_1, tri_2);         // union
    }
    if (combineMethod == 1) {
        result = smin(tri_1, tri_2, 0.05);  // smoothed union
    }
    if (combineMethod == 2) {
        result = max(blur - tri_1, tri_2);  // roughly: carve tri_1 out of tri_2
    }
    if (combineMethod == 3) {
        result = max(tri_1, tri_2);         // intersection
    }
    return max(0.0, 1.0 - smoothstep(0.0, (blur * blur) / 2.0, result));
}

// reads the data texture at texel p.
// The texture is assumed to be an 8px by 64px rgba image.
vec4 peek(in vec2 p) {
    return texture2D(data, (p + 0.5) / vec2(8.0, 64.0));
}

void main() {
    vec4 fragColor = vec4(0.5, 0.5, 0.5, 1.0);  // mid-grey background
    for (float poly = 0.0; poly < 64.0; poly += 1.0) {
        // texels 0-2 hold the points, texel 3 the colour, texel 4 the options
        vec2 a_pos = vec2(0.0, poly);
        vec2 b_pos = vec2(1.0, poly);
        vec2 c_pos = vec2(2.0, poly);
        vec2 color_pos = vec2(3.0, poly);
        vec2 opts_pos = vec2(4.0, poly);
        // xy is the point for triangle 1, zw the point for triangle 2
        vec2 a1 = peek(a_pos).xy;
        vec2 a2 = peek(a_pos).zw;
        vec2 b1 = peek(b_pos).xy;
        vec2 b2 = peek(b_pos).zw;
        vec2 c1 = peek(c_pos).xy;
        vec2 c2 = peek(c_pos).zw;
        vec4 color = peek(color_pos);
        vec4 opts = peek(opts_pos);
        float blur = floor(opts.x * 32.0) / 32.0;  // 5-bit blur value
        int combineOp = int(opts.y * 256.0);
        if (combineOp > 3) combineOp = 1;          // clamp bad data
        color.a *= shape_alpha(here, a1, b1, c1, a2, b2, c2, combineOp, blur);
        // alpha-blend this gene over everything rendered so far
        fragColor.rgb = mix(fragColor.rgb, color.rgb, color.a);
    }
    gl_FragColor = fragColor;
}
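For completeness, this is roughly how the packed bytes reach the data uniform. These are standard WebGL 1 calls, with gl, program and data assumed to be set up already:

const tex = gl.createTexture();
gl.activeTexture(gl.TEXTURE0);
gl.bindTexture(gl.TEXTURE_2D, tex);
// no filtering or wrapping: peek() expects exact texel values
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
// the 8x64 RGBA gene texture, one row per gene
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, 8, 64, 0, gl.RGBA, gl.UNSIGNED_BYTE, data);
gl.uniform1i(gl.getUniformLocation(program, 'data'), 0);  // texture unit 0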
Comparing the image and evolving were implemented in JavaScript, with most pixel operations using typed arrays. In practice that works quite well, and it's fast. As with any ray-marching demo on Shadertoy, it's difficult to comprehend the sheer scale of work that goes into an entire image when every pixel is the result of a significantly complex calculation.
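The comparison itself boils down to summing squared channel differences. A minimal sketch, assuming both images arrive as same-sized RGBA typed arrays (e.g. from getImageData):

function meanSquaredError(a, b) {
  let sum = 0, n = 0;
  for (let i = 0; i < a.length; i += 4) {  // step one RGBA pixel at a time
    for (let c = 0; c < 3; c++) {          // compare RGB, ignore alpha
      const d = a[i + c] - b[i + c];
      sum += d * d;
      n++;
    }
  }
  return sum / n;
}

// PSNR follows directly from MSE: 10 * log10(255^2 / 236) is about 24.4 dB,
// which matches the figures below
const psnr = (mse) => 10 * Math.log10(255 * 255 / mse);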
So, the big question: how did it do?
[Image: the evolved result. 8192 bits of data (0.125 bits per pixel)]
That's where I ended up with the classic mona.png: a mean squared error of 236 (PSNR 24.4). As a first attempt that seems pretty good. It took about a month and several hundred million generations to evolve.
Observations
Evolution copes well with bugs. As long as the fitness function is accurate, evolution finds a way through; the impact of bugs shows up in the speed of evolution. Fixing a few mutation bugs increased the rate of improvement a great deal.
The smoothed-min shape operation allows for shapes with concave curves, but some curves in images were clearly being approximated by multiple straight edges. The capacity for convex curves might free up shapes for use elsewhere.
It keeps getting better. In general progress gets slower and slower, but it never completely stops, and it's hard to determine where the limit is. A really good mutation may provide a sudden burst of improvement. Every time I thought the MSE would not drop below a certain point, I turned out to be wrong: at various times I was convinced that the mona image would not drop below an MSE of 300, or 250.
Mutating the falloff value doesn't have as useful results as changing a Gaussian blur distance. I believe this is because it has two immediate effects: it changes both the overall size of the shape and the fuzziness of the edges. When Gaussian blur was used to generate an edge gradient, changing the blur value changed the gradient without changing the overall position. It should be possible to calculate a mutation for distance-field rendering that has a similar effect: if the gradient width increases by x, shrink the shape by x/2.
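A hypothetical sketch of that compensated mutation. None of this is from my actual code, and pulling each vertex towards the centroid only approximates a true edge inset:

function mutateBlurCompensated(tri, delta) {
  tri.blur += delta;                  // widen the edge gradient by delta
  const inset = delta / 2;            // and shrink the shape by half of that
  const cx = (tri.ax + tri.bx + tri.cx) / 3;
  const cy = (tri.ay + tri.by + tri.cy) / 3;
  for (const p of ['a', 'b', 'c']) {  // move each vertex towards the centroid
    const dx = cx - tri[p + 'x'], dy = cy - tri[p + 'y'];
    const len = Math.hypot(dx, dy) || 1;
    tri[p + 'x'] += (dx / len) * inset;
    tri[p + 'y'] += (dy / len) * inset;
  }
}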
Triangle outlines seemed to cause unpleasant artefacts when two of the sides were needed and the third was redundant. Over time evolution seemed able to shuffle things around so that the extra lines were less apparent, but there seems to be scope for rendering triangle outlines with one edge removed.
Constructing the starting data set one gene at a time seems to help. I start with a single gene (two triangles) and evolve until it goes 10,000 generations without a successful mutation before adding the next. Sometimes a new gene doesn't find a good fit quickly and doesn't add much improvement to the overall image. It's probably worth doing a run of a few thousand generations in which only the new gene is mutated, to let it settle into a place where it does more good.
To avoid stagnation in local minima, once the data contained the maximum number of genes, I would shake things up with a large mutation whenever it stagnated (10,000 generations without a successful mutation). As the MSE improved, this would often produce a stagnation point that was worse than the previous best, so it could get into a situation of taking one step forward and two steps back. To resolve this I stored the best-ever data, and on stagnation it tries a large mutation of the best-ever data instead; if the new data stagnates at a worse level, it reverts to the best-ever and tries again from there.
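A sketch of that outer loop, covering both the gene-at-a-time growth and the best-ever recovery. Here initialData, mutate, bigMutate, addGene, geneCount and fitness are assumed helpers, not my actual functions:

const STAGNATION_LIMIT = 10000;  // generations without a successful mutation
let current = initialData();
let currentError = fitness(current);
let best = current.slice(), bestError = currentError;
let stale = 0;

function step() {
  if (stale >= STAGNATION_LIMIT) {
    // stagnated: grow from, or shake up, the best-ever data --
    // accepting the result even though it is usually worse at first
    current = geneCount(best) < 64 ? addGene(best) : bigMutate(best);
    currentError = fitness(current);
    stale = 0;
    return;
  }
  const candidate = mutate(current);
  const err = fitness(candidate);
  if (err < currentError) {
    current = candidate;
    currentError = err;
    stale = 0;
    if (err < bestError) {       // the best-ever data is never lost
      best = candidate.slice();
      bestError = err;
    }
  } else {
    stale++;
  }
}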
What do the parts of an evolved image look like?
By rendering the entire image minus a single gene and measuring the difference in error, you can get an idea of how much each gene contributes to the whole. For the Mona image above, the contribution graph looks like this:
Most of the first few polygons have a very high significance: they make the early gains of getting the broad areas of colour in place.
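Measuring a gene's contribution is cheap because disabling one is just zeroing its 32 bytes (its alpha goes to zero, so it blends in nothing). A sketch, with render and fitness as assumed helpers:

function contributions(data, target) {
  const baseError = fitness(render(data), target);
  const scores = [];
  for (let i = 0; i < 64; i++) {
    const copy = data.slice();
    copy.fill(0, i * 32, (i + 1) * 32);  // blank out gene i's row
    scores.push(fitness(render(copy), target) - baseError);
  }
  return scores;                         // higher = more significant gene
}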
If we look at the fields generated for each gene:
The easiest thing to spot is that gene significance correlates fairly well with area coverage. That makes sense: changing more pixels makes for more change overall. It's also quite interesting that it is very hard to tell what image the combined genes will generate. In previous runs I placed a cap on the alpha value of each gene, forcing areas to require multiple genes. That made for more interesting interactions between genes, but slowed things down somewhat.
Things to try later
More gene expressiveness
While sticking with the model of turning three points into a shape, a few more options are available. All of these should be possible to implement in a 2D distance-field shader. Of course, then there is the issue of where to get the bits to store the form of the shape. Currently filled/outline is chosen by the order of the points: if the points go clockwise, use fill; anticlockwise, use outline. Harvesting a few bits from somewhere else might be in order.
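The winding itself is just the sign of the triangle's doubled signed area, so reading or flipping the fill/outline bit is cheap. A sketch (which sign means clockwise depends on whether y runs up or down the screen):

function signedArea2(ax, ay, bx, by, cx, cy) {
  // z-component of the cross product (b - a) x (c - a)
  return (bx - ax) * (cy - ay) - (cx - ax) * (by - ay);
}

function flipWinding(tri) {
  // swapping two points toggles fill/outline without moving the shape
  [tri.b, tri.c] = [tri.c, tri.b];
}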
Theoretically a gene could consist of any 16-byte bundle of data that can be turned into an RGBA field by a shader. If you were to reserve a byte for the technique, you could potentially have 256 different methods of processing the remaining 15 bytes into a field.
Data logging
There are a number of things that could be tracked that might yield insights as to which mutations are useful. There are various possible mutations: colour, small point movement, large point movement, blur, and plain byte stomping. Easiest to implement would be a frequency of success for each type, but for larger gains, analysis to find which mutations lead to future favourable mutations would be interesting.
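A minimal sketch of that frequency count, assuming the mutation routine can report which kind it applied:

const stats = {};
function recordMutation(kind, improved) {
  const s = stats[kind] || (stats[kind] = { tried: 0, won: 0 });
  s.tried++;
  if (improved) s.won++;
}
// e.g. recordMutation('smallPointMove', newError < oldError);
// success rate per type: stats[kind].won / stats[kind].tried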
Code Overhaul
I started with a Uint8Array for the texture that the shader used, and the rest of the code grew as functions that modified that Uint8Array. It is, of course, now a hideous mess. Making an object/method interface would help, but it would still have to be ultimately backed by a Uint8Array to avoid any unintentional state existing in the objects. Evolution will exploit such errors.
Progress recording
Since the data for each image is only 1k, it would be easy enough to keep a record of the evolution to play back. I doubt this would yield any useful information, but it would look cool.
I want to try it
Of course you do. This stuff is fun. Watching things evolve is more hypnotising than watching the blocks reorganise during a disk defrag. Go here, leave it in an open browser window for a while, and see what you get. If you are impatient you can manually add genes to build the population up to 64, or just preload it with a set of polygons and see if they morph to your target image.
This version has substantially better mutations than the ones used to generate the one-month image.
Here's a two-day-old image from the current version.
[Image: mean squared error = 201]