The data-driven (2D) animation and dynamic equipment system in Idle Raiders



This post is going to detail the character animation system I implemented in “Idle Raiders” to support dynamic equipment (changeable weapons, helmets, etc.). The system was designed to be data-driven, i.e. new equipment types and animations can be added to the game without modifying any game code.

Similar functionality is common in many games, but finding detailed explanations of the inner workings of these systems was surprisingly difficult. I want to make it clear that none of this is particularly complicated; in fact, it ended up being very simple in terms of algorithmic complexity. But when working on these things, even the trivial can elude you, because you usually need a broad understanding of the problem and all its nooks and crannies before you can solve a concrete instance of it (no matter how simple the solution turns out to be). For this reason I decided to write it up, so that others will have an easier time in the future.

First, I will give an overview of the workflow for adding new animations or equipment parts to the game, which will serve as a high-level introduction. After that, I will go into more detail on how it's implemented on the code side. I will try to stay language agnostic and describe the data and high-level processes involved in running the system rather than implementation specifics. For completeness' sake: our game runs in Javascript (in the browser) and renders with WebGL or Canvas, depending on what the user's browser supports.

About the game

Idle Raiders is a 2D "RPG character management" idle game in which you manage a group of 'raiders' while they try to best ever stronger monsters and "raid" (a term used in MMOs) dungeons for precious loot and other goodies. As the player, you only have indirect control over the raiders: you can change their equipment and abilities, build a home base town for them, roughly determine the activity they currently pursue, and otherwise make sure the raiders are prepared for what they want to do, but beyond that they are on their own. During combat they act alone, and you can't directly control them.

The game is gearing up for its beta release, and an (admittedly rough and buggy) alpha is playable on Kongregate, should you wish to try it out.

Old and new workflow, from the content creator’s perspective

In the currently publicly available version of the game, all characters use ordinary spritesheet animation. Here's a sheet section for the Warrior's animations (it might look blurred in your browser; the game's WebGL renderer displays it with a more pixelated look):


We licensed a character spritesheet from Oryx Design Lab and heavily modified it to suit our needs (by default it only contains idle animations), and also added some entirely new drawings, such as the dragon from the second raid. The workflow consisted of our artist drawing something in his favorite image manipulation program and then creating animation data (frame timings, etc.) for use inside the game, at first with a small custom tool and later with Spriter. Nothing complicated, but very time consuming. Since drawing each individual frame took so long, most animations have very few frames: with few exceptions, they consist of two or three. Of course, changing equipment was practically impossible with this setup. At most we could support multiple equipment sets per creature type, where everything (armor and weapons) would be changed at the same time. Creatively quite limiting, for us… and for the players. So we looked for something better.

Pixels with skeletal animation

In the new workflow, to support dynamic equipment parts, we create animations as skeletal animations in an animation tool (we use Spriter, but Spine for example would probably work as well), where body parts used in the various animations are split up into distinct pieces which are animated separately (more on that later).

These “equipment parts” can be changed individually at runtime. Here’s what that looks like in action by the way (all work in progress of course):

We soon discovered that this was a much quicker way to add animations in general, although initially we feared that animating body parts by interpolating them around would clash with the pixelated art style of the previous spritesheet animations. That would have been very problematic: smooth animations just wouldn't fit the lo-fi 16 bit look the entire game has.

Fortunately, we quickly found out that what determines the look of spritesheet animations is not the way the body parts actually move, but the very limited number of frames in each animation and the resulting choppy look. By making generous use of 'instant' position, rotation, and scale changes, the new animations are visually hard to distinguish from ones drawn frame by frame in a spritesheet, precisely because they look "choppy". Here's a comparison of an animation where everything is smoothly interpolated vs. the same animation where everything is just snapped into place:

Note that just limiting the frames per second during playback is often not enough, because that forces all frames to have the same duration: changes in the animation can then only occur on the boundary between two frames, and often more varied timings are required.

“Data Driven”?

We wanted new animations and equipment parts to "add themselves" to the game, without us having to change any code, and without extra steps in the production pipeline where we would, for example, have to edit some file to add a reference to a new equipment part.

To facilitate this, we ended up using a combination of simple naming schemes for the animations and for the files which represent the different body parts. Two things in particular seemed good about that idea: it was simple to implement, since we didn't have to write new editors or similar tools to take care of metadata, and everything the artist has to do to make new content data-driven, namely naming files and animations (albeit with a specific, easy to understand syntax), would already have to be done in a non-data-driven pipeline (where, for example, one of us coders adds new equipment parts to the code manually).

So how does all of this work?

In animation tools like Spriter and Spine, animations and their skeleton/bone definitions are grouped together as "entities" (or similar concepts), which are basically supposed to contain all animations for a particular character type. For example, we have the "basic_humanoid" entity, which contains all animations used for humanoids of varying sizes (like the raiders themselves):


As you can see, it contains various animation types for various weapon types. The naming here is important:

The entity name can be anything; it is used to hook up in-game entities with the defined animations.

The animation names themselves define the different weapon types for that entity type in the game. When you create a new animation and name it "attack_arbitraryname", you define a new weapon type "arbitraryname" which will have its own animations. The name "arbitraryname" will later be used to look up equipment types for that weapon. You can optionally create custom animations for a weapon type (such as idle_arbitraryname or walk_arbitraryname) if you want weapon type-specific animations; the system falls back to defaults if no specific animation can be found.

There are some extras to this naming scheme. For example, you can see "attack_1h2" in the list above: our weapon animations support multiple variations, which are used to make animations look less repetitive and are defined by adding a numeric suffix "x", where x is the number of the variation. In the example above, one-handed weapons ("1h") have two variations, and the game automatically alternates between them.

Not seen above are some extras like being able to define animations for specific weapons (by adding "@weaponname" as a suffix, for example "@inferiorkatana" or "@superioreuropeanbroadsword") or for specific entity sub-types (by adding "!warrior" or "!archer"). You can imagine anything you want here; I think a naming scheme on the animations themselves works well for anything that might differ on an animation-by-animation basis.
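
To make the scheme concrete, here is a minimal sketch of how such an animation name could be parsed. The function and the exact handling of the suffixes are illustrative, not our actual code:

```javascript
// Hypothetical parser for animation names like "attack_1h2", "idle_bow",
// "attack_2h@superioreuropeanbroadsword" or "walk_spear!archer".
function parseAnimationName(name) {
    var result = { action: null, weaponType: null, variation: 1, weapon: null, subType: null };

    // Split off the optional "!subtype" and "@weaponname" suffixes first.
    var subTypeSplit = name.split("!");
    if (subTypeSplit.length > 1) { result.subType = subTypeSplit[1]; }
    var weaponSplit = subTypeSplit[0].split("@");
    if (weaponSplit.length > 1) { result.weapon = weaponSplit[1]; }

    // What remains is "action_weapontypeN", e.g. "attack_1h2".
    var parts = weaponSplit[0].split("_");
    result.action = parts[0];                         // "attack", "idle", "walk", ...
    var typeToken = parts[1] || "";
    var match = typeToken.match(/^(.*?)(\d+)?$/);     // trailing digits are the variation number
    result.weaponType = match[1];                     // e.g. "1h"
    if (match[2]) { result.variation = parseInt(match[2], 10); }
    return result;
}

// parseAnimationName("attack_1h2") -> { action: "attack", weaponType: "1h", variation: 2, ... }
```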

The next part is the definition of the body parts an entity and its animations consist of. Here's an example of the body parts for the "attack_spear" animation:


All of these are connected to a bone hierarchy, but that is not important here; it is mostly about making animating as comfortable as possible for the artist, so I omit it. Once again the names are important: they, too, will later be used to look up textures for new equipment types.

Different animations can use arbitrarily different numbers and types of body parts, but body parts which represent the same thing (such as the head) share the same name across animations. This is important because when we switch out equipment parts in the game, that named body part is essentially replaced in every animation. If we had something like "head001" in one animation and "head002" in another, we would have no direct way of recognizing that both are semantically the same thing, unless we applied some more parsing shenanigans.

It's important to remember that body parts are not 'expected' in the code: our artist could add tentacles to basic_humanoids, and the game would automatically load them and support new equipment parts for the tentacle body part. This allows us to create all kinds of monster types with different body makeups without adapting the code that runs them at all.

The last question we have to answer is how to add new equipment parts. This is easy: the equipment parts are simply placed into special directories for each entity type and named using yet another simple naming scheme. The game's engine takes care of the rest. Here are two examples, for the "1h" weapon type and the "head" body part:


Their file system locations, including the directories, will look something like "animation_directory/basic_humanoid/head/head_xxxx.png", where "basic_humanoid" is taken from the entity name and "head" from the body part name. The game uses this information to locate the correct texture for a body part when you, say, equip the "redarmor" helmet for the "head" body part. In this screenshot (and for all other body types we currently have), the file names also follow a syntax ("bodypartname_arbitraryname.png"), but that is optional: it's nice because the names stay unique even if we later decide to place all files into the same directory for whatever reason.
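
In code, resolving the texture for a given equipment part then boils down to simple string concatenation. A tiny sketch following that scheme (the helper name and base directory are placeholders):

```javascript
// Hypothetical helper: build the texture path for an equipment part, following the
// "animation_directory/<entity>/<bodypart>/<bodypart>_<equipment>.png" naming scheme.
function texturePathFor(entityName, bodyPartName, equipmentName) {
    return "animation_directory/" + entityName + "/" + bodyPartName + "/" +
           bodyPartName + "_" + equipmentName + ".png";
}

// texturePathFor("basic_humanoid", "head", "redarmor")
//   -> "animation_directory/basic_humanoid/head/head_redarmor.png"
```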

All animations are authored using some default armor set (in our case just the textures for the Warrior, and various default textures for the different weapons), but animation tools should be able to swap them out temporarily to check whether a body part fits (in Spriter, this is done with "character maps").

Note that doing it this way does in fact allow for grossly disproportionate body part textures (such as huge heads, or a sword whose handle is for some reason in a different position). The only thing you need to make that work is some way to define "origins" or "centers" for the textures where they are overlaid on the actual bone in the animation. In Spriter, we do this with the "pivot point" feature, which lets us set these origins on a per-file basis.

Making it run

Since we are now entering the code side of things, I want to mention that the 'data-driven' part of this system only concerns the creation of new content and making it available in the game for easy use. If you want, for example, an ingame item of a certain type to change the visual equipment part in a character's animation, you still have to take the internal definition of your item and hook it up with the correct animation sprite: a part of the system that says "when equipping this item, set the texture of the sword sprite to a new texture named 'sword_bloodysword.png'".

That is another part of content creation (creating items, in this example) and can also be solved in a data-driven manner. Explaining it is not part of this article, however. I may write about it in the future, because we have many simple and easy to use systems for creating new items, abilities, quests, 'timed stat changes' (buffs), etc. in the game, and those might be worth a read as well.

In explaining how we implemented this data-driven system, I'm not going to go into how we actually compute and display skeletal animations. If you want to replicate our system, Spine and Spriter (and probably other software, too) have runtime implementations available for most popular languages, so this should not be a problem. What I'm going to explain in the following is what kind of data we took out of our animation project, and what we did with it.

The data

Running the system in practice is not much more than keeping track of certain data and updating it when you change equipment. To set it up, we extract some basic information from the entire animation project and the file system:

  • entity and animation names
  • body part names
  • all available body part images
  • pivot points for all the body part images

We use this data to create several lookup tables for things that will be commonly accessed (two of them are sketched after this list), like

  • entity_type => [array of weapon types]
  • weapon type => ([list of weapon-specific animations],[list of creature-type specific animations])
  • weapon type => [list of animation variations for that weapon type]
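
As a rough sketch, building these tables is a single pass over the extracted animation names. The table and field names below are illustrative; the input is assumed to be a flat list of parsed animation records, for example produced by a parser like the one sketched earlier:

```javascript
// Hypothetical construction of two of the lookup tables from the extracted animation names.
// "animations" is assumed to be a list of { entity, action, weaponType, variation } records.
function buildLookupTables(animations) {
    var weaponTypesByEntity = {};     // entity_type => [array of weapon types]
    var variationsByWeaponType = {};  // weapon type => [list of animation variations]

    animations.forEach(function (anim) {
        if (anim.action !== "attack") { return; }  // weapon types are defined by attack animations

        var types = weaponTypesByEntity[anim.entity] = weaponTypesByEntity[anim.entity] || [];
        if (types.indexOf(anim.weaponType) === -1) { types.push(anim.weaponType); }

        var variations = variationsByWeaponType[anim.weaponType] = variationsByWeaponType[anim.weaponType] || [];
        if (variations.indexOf(anim.variation) === -1) { variations.push(anim.variation); }
    });

    return { weaponTypesByEntity: weaponTypesByEntity, variationsByWeaponType: variationsByWeaponType };
}
```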

Then we set up some data every time we create a new creature in the game (like the raiders, or monsters and NPCs). For each of these creatures we store

  • Rendering related data (like all the sprites for the different body parts)
  • The active weapon type
  • Default values for all the equipment parts. The default equipment parts are actually merged together as a single “set”, i.e. we store the “defaultEquipmentSet”: “archer” and when returning to default sprites the system uses head_archer.png, leftfoot_archer.png, rightfoot_archer.png, etc.
  • Default weapon type, and the default for the concrete weapon (e.g. default type “sword” and default weapon “firesword”)

In addition to the above, we store loads of additional data like animation durations, animation speeds, etc. But those are specific to our game. The data mentioned above should be more than enough to get the system running.
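
A per-creature record along those lines could look roughly like the following; the field names are made up for illustration and not our exact data layout:

```javascript
// Hypothetical per-creature animation state, set up once when the creature is created.
var creatureAnimationState = {
    entityType: "basic_humanoid",
    sprites: { head: null, torso: null, leftfoot: null, rightfoot: null, weapon: null }, // one sprite per body part
    activeWeaponType: "1h",
    defaultEquipmentSet: "archer",   // fall back to head_archer.png, leftfoot_archer.png, ...
    defaultWeaponType: "sword",
    defaultWeapon: "firesword"
};
```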

Changing equipment at runtime

The runtime part of the system exposes a function for changing the equipment part used for each body part or weapon. As parameters, the function takes the character to be modified, the body part name to be modified, and the equipment part type to be used, and then does its magic.
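
The original code isn't shown here, but a minimal sketch of such a function might look like this; loadTexture and updatePivotFor are hypothetical stand-ins for whatever your renderer and skeletal animation backend provide:

```javascript
// Hypothetical runtime function: swap the texture of one body part on one character.
function changeEquipmentPart(character, bodyPartName, equipmentName) {
    var path = "animation_directory/" + character.entityType + "/" + bodyPartName + "/" +
               bodyPartName + "_" + equipmentName + ".png";

    var sprite = character.sprites[bodyPartName];
    if (!sprite) { return; }              // this entity type has no such body part

    sprite.texture = loadTexture(path);   // assumed renderer call, not a real library API

    // If the new image has a different pivot/origin, the skeletal animation
    // backend needs to be told about it as well (see further below).
    updatePivotFor(character, bodyPartName, path);  // assumed helper
}

// Example: the item system equips the "redarmor" helmet.
// changeEquipmentPart(someRaider, "head", "redarmor");
```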


This function is then used by our item system when equipping an item that is supposed to change the look of an equipment part.

In the implementation of this function, we compute the actual filename of the respective image from the body part name and the name of the concrete equipment part, and replace the texture on the sprite that is used to draw that body part. In the game, when the player unequips all items on a character, the system goes back to the default values, which is why we store them when creating new creatures.

When changing equipment, it may be necessary to update certain data in whatever you use to actually run the animation computations. For example, if the pivots/origins/'centers' change with the updated images from new body parts, you need to tell that part of your system about it, because it might change transformation matrices or other internal state and otherwise result in buggy animations.

Running animations

The last important part of the runtime implementation is how we actually run animations. When playing a type of animation (for example, attack animations when a raider is fighting), the programmer doesn't specify the variation of the animation. Instead, we call high-level functions in the animation system like "animateAttack(character)" and "animateIdle(character)" which automatically take care of properly executing animations that represent what the programmer is trying to do. Internally, these functions do a number of things (a sketch follows the list):

  • If the animation has multiple variations, select one of them somehow (for example, uniformly at random).
  • Compute the actual animation name from the type of animation to be played (idle, walk, attack, etc.), the character's entity type, active weapon type, and active weapon.
  • Change the active animation in the internal skeletal animation system.
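
Sketched out, such a high-level function might look like the following. The fallback order, the helper functions and the lookup tables are illustrative rather than our exact implementation:

```javascript
// Hypothetical high-level animation call: pick a variation and resolve the final animation name.
function animateAttack(character, tables) {
    var weaponType = character.activeWeaponType;                           // e.g. "1h"
    var variations = tables.variationsByWeaponType[weaponType] || [1];     // e.g. [1, 2]
    var variation = variations[Math.floor(Math.random() * variations.length)];

    // Most specific candidate first, then fall back to the generic attack animation.
    var candidates = [
        "attack_" + weaponType + variation + "@" + character.activeWeapon, // weapon-specific
        "attack_" + weaponType + variation,                                // variation of the weapon type
        "attack_" + weaponType                                             // generic fallback
    ];
    for (var i = 0; i < candidates.length; i++) {
        if (animationExists(character.entityType, candidates[i])) {        // assumed lookup
            playSkeletalAnimation(character, candidates[i]);               // assumed backend call
            return;
        }
    }
}
```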

The last bit is the actual updating of the animation computations. Every rendering frame we call an update function in the animation system which takes care of the following (a sketch follows the list):

  • Animations may not need to run at 60 FPS. Skip some frames in that case.
  • Run the update of actual animation computations in the internal skeletal animation system.
  • Update the visibility of body part sprites depending on the active animation (these may also change during the animation itself). E.g. shields are not visible when a character is using a two-handed weapon.
  • Update Z-ordering
  • Take care of properly repeating and ending animations (idle and walk animations for example loop, while attacks typically do not)
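
A heavily simplified per-frame update along those lines could look like this; the helper functions and the bookkeeping fields are made up for illustration:

```javascript
// Hypothetical per-frame update of the animation system.
function updateAnimations(characters, deltaTime) {
    characters.forEach(function (character) {
        var state = character.animationState;

        // Animations may not need to run at 60 FPS: accumulate time and skip frames.
        state.timeSinceLastStep += deltaTime;
        if (state.timeSinceLastStep < state.animationStepTime) { return; }
        state.timeSinceLastStep = 0;

        stepSkeletalAnimation(character, state.animationStepTime); // assumed backend call

        updateBodyPartVisibility(character);  // e.g. hide the shield sprite for two-handed weapons
        updateZOrdering(character);

        // Loop idle/walk animations, end one-shot animations such as attacks.
        if (animationFinished(character)) {
            if (state.looping) { restartAnimation(character); }
            else { animateIdle(character); }
        }
    });
}
```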

And that is it! You now run a flexible animation system with dynamic equipment!

Using animation data for gameplay: Query at runtime, not at creation time!

Animations contain information that is used for gameplay, such as their duration (to make sure the character doesn't move around while executing an animation) or certain events in the animation (such as the point in time when an arrow is sent flying from the bow).

Before updating the animation system from our old spritesheet-only ways, we used to read that data once when animations were loaded, and then hand it to abilities (such as basic melee attacks, or ranged attacks where characters fire a projectile) on creation.

However, supporting multiple animation variations means that this information can change when a different variation is executed. You might have a quick sword stab that takes half a second, and a variation where the sword is swung like a cleaver which takes 1.5 seconds.

The solution, obviously, is to query this data every time an animation is executed, not just once at creation time. We solved this by maintaining some internal state in the animation system which tracks the current animation each character is executing, and then querying the durations and event timings from that.
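
In practice, this means ability code asks the animation system at the moment it plays the animation, roughly along these lines (the animation system API and the gameplay helpers are illustrative):

```javascript
// Hypothetical gameplay code: query timing data from the animation that is playing right now.
function executeRangedAttack(character, target) {
    animationSystem.animateAttack(character);  // internally picks one of the variations

    // Ask about the animation that was just started, instead of using a value
    // that was cached when the ability was created.
    var duration  = animationSystem.currentAnimationDuration(character);            // assumed API
    var shootTime = animationSystem.currentAnimationEventTime(character, "shoot");  // assumed API

    character.busyUntil = currentGameTime() + duration;                             // can't act until finished
    scheduleProjectileSpawn(character, target, currentGameTime() + shootTime);      // assumed helper
}
```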

An alternative solution (which we don't use, because so far we haven't needed it) is to give gameplay code the responsibility of picking animation variations. Abilities could use this to chain together specific animations for combos, for example. This was not an important feature for us, so we decided to hide variation selection behind the slightly higher-level animation system API to keep animation usage simple.

What our system doesn’t do (yet), and future work

For the future, there are various areas of interest I would like to explore for our animation system.

We could use animation blending to automatically create animations for activities such as idling and walking, by blending a default animation together with default weapon stances for each weapon type.

I would like to add dual wielding (e.g. carrying a dagger in the left hand and a sword in the right) to the engine at some point. This is non-trivial: if you want to allow arbitrary combinations of one-handed weaponry, you need some way to automatically transfer a weapon animation from one hand to the other, unless you want to create left- and right-handed versions of all weapon animations. Since that much extra content creation overhead would be unacceptable, most of this should be data-driven and procedurally generated.

We render separate sprites for each body part, which can be batched on most platforms. However, this is sometimes not possible (for example with HTML5 Canvas). Because we are going for a 'spritesheet' look anyway, a reasonable optimization for our use case might be to render the animations into atlas textures at a fixed framerate and later draw sections of that atlas in their entirety instead of rendering single body parts. I will have to investigate the performance results, which might turn out to be negligible if canvas engines end up doing proper batching in the background anyway. I will also have to investigate the effects on the visuals, since a fixed framerate might degrade the look in edge cases, and aliasing issues when rendering to and from textures might cause problems. Not to mention that using this approach for all characters in the game might lead to high texture memory requirements.

Closing words

If you have dealt with this before, I'd love to hear how you solved this problem in your games and whether you did anything differently. In particular, if you know about any cool stuff we could do with this new (or related) technology, or noticed that we horribly botched something at some point, be sure to let me know in the comments!


I made a small game


First of all, the game itself is up at gamejolt, kongregate, and our own site. If you want to try it out, you can play it at one of these places:

When you play the game on our own site it looks like this:


Your objective is to swap neighboring blocks so that they build chains in which the value difference between one block and its neighbor is +1 or -1 and the chain contains at least 3 blocks. After a chain is assembled, all blocks in it are destroyed, the remaining blocks fall down to fill the holes, and new blocks are spawned in the remaining empty slots. Players receive points depending on the number of blocks destroyed, capped at 100 points if the player manages to destroy 9 blocks in the same move. The scoring scales non-linearly, so it's important to try to build long chains instead of many small ones.
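
For clarity, the chain rule itself fits in a few lines; this is an illustrative check, not the game's actual code:

```javascript
// Hypothetical check whether a sequence of block values forms a valid chain:
// each neighbor differs by exactly +1 or -1, and the chain has at least 3 blocks.
function isValidChain(values) {
    if (values.length < 3) { return false; }
    for (var i = 1; i < values.length; i++) {
        if (Math.abs(values[i] - values[i - 1]) !== 1) { return false; }
    }
    return true;
}

// isValidChain([4, 5, 6])       -> true
// isValidChain([4, 5, 4])       -> true
// isValidChain([1, 2, 3, 2, 3]) -> true
// isValidChain([2, 4, 5])       -> false (difference of 2)
```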

The game ends when no more moves are possible that would form a new chain. To give an example of the chains: in the following image (taken from the tutorial), the blocks 4, 5 and 6 make a chain as soon as you swap the 4 with the block below it, but combinations like 4->5->4 or longer chains like 1->2->3->2->3 work as well.


Screenshot of the grid in tutorial mode. 4->5->6 will make a chain and disappear when you swap the 4 with the 0 below it

The following blog post goes into various aspects of the development of this game, taking you through the various implementation stages it went through, from prototyping to final release.


When I initially kicked some game ideas around in my head, I had a clear goal in mind: since I had not yet completed any actual games, the game should be very limited in scope, to make sure I could actually finish it. After playing 2048, I thought that surely there must be an infinite number of things you can do with that kind of number-game setup. Soon enough, I had conjured up a simple prototype in Löve2D, and I gave myself a time window of one month to bring the project to completion. The goal was to publish the game after the given deadline, even if things remained that could be improved. A major motivation was to get an idea of the impact a strict deadline has on the design and implementation process. The rest of this blog post will describe this process and hopefully dispense some wisdom to the interested reader.

Version History

First of all, let me give you a rundown of the various versions the game went through during development.

The total development time up until the final version (minus some hotfixes after "1.0" was pushed to the master branch) was about 6 weeks. At first I implemented a simple prototype in Löve2D, created to test whether what I had in mind was actually any fun to play:


And indeed the gameplay had some charm to it. While it certainly wasn’t the most complex game, it was on some level mesmerizing to kick off a chain of reactions and observe the blocks engage in their little dance. I quite liked the prototype and it seemed to fit the scope I was looking for.

The next step was to port the prototype to HTML5 and Javascript, since that was the platform I was targeting. I barely had any web development experience before this, and although I quickly made acquaintance with cross-browser compatibility problems, the port was finished within a couple of days and ran comfortably in the targeted web browsers (IE10+, Firefox, Chrome, Safari) and mobile browsers (Safari/Chrome).


After the port was done, I spent a week just fiddling around with some variations on the gameplay formula. For example, I reworked the game into a version with a much bigger grid of blocks, where destroyed blocks weren't replaced by new ones.


One of the derivatives of the original gameplay formula I tried out during the week-long experimental phase.

Various goal states were possible, such as destroying as many blocks as possible before a timer ran out, or before there were no possible moves left on the board.

However, I stuck with the original concept. It seemed simpler for new players to grasp, and the alternatives didn't add anything particularly mind-blowing to the formula, so I decided to keep it simple.

The following week was spent overhauling the look of the game, as I didn't want to expose the eyes of innocents to the spartan programmer-art look of the prototype. With some assistance from fellow company founder and artist-in-training Alexander, I came up with something already very similar to the final version:


The two main colors, green and silvery gray, were inspired by some internal mock-ups of our startup's logo that I'd seen, and seemed to fit the "calm & tranquil" feel of the game I was going for.

Then, a week before the initially planned release date, it was time to release the game as a beta to a close circle of family & friends. Feedback was mostly positive, and with the testers' help we uncovered many browser-compatibility bugs that could be fixed before the final release. I was actually quite surprised by the amount of feedback I received: almost every tester gave some, mostly in the form of bug reports, but also as suggestions for the game itself, some of which made it into the game.

I spent the remaining time until the planned release date implementing a simple banner ad at the bottom of the page, as well as Twitter & Facebook share buttons, which was much simpler than expected. I also added the current tutorial, and buttons for resetting the score and going back to tutorial mode.

Shortly before what would have been the release date of December 3rd, I realized that I wasn't happy at all with the pixelated look the game had on some devices. The initial versions were sometimes quite tiny on devices with high resolutions on small screens, so my response was to scale the contents of the site to cover an appropriate amount of the browser window. But on some devices, particularly maximized desktop browser windows (where 1080p resolutions are standard by now) and tablets with large screens but low internal resolution, the blocks as well as the rendered text looked a bit too pixelated for my taste, as can be seen in the following screenshot:


So I pushed the release date back by a week so Alexander and I could give the visuals another pass. The final result, of course, is what the game currently looks like, smoother looks and all:


Design (and its failings)

While the game design has remained the same since the earliest versions of the prototype, its biggest problem only became apparent after more extensive play testing in the later stages.

What I'm talking about is luck dependence. For me, a perfect game is designed in such a way that a skilled player always performs better than a bad player. In 9blocks, a player who patiently plans ahead instead of blindly connecting the first chain possibility they see will consistently score more points than a worse player. But this doesn't protect them from losing a single game to sheer bad luck. The result is a setup very similar to card games like Poker, where playing well gives you better results in the long run, but single games can often seem to depend entirely on luck.

The reason is the following: the fact that blocks fall down before new ones are spawned on top means that the bottom two rows of the board tend to converge on a stagnant state where no chain connections are possible. It's possible to delay this to a certain degree by carefully considering your moves, trying to keep a healthy mix of numbers in the lower rows, and working favorable blocks down from the top of the board without destroying them. But it cannot be prevented forever, eventually resulting in a situation where the only thing that can rescue you is a favorable series of block spawns in the top rows of the board.

This is perhaps my only regret with how I handled the development of the game. The week I spent trying out various spin-offs of the original formula should probably have been spent brainstorming and implementing ways to get rid of the luck-dependency problem.

In the visual department, one thing I wasn't able to solve was finding a way to visually assist the player in finding chains. I brainstormed some ideas, such as using various block shapes or colors to exploit the brain's ability to pre-consciously recognize features, but none of them panned out. The idea of using different shades of green for the numbers in particular seemed promising at first, but after actually implementing and testing it, I found that it looked plain ugly and didn't really help much at all.

The problem is that our visual system is mainly good at classifying and associating groups of things in our view, while the things 9blocks requires can't be grouped very well. I could shade 1, 2 and 3 in the same color to signify that these can make a chain, but that would remove any possibility of assisting the player in finding chains containing 3, 4 and 5. Also, it's better for the player's score to connect long chains, and all the ideas I came up with only seemed to help the player quickly find shorter ones. After a while I had to drop the concept of visually assisting the player, since I was running out of time.



Thanks to its simplicity, the tech layer of the game is pretty minimalistic as well. It uses pixi.js for rendering and tween.js for juicing up everything that moves. I also use howler.js for the background song, but after picking and licensing the song and playing with it for a while, I noticed that it just doesn't fit the game and is somewhat distracting during longer play sessions. For this reason I decided to leave it disabled by default; the user now has to activate it intentionally by clicking a button in the bottom left of the canvas. I have to admit that the music is only left in the desktop browser version because I feel bad about picking an unfitting song and paying money for it.

Also, on mobile devices the added download size (just over 1 MB for the .aac version and 500 KB for the .ogg version; AAC is required on some platforms which don't understand Ogg) proved very bothersome for the game's loading times on cellular data connections, so I removed audio from the mobile versions completely.

Overall I was pleasantly surprised by the state of HTML5 availability and canvas/WebGL performance across browsers and devices. The vast majority of builds I tested simply ran identically on all browsers out of the box. The cross-browser issues I found were mostly minor, and fixes were found quickly thanks to Google and sites like StackOverflow. I do think some bugs probably remain on some browser configurations, such as the various default browsers that ship with many Android devices. I found and fixed some incompatibilities with these during development, but unfortunately, with the variety of devices, OS versions and browser versions out there, it's hard to get full coverage in a small beta like the one I ran.

Why HTML5?

Of course, there remains the question of why I chose the HTML5 route at all. I chose it mainly because it seemed like an easy way to have a fully cross-platform game without the relatively long deployment & testing cycle I would get with native apps. In addition, a browser version of the game for desktop users was crucial to me. When I started the project I knew of many frameworks which handle cross-platform development between mobile and desktop applications very well, but I didn't want to force desktop users to install a desktop app to play the game. I was also aware that most mobile users prefer to install an app from the Google Play Store or the App Store once and then play it as a native application, so I felt I had to make a choice, and decided to go with HTML5.

Unfortunately, I was entirely unaware that frameworks like Cocos2d-JS exist (which I will use for my next project), which abstract away the platform differences almost entirely and allow deployment to the web and to native apps with few changes to the code base. It would also be possible to package 9blocks into native apps using frameworks like CocoonJS, but preliminary research revealed that it wouldn't be trivial and would require implementation time that I don't have, so the game remains web-only for now. That isn't meant to imply that this is somehow a bad option, of course, as the game is perfectly playable from the browser and the total download size for mobile users amounts to merely 300 kilobytes.

Regarding development process, coding style and software architecture

From past experience, I had the strong impression that software architecture actually matters very little in one-man projects of limited scope where you know the entire code base inside out.

Taking care to properly abstract things, so as to minimize the code you need to touch when implementing changes, saves you little or no time when the code in question is only used in two or three places in the code base. And small projects are often full of code that is in fact only used a single time. Abstraction in these kinds of projects also doesn't yield code that is easier to understand, since you already know everything anyway. I felt that the time taken to implement proper abstractions isn't justified by the time you save working with that code compared to "dirty" code.

Knowing this, I went into the project with the conscious goal of keeping the software architecture simple and the call stack as flat as possible. Sure enough, over the course of development the code changed quite a bit: the initial HTML5/JS port, although completely playable and pretty much finished gameplay-wise, weighed in at 600 lines of code, and grew to 1500 in its current form. This, and the many bug fixes during development, gave me plenty of opportunity to put my theory to the test. Would I be hindered by my reliance on "throwaway"-style code?

It turns out: not at all. The experience of working on the code has been extremely pleasant. Knowing the entire code base made it possible to identify the causes of bugs, and the places where I would need to implement changes for new features, almost immediately, and working on data in place instead of following it down various levels of abstraction made debugging very swift. New code was added in a quick and dirty manner, but the "dirty" never weighed in negatively, so all that was left were quick, testable results. In short, the amount of boilerplate code I had to work on felt absolutely minimal, and the lines I touched almost always had a direct effect on the end product. A satisfying experience, to say the least.

There was also no formal development methodology used during the development of this game. I did cut the month I gave myself into the parts mentioned in the version history section of this post, which was certainly tremendously helpful, but otherwise the methodology could only be described as "highly agile". I kept a to-do list of bugs and features left to implement, and pretty much arbitrarily picked things to work on based on my current mood. This worked out very well, as I never felt like anything in the process surrounding the actual coding was keeping me from getting things done, which I think is what development processes try to fix.

Overall, I do not regret the way I approached it at all, and would encourage others to adopt a similar style for small projects, as long as they work under conditions similar to mine.


Overall, I'm pretty happy with how development played out. Even though I didn't work full time on the project and had to juggle several of my master's degree projects and lectures on the side, I was able to (mostly) stick to the original timeline. I'm also satisfied with how the game turned out. Although it is by no means perfect, it is enjoyable, and I've certainly learned a lot from the experience.

Also, on a last note: if you have any feedback, wishes, or bug reports regarding this blog post or the game itself, make sure to mail them to social [[at]]; it would be much appreciated!

Bachelor Thesis: Realtime Diffuse Global Illumination for dynamic scenes

I wrote my bachelor thesis on global illumination, specifically on realtime implementations of diffuse GI algorithms for fully dynamic scenes (i.e. you can go bonkers with changing geometry, light sources, camera, etc.). The thesis isn't perfect and lacks some useful evaluation (because I ran out of time); for example, I didn't look at how multiple light sources affect performance. But you should still get a good picture of the performance, the quality of the resulting images, and an overview of the algorithms themselves.

The file below contains the VC++ project and all the source files. There is a readme in the folder describing what you need to build it, and where to look in the code for the parts that will probably interest you the most. It should be compilable on Linux/Mac/etc. as well, since (as far as I can recall) I don't use any platform-specific code, but it lacks a cross-platform build system (e.g. CMake).

The file also contains the compiled application for win32 platforms, along with the assets of the Crytek Sponza scene used (slightly changed, since the original files were buggy). There's a readme in the app folder that explains how to use the application (there are a few hotkeys involved).

And finally, it contains the thesis itself. A few of the references broke when I converted it to PDF, and I don't know how to fix them: it inserted the titles of referenced chapters and, for figure references, the text under the figures, which is quite annoying where it happened. But it's only a couple of them; the rest should be fine.


Feel free to shoot me any questions you have about the thesis, or the code, or anything else.

WordPress doesn't support uploading .zip files, so I renamed the archive to .pdf. Just change it back to .zip in order to unpack it.

Memory optimizations for vertex attributes

When searching for general optimization techniques on the net, you're often told to keep memory bandwidth usage at all stages of the rendering pipeline as low as possible. This post is specifically about reducing the bandwidth used when streaming vertex attributes to the vertex shader. I go into detail for a hypothetical OpenGL 3.x+ ("Desktop GL") implementation of the changes, but they should be analogous in DirectX or OpenGL ES.


Your vertex shader will require a number of input attributes, depending on what the shader does. In my example, I'm using attributes that are probably common for vertex shaders that set up meshes for the fragment/pixel shader. It has the following inputs:

  • Position (vec3)
  • Normal (vec3)
  • Texture coordinates (vec2)
  • Tangent of normal for normal mapping (vec3)
  • Bitangent of normal for normal mapping (vec3)

These will probably be different for you. For example, many implementations don’t hand over tangents and bitangents as attributes, as it’s also possible to calculate them on the fly in the shader. You may also have additional attributes, such as per-vertex material IDs or something else that is required by your rendering pipeline. But these will do for the example.

For each distinct vertex, your graphics device has to stream all necessary data for that vertex from far-away memory to memory that's nearer to the shader execution units (how exactly this is done and what types of memory are involved depends on the hardware). This naturally consumes memory bandwidth, which is limited, and there's also some latency associated with actually executing a fetch from main memory. Note that I said "for each distinct vertex": if you use indexed drawing and a vertex is used multiple times, your hardware implementation probably caches the required data somewhere so it doesn't need to be fetched from far-away memory multiple times.

The problem then, simply put, is to somehow reduce the required memory bandwidth and memory fetches to improve performance of rendering.

Strategy 1: Reduce size of stored attributes

One way to reduce memory bandwidth is to reduce the size of the actual data stored in memory for the vertex attributes. Remember, getting data to the vertex shader is usually done like this (a sketch follows the list):

  • Create vertex buffer objects with glGenBuffers() and bind them with glBindBuffer(GL_ARRAY_BUFFER, …)
  • Fill those buffers with data (e.g. glBufferData(…))
  • Before rendering, tell OpenGL to use that data (and how to use it) with glVertexAttribPointer() / glVertexAttribIPointer() / glVertexAttribLPointer()
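
For reference, here is a minimal sketch of that setup for a single position attribute (a VAO is assumed to be bound, error handling is omitted, and the attribute location is arbitrary):

```cpp
// Minimal sketch: create a vertex buffer, upload position data, and describe it to OpenGL.
#include <GL/glew.h>   // or whichever GL function loader you use
#include <vector>

struct Position { float x, y, z; };

GLuint createPositionBuffer(const std::vector<Position>& positions)
{
    GLuint vbo = 0;
    glGenBuffers(1, &vbo);                                   // create the buffer object
    glBindBuffer(GL_ARRAY_BUFFER, vbo);                      // bind it as a vertex buffer
    glBufferData(GL_ARRAY_BUFFER,
                 positions.size() * sizeof(Position),
                 positions.data(), GL_STATIC_DRAW);          // upload the data

    // Attribute 0: three 32 bit floats per vertex, tightly packed.
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, sizeof(Position), nullptr);
    return vbo;
}
```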

You supply the data type of the stored vertex attribute to glVertexAttrib*Pointer(), and logically you can reduce memory bandwidth if you manage to send, for example, a GL_HALF_FLOAT instead of a GL_FLOAT to the shader. The memory bandwidth for that specific attribute is cut in half, and the number of memory fetches also goes down, since a fetch doesn't pull a single attribute at a time but a fixed number of bytes (e.g. one fetch could get 64 bytes out of memory; the exact size depends on your hardware), and now more attributes fit "into one fetch".

It's important to note how the data specified on the client side (C++ etc.) actually arrives in the shader. Vertex attributes, as of OpenGL 4.4, can only be of these data types:

  • 32 bit floats
  • 64 bit floats (double)
  • 32 bit int and uint
  • vectors of the above types (e.g. vec3 for three 32 bit floats, uvec2 for two unsigned integers, etc.)
  • matrices of 32 bit floats

This means that even if you hand over shorts or bytes with the respective glVertexAttribPointer() calls, they will be converted to (unnormalized, unless specified otherwise) 32 bit floats before the shader executes, to 32 bit integers if you use glVertexAttribIPointer(), or to 64 bit doubles if you use glVertexAttribLPointer(). As far as I know, this conversion is basically free in terms of performance.

Solution 1: Use the smallest data type that fits your data

In my example application, all values of all attributes were 32 bit floats, because they were naively copied into vertex buffers from the memory my model loader provides.

This is a waste, however, because not all data requires 32 bits of precision to carry its information. For texture coordinates, for example, the number of bits you need only depends on the size of the biggest texture in your scene. A 16 bit (unsigned) integer can represent all numbers from 0 to 65535, which means that two 16 bit integers can address texture coordinates for textures up to a resolution of 65536×65536. For the vast majority of rendering applications (maybe with exceptions such as megatextures), this is more than enough. In your typical game, the largest texture you're going to find is probably no bigger than 4096×4096.

So for texture coordinates, instead of storing a lot of 32 bit floats in my vertex buffer, I used 16 bit half floats and changed the type parameter in glVertexAttribPointer to GL_HALF_FLOAT instead of GL_FLOAT. I didn't have to change anything in the vertex shader, as OpenGL handles the conversion for you.

Note: I use GLM, which has a "half" data type representing 16 bit floating point values. Half floats are usually not a native data type in languages such as C++.
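
As a rough sketch of what this change looks like for texture coordinates, assuming a GLM version that provides glm::packHalf2x16 (any other float-to-half conversion works just as well):

```cpp
// Sketch: store texture coordinates as 16 bit half floats and tell OpenGL about it.
#include <GL/glew.h>
#include <glm/glm.hpp>   // glm::vec2 and the GLSL packing functions (glm::packHalf2x16)
#include <cstdint>
#include <vector>

GLuint createTexCoordBuffer(const std::vector<glm::vec2>& uvs)
{
    // Pack each vec2 into two 16 bit halves; one 32 bit word holds both components.
    std::vector<std::uint32_t> packed(uvs.size());
    for (std::size_t i = 0; i < uvs.size(); ++i)
        packed[i] = glm::packHalf2x16(uvs[i]);

    GLuint vbo = 0;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, packed.size() * sizeof(std::uint32_t),
                 packed.data(), GL_STATIC_DRAW);

    // The only client-side change: GL_HALF_FLOAT instead of GL_FLOAT.
    // The shader still declares a plain vec2; OpenGL converts during the fetch.
    glEnableVertexAttribArray(2);   // attribute location 2 is just an example
    glVertexAttribPointer(2, 2, GL_HALF_FLOAT, GL_FALSE, sizeof(std::uint32_t), nullptr);
    return vbo;
}
```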

Fiddling around with different combinations, I found that 16 bit data types were enough (as in: no difference in the output images) for texture coordinates, normals, tangents and bitangents. Changing vertex positions to 16 bit caused positioning artifacts, as some positions in the model used (Crytek Sponza) shifted a little due to the lower precision. Depending on your other attributes, some may do fine with 8 bit data types as well (for example, vertex colors or material IDs).

Testing this on my laptop with a GTX 555m, rendering the Sponza scene (which has about 200k vertices) with a straightforward Blinn-Phong lighting model and standard shadow mapping went from about 50 to about 80 FPS, so it was definitely worth it.

Strategy/Solution 2: Don’t use interleaved vertex attributes

There are essentially two ways to store vertex data: interleaved and planar. Interleaved means that the different vertex attributes are not separated by type in the vertex buffer, but follow each other in a repeating pattern. For positions, normals, and texture coordinates, an interleaved buffer would look like

P0 N0 T0 P1 N1 T1 P2 N2 T2 …

and a planar buffer would look like

P0 P1 P2 … N0 N1 N2 … T0 T1 T2 …
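
In terms of the glVertexAttribPointer() calls, the difference between the two layouts is only in the stride and offset parameters. A sketch with just positions and normals (attribute locations are arbitrary):

```cpp
// Sketch: describing the same position/normal data to OpenGL in the two layouts.
#include <GL/glew.h>
#include <cstddef>

// Interleaved: one buffer containing "P0 N0 P1 N1 ..."; both attributes share the same stride.
void setupInterleaved(GLuint vbo)
{
    const GLsizei stride = 6 * sizeof(float);   // 3 floats position + 3 floats normal per vertex
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, stride, (const void*)0);
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, stride, (const void*)(3 * sizeof(float)));
}

// Planar: one buffer containing "P0 P1 ... Pn N0 N1 ... Nn"; each attribute block is tightly packed.
void setupPlanar(GLuint vbo, std::size_t vertexCount)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glEnableVertexAttribArray(0);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, (const void*)0);
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 0,
                          (const void*)(vertexCount * 3 * sizeof(float))); // normals start after all positions
}
```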


When using the two layouts with shaders that use all attributes (such as mesh rendering), the performance of both versions was identical. However, there was a slight speedup on my PC (GTX 770) with a shader that only requires a subset of the data: my shadow map renderer only needs positions and normals, and performance there went from 210 to 220 FPS. The difference would probably be larger in an application that is memory bandwidth bound (which I wasn't with the Crytek Sponza scene), but even though the change in speed is tiny (210 to 220 FPS is only about 0.2 ms of frame time, or in other terms, not even enough to go from 60 to 61 FPS), you have to consider that the change is practically free (not a lot of code changes required).

The question is: why would the planar format be faster when used with shaders that don't require all attributes?

Well, I have no proof, but I assume it's because of the way data is fetched from memory. As already mentioned, when your graphics hardware accesses memory, it doesn't just pull out single floats or integers, or even worse, single bytes. Instead it operates on cache lines and pulls a larger number of bytes out of memory with a single fetch.

So, for example, if the cache line size is 64 bytes and you only use the positions out of positions, normals and texture coordinates (3×4 bytes, 3×2 bytes and 2×2 bytes = 22 bytes per vertex) stored in an interleaved format, the GPU pulls about 3 positions per fetch and discards the rest of the data (since you don't need it). With a planar format, where all position attributes are next to each other in memory, the GPU fetches about 5 position attributes per memory access. Because a decent chunk of memory bandwidth is wasted in the first case (and there's probably also more latency due to more fetches, even in spite of the latency-hiding mechanisms GPUs use), the planar layout performs a little better.


This might depend on the rendering hardware used, but on my PC (GTX 770) and laptop (GTX 555m) the results were the same. They might differ drastically on other architectures, such as consoles or mobile hardware, but somehow I doubt it (I can't think of a reason why it would be different there).

Other strategies

There are other strategies to further reduce memory and bandwidth usage for vertex attributes. For example, as mentioned, it is possible to compute tangents and bitangents in the shader instead of calculating them "offline" and streaming them to the shader. This might even be faster, since GPU ALUs are obscenely fast while memory performance (both bandwidth and size) hasn't improved at nearly the same rate, and it would free up memory bandwidth and space for other rendering passes which dearly need them (such as the lighting passes in deferred shading). If you don't need the extra precision, normals may be stored in an 8 bit format, and texture coordinates can be stored in an 8 bit format as well for meshes which only use textures smaller than 256×256.

But alas, I have not found the time to implement and test further changes. Those might be the content of future blog posts.


Using smaller data types for appropriate data improved rendering performance on my test hardware (GTX 555m) with the test scene (Crytek Sponza) from about 50 to about 80 FPS (a ~60% increase). No difference was measurable on the beefier machine (GTX 770), because its bottlenecks were in entirely different places.

Using a non-interleaved (planar) vertex attribute layout in memory improved frame times by about 0.2 ms on the GTX 770 and about 0.6 ms on the GTX 555m.