Back to Optimizing

It's been a while since I last wrote, but I have a good excuse: I've been focusing on creating the Android version of my game. The good news is that everything is working now. It runs on Android 2.3+, with a slightly improved version for Android 4.4+ (it shows in full-screen). What I'm working on now is optimizing it to get the performance I want. I've written articles about optimization in the past, so I'll keep this short and to the point of what I've done so far.

Measure, Measure, Measure

I've written about this in the past, but I've found that when you're trying to optimize something it is extremely important to measure often and to measure enough. Also, optimizing for a narrowly focused measurement doesn't always result in improvements in the real environment.

To do a better job at measuring I've been using histograms of the speed of what I'm measuring, which give a better picture than just an average speed. I'm also using the profiler functionality in Visual Studio, but I can only use that for the PC version of my game, not directly against the Android binary. Still, this led to useful insights.
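As a rough illustration of why a histogram says more than an average, here is a minimal sketch that buckets frame-rate samples. The class name `FrameHistogram`, the 5 FPS bucket width, and the sample values are all made up for the example; none of them come from the game's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FrameHistogram
{
    // Groups frame-rate samples into fixed-width buckets keyed by each bucket's lower bound.
    public static SortedDictionary<double, int> Build(IEnumerable<double> fpsSamples, double bucketWidth)
    {
        var buckets = new SortedDictionary<double, int>();
        foreach (double fps in fpsSamples)
        {
            double lower = Math.Floor(fps / bucketWidth) * bucketWidth;
            buckets.TryGetValue(lower, out int count); // count stays 0 for a new bucket
            buckets[lower] = count + 1;
        }
        return buckets;
    }

    public static void Main()
    {
        // Hypothetical samples: mostly steady, with one slow frame.
        double[] samples = { 36.5, 41.2, 43.9, 44.1, 42.7, 18.3, 43.0 };
        Console.WriteLine("average: {0:F1}", samples.Average());
        foreach (var bucket in Build(samples, 5.0))
            Console.WriteLine("{0,3}-{1,3} FPS: {2}", bucket.Key, bucket.Key + 5.0, new string('#', bucket.Value));
    }
}
```

The single slow sample barely moves the average, but it shows up as its own bar in the histogram.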

Two Optimizations

When I profiled my code I found two areas I thought I could improve relatively easily. The first was an unnecessary calculation when I manipulate (translate or rotate) quads. I was recalculating the center of the quads even though that wasn't necessary for these operations. This was pretty straightforward to improve and resulted in a significant speedup. You can see this in the picture to the right: PC-L-SS is the original version, and PC-L-SS-Opt Mutable is this first optimization.

After the above I ran the profiler again and determined that a significant amount of time was being spent calculating sines and cosines. I did some Google searching and found an interesting discussion that describes a faster way to calculate sine (the source of this information contains more details). So, I implemented this alternative sine calculation and measured again. This resulted in the PC-L-SS-FastSin numbers in the histogram.
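The post doesn't say which approximation it adopted, so the following is only an illustration of the general idea, using a widely circulated parabolic approximation of sine. The names `FastMath`, `FastSin`, and `FastCos` and the blend constant 0.225 belong to this sketch, not necessarily to the game's code.

```csharp
using System;

public static class FastMath
{
    private const double TwoPi = 2.0 * Math.PI;

    // Parabolic sine approximation: a parabola through sine's zeros and peaks,
    // blended with its own square to reduce the error to roughly 0.001.
    public static double FastSin(double x)
    {
        // Wrap the angle into [-pi, pi).
        x -= TwoPi * Math.Floor((x + Math.PI) / TwoPi);

        const double B = 4.0 / Math.PI;
        const double C = -4.0 / (Math.PI * Math.PI);
        const double P = 0.225;

        double y = B * x + C * x * Math.Abs(x);
        return P * (y * Math.Abs(y) - y) + y;
    }

    // Cosine is sine shifted by a quarter turn.
    public static double FastCos(double x) => FastSin(x + Math.PI / 2.0);

    public static void Main()
    {
        for (double angle = 0.0; angle <= Math.PI; angle += Math.PI / 8.0)
            Console.WriteLine("{0:F3}: fast {1:F4}, exact {2:F4}", angle, FastSin(angle), Math.Sin(angle));
    }
}
```

The trade-off is accuracy for speed: an error of about one part in a thousand is usually invisible in sprite rotation, but whether it is actually faster than `Math.Sin` on a given device is something to measure rather than assume.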

The average FPS numbers for the above are: original = 1,114, first optimization = 1,380, and second optimization = 1,411. Note that this is just running my game logic code, but not actually rendering via OpenGL. I did this in order to focus as much as possible on the area I was optimizing.

But What About Android?

So, the above is interesting and all, but I'm trying to optimize Android, not my PC version. Since I can't use the profiler there, I can at least run the same measurements. This resulted in the histogram to the right.

This clearly shows that both the first optimization (Opt Mutable) and the second, additional optimization (Opt Sine) are significantly faster than the original. To summarize this even more, the average FPS for each are: original = 37.29, first optimization = 42.95, second optimization = 43.45. In this case the numbers include both the game logic and the actual rendering. Because I'm trying to evaluate the true game performance, in my attempt to get to 60 FPS, I thought this was more useful.
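For a sense of how far that is from the goal, it can help to convert each rate into time per frame. This is just arithmetic on the averages quoted above, rounded to one decimal:

$$ t_{frame} = \frac{1000 \text{ ms}}{FPS} $$

$$ \frac{1000}{37.29} \approx 26.8 \text{ ms}, \quad \frac{1000}{42.95} \approx 23.3 \text{ ms}, \quad \frac{1000}{43.45} \approx 23.0 \text{ ms}, \quad \frac{1000}{60} \approx 16.7 \text{ ms} $$

So reaching 60 FPS means trimming roughly another 6 ms from each frame.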

Still More Work to Do

While I'm pretty happy with the results of this initial optimization, it's still not performing as well as I want. The Android measurements above were done on a Nexus 4, and I'd really like it to run at 60 FPS with some headroom for slower devices. My profiler investigations have given me some ideas for more optimizations, but I expect them to be more difficult than the above.


And...my first AppStore update

I pushed a minor update to Zoing to the AppStore on Saturday and it just went live. It's not particularly interesting in terms of changes to the game, but it's exciting for me given this is my first one ever.

The actual update just rearranges the icons on the start screen to try to make it easier to use. Now I'm working on Android support, so I don't expect any new iOS updates until after the Android version is released. Although, before then, I expect the ad-supported free version for iOS to become available.



This post is a little delayed, but I finally released Zoing to the App Store about two weeks ago! It's available in all countries. Please check it out and write a nice review; 5 stars would also be great.

I also just submitted the free, ad-supported version to the App Store two days ago. Apple typically takes about 10 days to review new apps, so I'll announce it here when that's available. It's exactly the same game, but shows ads on some of the screens, although not on the main game screens.


Who stole my pixels?

Moving away from the overly detailed discussion on alpha layering from my last post, today I want to briefly discuss developing for iOS and taking advantage of the full resolution of retina devices. Although I knew some care was needed in order to use the full high-resolution retina display, for some reason I assumed this was already happening. Recently, however, I realized that wasn't true. A little Googling found lots of info on this, and it all comes down to scaling between points and pixels.

Scaling the heights

There are many discussions on this topic, so I won't go into too much detail, but I did want to mention how I addressed the problem. Briefly, for those who don't know, when Apple released their first high-resolution retina devices they did it in a careful way that allowed all existing apps to still run and look good. Basically they added a new property of the screen that defines what its scale is. The official Apple documentation for UIScreen.Scale says:

This value reflects the scale factor needed to convert from the default logical coordinate space into the device coordinate space of this screen. The default logical coordinate space is measured using points. For standard-resolution displays, the scale factor is 1.0 and one point equals one pixel. For Retina displays, the scale factor is 2.0 and one point is represented by four pixels.

To use this scale I needed to do three things:

  1. Set the ContentScaleFactor of my EAGLView, which is a subclass of iPhoneOSGameView, to the same scale so that my rendering uses the entire high-resolution screen
  2. Tell my game the full high-resolution size of the screen
  3. Multiply all of my coordinates by the defined scale factor

For the first item I both get and remember the scale factor and set the ContentScaleFactor to it in the initialization of my EAGLView. Because I want my game to run on older devices with older versions of iOS I need to first check that the new scale property exists and then only reference it if it does. The purpose of ContentScaleFactor is defined by Apple as:

The scale factor determines how content in the view is mapped from the logical coordinate space (measured in points) to the device coordinate space (measured in pixels). This value is typically either 1.0 or 2.0. Higher scale factors indicate that each point in the view is represented by more than one pixel in the underlying layer. For example, if the scale factor is 2.0 and the view frame size is 50 x 50 points, the size of the bitmap used to present that content is 100 x 100 pixels.

The default value for this property is the scale factor associated with the screen currently displaying the view. If your custom view implements a custom drawRect: method and is associated with a window, or if you use the GLKView class to draw OpenGL ES content, your view draws at the full resolution of the screen. For system views, the value of this property may be 1.0 even on high resolution screens.

In general, you should not need to modify the value in this property. However, if your application draws using OpenGL ES, you may want to change the scale factor to trade image quality for rendering performance. For more information on how to adjust your OpenGL ES rendering environment, see “Supporting High-Resolution Displays” in OpenGL ES Programming Guide for iOS.

Hmm, reading over this now I just noticed that it says I shouldn't have to adjust the value of ContentScaleFactor. I'll need to go back and look at my code again.

Here's the code that does that:

_screenScale = 1;
if(UIScreen.MainScreen.RespondsToSelector(new Selector("scale"))
   && RespondsToSelector(new Selector("contentScaleFactor")))
    _screenScale = ContentScaleFactor = UIScreen.MainScreen.Scale;

For the second item I simply multiply the screen size by the scale I determined above via this code snippet: new Size(Frame.Size.Width, Frame.Size.Height) * _screenScale.

Finally, for the third item, I multiply by the scale again when I handle touch events, since touch locations are also reported in points.

/// <summary>Called when touch events end.</summary>
/// <param name="touches">Touches.</param>
/// <param name="e">The touch event.</param>
public override void TouchesEnded(NSSet touches, UIEvent e)
{
    base.TouchesEnded(touches, e);
    UITouch touch = (UITouch)touches.AnyObject;
    _lastTouch = touch.LocationInView(this);
    _screenManager.Screen.HandleTouch(new Point(_firstTouch.X, _firstTouch.Y) * _screenScale, new Point(_lastTouch.X, _lastTouch.Y) * _screenScale, true);
}

/// <summary>Called when a touch event moves.</summary>
/// <param name="touches">Touches.</param>
/// <param name="e">The touch event.</param>
public override void TouchesMoved(NSSet touches, UIEvent e)
{
    UITouch touch = (UITouch)touches.AnyObject;
    _lastTouch = touch.LocationInView(this);
    _screenManager.Screen.HandleTouch(new Point(_firstTouch.X, _firstTouch.Y) * _screenScale, new Point(_lastTouch.X, _lastTouch.Y) * _screenScale, false);
}

All of the above code happens in my iOS specific EAGLView class and the rest of my code needed no adjustments.
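Stripped of the iOS types, the conversion in all three steps is one multiplication. The sketch below is a stand-alone illustration; `RetinaScale`, `ToPixels`, and `PixelsPerSquarePoint` are hypothetical names, and the 320 x 480 point screen is just an example size.

```csharp
using System;

public static class RetinaScale
{
    // Converts a coordinate measured in points to device pixels.
    public static double ToPixels(double points, double screenScale) => points * screenScale;

    // Number of device pixels covering one square point (2 x 2 = 4 at scale 2.0).
    public static double PixelsPerSquarePoint(double screenScale) => screenScale * screenScale;

    public static void Main()
    {
        double scale = 2.0; // the Retina scale factor quoted from Apple's documentation
        Console.WriteLine("320 x 480 points -> {0} x {1} pixels", ToPixels(320, scale), ToPixels(480, scale));
        Console.WriteLine("pixels per point: {0}", PixelsPerSquarePoint(scale));
    }
}
```

This is why the documentation can say both that the scale factor is 2.0 and that one point is represented by four pixels: the factor applies along each axis.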

Next: coming soon...


Alpha layering

In a recent post I discussed premultiplied alpha in textures. After that I thought I had my alphas under control, but it turns out I was wrong.

Keeping score

Early in my game development I came up with a slightly interesting way to render text. In the game, text is used only to show the game score and is generally quite large on the screen. The score advances every frame, or every 1/60th of a second. Because that changes too quickly for the eye to follow, I truncate the last digit and only render the remaining ones. So if the score advances frame-by-frame as 137 → 138 → 139 → 140 → 141, I would actually render it as 13 → 13 → 13 → 14 → 14. Even so, the rendered score visibly advances every 10 frames, or every 1/6th of a second. This is still pretty rapid, and I wanted to visually smooth that quickly changing score when it's rendered on the screen.

50 shades of grey

The idea I settled on was to keep track of the last N instances of the score each time a frame is rendered. In the above example, if N was 5, that would be the list of numbers shown. One frame later the list would change to 13 → 13 → 14 → 14 → 14, and 8 frames after that it would become 14 → 14 → 14 → 14 → 15. After keeping track of these last N scores I then rendered them all, with the oldest score being very transparent and each successive score being more and more opaque. This gives an effect like the one shown in the image to the right. That image actually shows the score as it is also being rotated, which makes the separate layers easier to see. In most cases the score is not rotating, so each layer is stacked directly on top of the previous one.
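The bookkeeping for this can be sketched with a fixed-size queue. This is a minimal illustration rather than the game's code: `ScoreTrail`, `Record`, and `Snapshot` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public sealed class ScoreTrail
{
    private readonly Queue<int> _recent = new Queue<int>();
    private readonly int _capacity;

    public ScoreTrail(int capacity) { _capacity = capacity; }

    // Records the score for one rendered frame, dropping its last digit first.
    public void Record(int score)
    {
        _recent.Enqueue(score / 10);
        if (_recent.Count > _capacity)
            _recent.Dequeue(); // forget the oldest frame
    }

    // Oldest first, so later entries can be drawn more opaque.
    public int[] Snapshot() => _recent.ToArray();

    public static void Main()
    {
        var trail = new ScoreTrail(5);
        for (int score = 137; score <= 141; score++)
            trail.Record(score);
        Console.WriteLine(string.Join(" -> ", trail.Snapshot())); // prints "13 -> 13 -> 13 -> 14 -> 14"
    }
}
```

Each frame the renderer would walk the snapshot from oldest to newest, drawing every entry with a higher alpha than the one before it.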

This layered transparency effect worked out well and I've had it in place since quite early in the game development. The way I actually implemented this was to take my desired alpha and figure out how many fractional shades of that I would need such that when I layered them all together I would get my desired alpha. For example, the score shown during a game is dimmed so as not to distract from game play and has an intended alpha of 0.25. After the game, when showing the final score, it is brightened to full opacity, or alpha 1.0. If I have 5 layers then I decided to break the desired alpha into 15 fractions, since 1 + 2 + 3 + 4 + 5 = 15. This is also \( \frac{N \times (N+1)}{2} \) or \( \frac{N^2 + N}{2} \). So, for the dimmed alpha of 0.25 I calculated the fractional alpha as \( \frac{0.25}{15} = 0.0167 \). Then I rendered the oldest score layer with an alpha of 0.0167, the second oldest as \( 0.0167 \times 2 = 0.0333 \), etc., up to the newest score as \( 0.0167 \times 5 = 0.0833 \). When all of these were layered on top of one another it produced the desired 0.25 alpha...or so I thought.

One problem I periodically ran into was that if I adjusted a parameter related to the transparency, like making the text slightly lighter or darker, or increasing or decreasing the number of layers, then the resulting aggregated color often ended up being not what I expected. Recently, and after improving my understanding of alpha blending as described in my previous post, I finally spent the time to dig into this and understand it properly. The problem was that my simple math was incorrect and does not represent how layered alphas work.

In the case of my text, I'm layering a pure white opaque texture onto some existing background. For the texture, being pure opaque white, each white pixel is (1, 1, 1, 1). Because I want to dim this, as described above, I adjust the alpha to 0.0167 to get (1, 1, 1, 0.0167). To figure out how this blends with the background I return to the alpha formula I described in my previous post, which said regarding GL.BlendFunc(BlendingFactorSrc.One, BlendingFactorDest.OneMinusSrcAlpha);: for the source color use it as is; and for the destination color take the alpha component of the source, subtract that from one, and multiply that by each component of the destination.

Diminishing returns

Let's do the math. The source is (1, 1, 1, 0.0167) and, to keep things simple, the destination is opaque black, or (0, 0, 0, 1). So, "for the source color use it as is" means just keep (1, 1, 1, 0.0167). Next, "for the destination color take the alpha component of the source, subtract that from one, and multiply that by each component of the destination" means we take the source alpha, 0.0167, subtract that from 1, \( 1 - 0.0167 = 0.9833 \), and multiply that by the destination, which gives us (0, 0, 0, 0.9833). Now, the actual screen doesn't have an alpha component, it just shows colors, so to get a real color we multiply the alpha by each component and then add the results: (1, 1, 1, 0.0167) becomes (0.0167, 0.0167, 0.0167), (0, 0, 0, 0.9833) becomes (0, 0, 0), and adding those two gives us (0.0167, 0.0167, 0.0167). That's very close to black, but that's OK; it's just 1 of 15 shades we're applying. Let's do another.

The source is the same, (1, 1, 1, 0.0167), or multiplied out as (0.0167, 0.0167, 0.0167). The destination is different this time, since we're layering on top of what we just calculated, which is (0.0167, 0.0167, 0.0167, 0.9833), or multiplied out as (0.0164, 0.0164, 0.0164). Adding these gives us (0.0331, 0.0331, 0.0331). Hmm, that still looks awfully black, but it's just two of 15 shades.
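The two steps above can be reproduced per color channel in a few lines. This sketch assumes the white source is premultiplied by its alpha before blending, which is one reading of the "multiplied out" values in the text; `AlphaBlend`, `Blend`, and `LayerWhite` are hypothetical names.

```csharp
using System;

public static class AlphaBlend
{
    // One color channel blended with (One, OneMinusSrcAlpha):
    // result = source + (1 - sourceAlpha) * destination,
    // where the source channel is already premultiplied by its alpha.
    public static double Blend(double premultipliedSource, double sourceAlpha, double destination)
        => premultipliedSource + (1.0 - sourceAlpha) * destination;

    // Layers white (channel value 1) at the given alpha over a destination channel.
    public static double LayerWhite(double alpha, double destination)
        => Blend(1.0 * alpha, alpha, destination);

    public static void Main()
    {
        double alpha = 0.25 / 15.0;
        double once = LayerWhite(alpha, 0.0);   // first layer over black
        double twice = LayerWhite(alpha, once); // second layer over the first result
        Console.WriteLine("{0:F4} {1:F4}", once, twice);
    }
}
```

Run with the fractional alpha of 0.25 / 15, the two results round to the 0.0167 and 0.0331 worked out by hand above.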

Now, for those of you paying attention, this actually isn't the shade we'd have at this point. I'm supposed to take the fractional alpha for the oldest text plus two of that fraction for the next oldest, which means at this point I should have three shades layered on one another, which would be a tad brighter black. But, actually, even that isn't true, and this is where we start getting to the heart of my problem. You do not get to the same color if you apply the alpha layer three times in succession compared to doing it once and then a second layer at twice the alpha. Let me explain.

If I continued the above calculations for one more layer at the same fractional alpha I would end with a final color of (0.0492, 0.0492, 0.0492). On the other hand, if I took my first layer, (0.0167, 0.0167, 0.0167), and layered on top of it a double fractional shade, (1, 1, 1, 0.0333), then I would end up with (0.0494, 0.0494, 0.0494).

Now, 0.0492 and 0.0494 are not that different, but if I did this for all 15 layers the difference increases. Here's the complete table:

         Shade Applied           Cumulative Total        Calculated Color
Layer    Single    Increasing    Single    Increasing    Single    Increasing
         Layers    Shades        Layers    Shades        Layers    Shades
1        0.0167    0.0167        0.0167    0.0167        0.0167    0.0167
2        0.0167    0.0333        0.0333    0.0500        0.0331    0.0494
3        0.0167    0.0500        0.0500    0.1000        0.0492    0.0970
4        0.0167    0.0667        0.0667    0.1667        0.0650    0.1572
5        0.0167    0.0833        0.0833    0.2500        0.0806    0.2274
6        0.0167                  0.1000                  0.0959
7        0.0167                  0.1167                  0.1110
8        0.0167                  0.1333                  0.1258
9        0.0167                  0.1500                  0.1404
10       0.0167                  0.1667                  0.1547
11       0.0167                  0.1833                  0.1688
12       0.0167                  0.2000                  0.1826
13       0.0167                  0.2167                  0.1963
14       0.0167                  0.2333                  0.2097
15       0.0167                  0.2500                  0.2228

This shows two methods of adding the layers. Single Layers means layering the same fractional alpha multiple times. Increasing Shades means doing that, but increasing the fraction for each layer. The interesting numbers are in the last row of each method (layer 15 for Single Layers, layer 5 for Increasing Shades). The Cumulative Total columns just show that in both methods the resulting simple-math total is what we're trying to get to, 0.25. The Calculated Color columns show what we actually end up with, which in both cases is less than our target: 0.2228 in the case of Single Layers and 0.2274 in the case of Increasing Shades.
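Both Calculated Color results can be reproduced with a short loop per method. This is a sketch over a single color channel on a black background; the names `LayeredAlpha`, `SingleLayers`, and `IncreasingShades` are chosen here to match the table headings and are not from the game's code.

```csharp
using System;

public static class LayeredAlpha
{
    // Blends white at `alpha` over `destination` (one channel, premultiplied source).
    private static double Over(double alpha, double destination)
        => alpha + (1.0 - alpha) * destination;

    // Single Layers: the same fractional alpha applied `layers` times over black.
    public static double SingleLayers(double fractional, int layers)
    {
        double color = 0.0;
        for (int n = 1; n <= layers; n++)
            color = Over(fractional, color);
        return color;
    }

    // Increasing Shades: layer n is applied with alpha n * fractional.
    public static double IncreasingShades(double fractional, int layers)
    {
        double color = 0.0;
        for (int n = 1; n <= layers; n++)
            color = Over(n * fractional, color);
        return color;
    }

    public static void Main()
    {
        double fractional = 0.25 / 15.0;
        Console.WriteLine("Single Layers x15:    {0:F4}", SingleLayers(fractional, 15));
        Console.WriteLine("Increasing Shades x5: {0:F4}", IncreasingShades(fractional, 5));
    }
}
```

Both fall short of 0.25 for the same reason: every layer only covers a fraction of whatever background is still showing through.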

The reason both numbers don't add up to what we want is that each successive layer causes less of a change than the previous one. This is a case of diminishing returns. Because of this we need to do some math to calculate the fractional shade such that it takes into account these diminishing returns.

         Calculated Color
Layer    Single Layers    Increasing Shades
1        0.0667           0.0667
2        0.1289           0.1911
3        0.1870           0.3529
4        0.2412           0.5255
5        0.2918           0.6836
6        0.3390
7        0.3830
8        0.4242
9        0.4626
10       0.4984
11       0.5318
12       0.5630
13       0.5922
14       0.6194
15       0.6447

Warning! Math ahead!

I'm going to shift our target to 1 instead of 0.25. This will make the diminishing returns problem more clear. Doing this, and following the same flawed layering strategy described above, the Calculated Color values become those shown in the table just above. You can now see the totals are quite far from our target of 1. In fact, because of diminishing returns, no matter how many layers we add we'll never actually get to 100% opaque. If I extended the above to 100 layers the Single Layers color would be 0.999.

So, what I needed to do was to determine a different fractional alpha to use when layering that would add up to my desired target. To do this I converted my Single Layers iterative algorithm into a formula. First, the iterative version of this is below, where fractional is the fraction of my intended alpha, or \( \frac{1}{15} = 0.067 \).

$$ f(n) = (1 - fractional) \times f(n-1) + fractional $$

This calculates the final color for n where n is the number of layers. To convert this to a formula that does not reference itself I started by listing the successive values of this formula in Google Spreadsheets, just like the above tables, and then I played with it until I came up with the following equivalent formula.

$$ color = 1 - (1 - fractional)^{layers} $$
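For anyone who would rather not take the spreadsheet's word for it, the equivalence can also be checked by hand. The only assumption is that the background starts black, so \( f(0) = 0 \). Subtracting both sides of the iterative formula from 1 gives:

$$ 1 - f(n) = (1 - fractional) - (1 - fractional) \times f(n-1) = (1 - fractional) \times (1 - f(n-1)) $$

So each layer multiplies the remaining gap to full opacity by \( (1 - fractional) \), and after \( n \) layers:

$$ 1 - f(n) = (1 - fractional)^{n} \times (1 - f(0)) = (1 - fractional)^{n} $$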

We can check this by plugging in our fractional alpha of \( \frac{1}{15} = 0.067 \) and our 15 layers as \( 1 - (1 - 0.067)^{15} = 0.645 \), which gives us the same value. So far so good, but what we want is a formula into which we supply the color and the number of layers and it gives us the fractional alpha. That means we want to get the fractional part of the formula on one side by itself. Time to remember how to do formula manipulation. First, get the exponent expression on a side by itself.

$$ 1 - color = (1 - fractional)^{layers} $$

Next, reverse the exponential equation.

$$ (1 - color)^{ \frac{1}{layers} } = 1 - fractional $$

Finally, get fractional by itself and positive.

$$ fractional = 1 - (1 - color)^{ \frac{1}{layers} } $$

Great, now we just plug in our target color, which was fully opaque white or 1, and keeping the layers as 15, we get...oops...hmm, \( 1 - 1 = 0 \) and \( 0^{ \frac{1}{15} } = 0 \), and \( 1 - 0 = 1 \), so the fractional is 1. Ack, we have a problem. This actually makes sense. Remember when I said that with the diminishing returns problem we can add the fractional alpha as many times as we want and never actually get to 100% opaque? Well, this is telling us the same thing. The only way to reach 100% opaque is for the fractional alpha itself to be 1, but then we don't get the successive alpha blending effect that we want.

OK, time to cheat. Although I want the final alpha to be 1, I'll settle for it being a single shade off of 1. Since there are 256 possible alpha shades, I'll settle for \( \frac{255}{256} = 0.996 \). Plugging that into the formula I get \( 1 - (1 - 0.996)^{ \frac{1}{15} } = 0.308 \). If I layer that 15 times I get this succession of colors: 0.308 → 0.521 → 0.669 → 0.771 → 0.841 → 0.890 → 0.924 → 0.947 → 0.964 → 0.975 → 0.983 → 0.988 → 0.992 → 0.994 → 0.996.
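The two formulas translate directly into code. The sketch below is an illustration only; `SingleLayerAlpha`, `Color`, and `Fractional` are hypothetical names rather than the ones used in the game.

```csharp
using System;

public static class SingleLayerAlpha
{
    // color = 1 - (1 - fractional)^layers
    public static double Color(double fractional, int layers)
        => 1.0 - Math.Pow(1.0 - fractional, layers);

    // fractional = 1 - (1 - color)^(1 / layers)
    public static double Fractional(double color, int layers)
        => 1.0 - Math.Pow(1.0 - color, 1.0 / layers);

    public static void Main()
    {
        double fractional = Fractional(0.996, 15);
        Console.WriteLine("fractional = {0:F3}", fractional);

        // Print the color reached after each successive layer.
        for (int n = 1; n <= 15; n++)
            Console.WriteLine("{0,2}: {1:F3}", n, Color(fractional, n));
    }
}
```

Passing a target of exactly 1 returns a fractional alpha of 1, which is the degenerate case described above.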

This works and I've implemented this in code now, but it's not actually exactly what I want. This is the Single Layers algorithm, where really I want the Increasing Shades algorithm. The formula for that is more complex, though, because layer \( n \) is applied with an alpha of \( n \times fractional \):

$$ f(n) = \left[ \left(1 - n \times fractional \right) \times f(n-1) \right] + n \times fractional $$

I've tried to convert this into a single formula that does not reference \( f(n-1) \), but have not been successful yet. For now I'll either stick with the previous formula or use the one above and calculate it recursively.
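One possible way around the missing closed form is a numeric search. If layer n is applied with alpha n × fractional, as in the Shade Applied column of the earlier table, the layers unroll into a product, color = 1 − (1 − 1 × fractional) × (1 − 2 × fractional) × ... × (1 − N × fractional), which has no tidy inverse but is easy to search. The sketch below is only an illustration under that assumption; `IncreasingShadesAlpha`, `Color`, `Fractional`, and the use of bisection are not from the original code.

```csharp
using System;

public static class IncreasingShadesAlpha
{
    // Color after layering alphas 1*f, 2*f, ..., layers*f over black.
    public static double Color(double fractional, int layers)
    {
        double remaining = 1.0; // share of the background still showing through
        for (int n = 1; n <= layers; n++)
            remaining *= 1.0 - n * fractional;
        return 1.0 - remaining;
    }

    // Finds the fractional alpha that yields targetColor (between 0 and 1), by bisection.
    public static double Fractional(double targetColor, int layers)
    {
        double low = 0.0;
        double high = 1.0 / layers; // beyond this the newest layer's alpha would exceed 1
        for (int i = 0; i < 60; i++)
        {
            double mid = (low + high) / 2.0;
            if (Color(mid, layers) < targetColor)
                low = mid;
            else
                high = mid;
        }
        return (low + high) / 2.0;
    }

    public static void Main()
    {
        double fractional = Fractional(0.25, 5);
        Console.WriteLine("fractional = {0:F4}, color = {1:F4}", fractional, Color(fractional, 5));
    }
}
```

Bisection works here because, within that range, a larger fractional alpha always produces a more opaque result.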

Next: Who stole my pixels?


A Tutorial; Step-by-step

I've been planning to add a tutorial to my game from the beginning, and finally got around to really working on it a few weeks ago. Similar to my previous post about simple solutions, I found that once I tinkered with ideas enough and came up with a clear idea of how to implement it, my tutorial code just fell into place quite smoothly and fit satisfyingly into the existing architecture.

Making a point

I decided that the best way to communicate how to play the game was to have an image of a finger moving on the screen indicating touch actions. You can see an example of this in the image to the right. I created the hand image by taking a photo of a hand (my eldest daughter's) and then manipulating it a bit to flatten the colors and such. In the tutorial it is opaque when touching the screen (the left image), or mostly transparent when not touching the screen (the right image).

After creating the finger I needed a way to move it around the screen easily and also initiate the touch events. For this I created a Finger class that extends Symbol (see my previous post for a discussion on this) and holds the image of the finger. This class also has new animation functionality mostly implemented in the methods shown below.

BeginAnimation is called once for each single animation step (e.g., moving from point A to point B with the finger touching the screen or not, as indicated). This animation is then handled as part of the normal Animate method, which is called once for each Widget during the main animation loop, by calling DoFingerAnimation. As you can see it mostly just updates the finger's position and, once complete, calls the _fingerCompletionAction.

public void BeginAnimation(Point start, Point finish, bool isTouchingScreen, TimeSpan duration, Func<bool> completionChecker, Action completionAction)
{
    _fingerMoveStartTime = DateTime.Now;
    _fingerMoveDuration = duration;
    AnimationStartPosition = start;
    AnimationFinishPosition = finish;
    IsTouchingScreen = isTouchingScreen;
    _fingerCompletionChecker = completionChecker;
    _fingerCompletionAction = completionAction;
}

private void DoFingerAnimation()
{
    TimeSpan elapsed = DateTime.Now - _fingerMoveStartTime;
    if(elapsed < _fingerMoveDuration)
    {
        Position = Animator.Interpolate(/* arguments cut off in this listing */);
        AnimationCurrentPosition = Position;
    }
    else if(IsBeingAnimated && _fingerCompletionChecker())
    {
        _fingerMoveStartTime = DateTime.MinValue;
        _fingerCompletionAction(); // once complete, call the completion action as described above
    }
}

Stepping it up

So, now that I can perform a single step of an animation controlling the finger, I need to string these together into multiple steps that show a complete tutorial lesson. To do this I created a couple of data holders within my Tutorial class. The first, Step, represents a single step of the tutorial and performs a single finger animation movement. The second, Lesson, holds all of the data for a single tutorial lesson including the game elements to show on the screen and the sequence of steps.

One thing to note, there is a slightly confusing use of the term "completion checker" here, since it is used twice. It basically serves the same purpose for two different levels of the lesson. Inside Step it is used to determine if that step should end. Of course the step has a set duration, but even after that duration there can be other conditions that must be met (see the actual lesson examples later). Similarly, in Lesson this is used to determine if the lesson is complete.

private struct Step
{
    public Step(double destinationX, double destinationY, bool touchScreen, double duration, Func<Tutorial, bool> completionChecker)
    {
        Finish = new Point((float)destinationX, (float)destinationY);
        TouchScreen = touchScreen;
        Duration = TimeSpan.FromSeconds(duration);
        CompletionChecker = completionChecker ?? (foo => { return true; });
    }

    public Point Finish;
    public bool TouchScreen;
    public TimeSpan Duration;
    public Func<Tutorial, bool> CompletionChecker;
}

private struct Lesson
{
    public int Width;
    public int Height;
    public IEnumerable<Point> Goals;
    public IEnumerable<ShipInfo> Ships;
    public IEnumerable<Tuple<int, int, WallLocation>> Walls;
    public Func<Tutorial, bool> CompletionChecker;
    public IEnumerable<Step> Steps;
}

A lesson plan

Fortunately I was able to use a combination of the yield return technique for enumerations and C#'s object initializers to compactly define individual lessons. I do this statically and populate an array to hold them all.

private static Lesson[] All = Tutorials.ToArray();

private static IEnumerable<Lesson> Tutorials
{
    get
    {
        int width = 4;
        int height = 6;

        // Create one simple diagonal.
        yield return new Lesson
        {
            Width = width,
            Height = height,
            Goals = new Point[] { new Point(width - 2, height - 2) },
            Ships = new ShipInfo[] { new ShipInfo(ShipType.Hero, new Point(0.5f, 0.5f), Direction.East, 0.025f) },
            Walls = new Tuple<int, int, WallLocation>[0],
            CompletionChecker = tutorial => { return tutorial.AchievedAllGoals; },
            Steps = new Step[] {
                new Step(width * 0.6, height * 0.4, false, 3, null),
                new Step(width - 2, 0, false, 2.5, null),
                new Step(width - 1, 1, true, 1.5, tutorial => { return tutorial.Ships.First().Position.X < width - 2.5; }),
                new Step(width - 0.5, 1.5, false, 1.5, null)
            }
        };

        // (more lessons follow)
    }
}

Pulling apart this first Lesson, the interesting part is the 3rd step that has the non-null completion check. This check ensures that the Ship is far enough to the left before taking the finger off of the screen, and therefore completing the diagonal. Without doing this the ship could end up on the wrong side of the diagonal and not bounce to where it is supposed to.

There are a number of interim lessons I'm not including here, but one interesting one (shown below) is the lesson showing how to pause the game, which is done by swiping all the way across the screen horizontally in either direction. The interesting part here is that I needed to show the pause symbol, and then the continue symbol. To do this I cheated a little in two ways. First, in the step before I want the appropriate symbol to be visible, I used the completion check to create the needed symbol, although its alpha was initially set to 100% transparent. This is done via the CreatePauseSymbol and CreateContinueSymbol methods, not shown here. Second, also not shown here, I adjust the transparency of the pause symbol to become more opaque as the finger completes its animation. This was a little hackish, but worked out pretty well.

        // How to pause.
        yield return new Lesson
        {
            Width = width,
            Height = height,
            Goals = new Point[] { new Point(width - 2, height - 2) },
            Ships = new ShipInfo[] { new ShipInfo(ShipType.Hero, new Point(0.5f, 0.5f), Direction.South, 0.045f) },
            Walls = new Tuple<int, int, WallLocation>[0],
            CompletionChecker = tutorial => { return true; },
            Steps = new Step[] {
                new Step(width * 0.6, height * 0.8, false, 2, null),
                new Step(0, height * 0.45, false, 1, tutorial => tutorial.CreatePauseSymbol()),
                new Step(width, height * 0.55, true, 3, tutorial => tutorial.CreateContinueSymbol()),
                new Step(width - 1.5f, height - 1.5f, false, 1.5, null),
                new Step(width * 0.5, height * 0.5, false, 1.5, null),
                new Step(width * 0.5, height * 0.5, true, 0.05, tutorial => tutorial.RemoveContinueSymbol()),
                new Step(width * 0.65, height * 0.65, false, 0.5, null),
                new Step(0, height - 2, false, 0.75, null),
                new Step(1, height - 1, true, 0.75, tutorial => { return tutorial.Ships.First().Position.Y < height - 2; }),
                new Step(width * 0.75, height * 0.55, false, 1, null),
            }
        };

Linking them together

Finally, now that the lessons are defined, I need to do two more things: make the finger's touch actually behave like a normal touch in a normal game, and queue up the steps so they play in order. The first was pretty easy: I just call the existing touch handlers with the appropriate touch points. The second also turned out particularly well because I used the built-in onComplete action in the finger animations to recursively call a method that dequeues each successive step.

private void DoTutorialSteps(Queue<Step> queue)
{
    if(queue.Count > 0)
    {
        Step step = queue.Dequeue();
        Action onComplete = step.TouchScreen
            ? new Action(() =>
                {
                    HandleTouch(BoardToScreen(_finger.AnimationStartPosition), BoardToScreen(_finger.AnimationCurrentPosition), true);
                    base.HandleBoardTouch(_finger.AnimationStartPosition, _finger.AnimationCurrentPosition, true);
                })
            : new Action(() => { DoTutorialSteps(queue); });
        _finger.BeginAnimation(_finger.Position, step.Finish, step.TouchScreen, step.Duration, () => step.CompletionChecker(this), onComplete);
    }
}

I'm now almost done with the tutorial and have only one more lesson to add.

Next time: Alpha layering.


Simple Solutions

Unlike the in-depth investigation and learning required to understand premultiplied alpha, as discussed in my previous post, today's topic is simple and satisfying.

Collections of collections

I've structured the graphical elements of my game into a number of logical layers (not display layers). I won't go into all of them, but to give you an idea, here are some of the key parts, from the most fundamental to the most complex:

  1. The most basic layer is a Texture, which is an abstract class that represents a picture that is loaded from a file.
  2. Above that is an OpenGLES20Texture, which is an implementation of Texture specifically for OpenGL ES 2.0.
  3. Above that is a Graphic, which includes information about the Texture, plus information about its orientation, size, and color.
  4. The next layer up is Widget, which is an abstract class that has a 2D position, an abstract Animate method, and an abstract Render method.
  5. One layer above that is Symbol, which derives from Widget and provides a number of concrete features, including references to one or more Graphic instances. I use this class for all of the interactive controls in the game, like the checkmark to start a game, the question mark icon for starting the tutorial, etc.
  6. Also one layer above Widget are all of the game UI elements, like Ship, which is the white ball that moves around and bounces off of walls. This also derives from Widget and contains the logic to make the ship do what it is supposed to. I have similar classes for the other game UI elements like Wall, Diagonal, etc.

Given the above structure, all graphical elements that I need to render inherit from Widget. In the various game screens I keep track of these in collections that implement the following interface:

public interface IWidgetEnumeration
{
    /// <summary>Returns the <see cref="Widget"/>s in the collection.</summary>
    IEnumerable<Widget> Widgets { get; }
}

For example, the game screen parent class gathers these collections via the following property:

protected sealed override IEnumerable<IWidgetEnumeration> WidgetsToRender
{
    get
    {
        foreach(IWidgetEnumeration collection in LowerWidgetsToRender)
            yield return collection;
        yield return WallsHorizontal;
        yield return WallsVertical;
        yield return Goals;
        yield return Diagonals;
        yield return Poles;
        yield return _diagonalMarkers;
        yield return Ships;
        yield return Collisions;
        foreach(IWidgetEnumeration collection in UpperWidgetsToRender)
            yield return collection;
        yield return _pauseSymbols;
    }
}

What's a little interesting here is that this accessor is an IEnumerable<IWidgetEnumeration>, or an enumeration of collections. This allows subclasses to override LowerWidgetsToRender and UpperWidgetsToRender to add additional widgets as necessary. What's been slightly annoying to me for a while, and what finally gets to the point of this blog entry, is that there have been a number of instances when I only needed to add a single new graphical element in a subclass. But since I need to return an IWidgetEnumeration, I kept needing to create a collection to contain that single graphical element. This made the override of LowerWidgetsToRender look something like this:

protected override IEnumerable<IWidgetEnumeration> LowerWidgetsToRender
{
    get { yield return _lowerWidgets; }
}

Here _lowerWidgets is an IWidgetEnumeration that contains just one Widget. I couldn't just return the Widget directly, because I must return an IWidgetEnumeration. It seems inefficient to create a collection just to contain one element. But wait, IWidgetEnumeration is an interface. What's to stop me from implementing it on Widget so I can return the Widget itself? Well, that's exactly what I did. I made Widget implement IWidgetEnumeration and added the following simple bit of code to it.

public IEnumerable<Widget> Widgets
{
    get { yield return this; }
}

This allowed me to change the above LowerWidgetsToRender accessor into the following, where _tutorialCounter is the single Widget that was inside the previous collection.

protected override IEnumerable<IWidgetEnumeration> LowerWidgetsToRender
{
    get { yield return _tutorialCounter; }
}

I don't think this refactor improves the performance of the code in any measurable way, but it does make things a bit more straightforward and easier to understand. It's obviously not particularly clever or exciting, but I was happy when I thought of the solution and do feel the code is better because of it.
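
To show the idea in isolation, here is a minimal sketch of a widget acting as its own one-element collection. Only IWidgetEnumeration and the Widgets accessor come from the game code described in this post; the Name property, the Label and WidgetList classes, and the RenderOrder.Flatten helper are hypothetical stand-ins made up for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface IWidgetEnumeration
{
    IEnumerable<Widget> Widgets { get; }
}

// Stand-in for the real Widget class; Name is a hypothetical property
// used only so the sketch has something to print or compare.
public abstract class Widget : IWidgetEnumeration
{
    public string Name { get; private set; }

    protected Widget(string name) { Name = name; }

    // A single widget is a collection of exactly one widget: itself.
    public IEnumerable<Widget> Widgets
    {
        get { yield return this; }
    }
}

// Hypothetical concrete widget.
public sealed class Label : Widget
{
    public Label(string name) : base(name) { }
}

// Hypothetical multi-element collection, like the walls or ships.
public sealed class WidgetList : IWidgetEnumeration
{
    private readonly List<Widget> _items = new List<Widget>();

    public void Add(Widget widget) { _items.Add(widget); }

    public IEnumerable<Widget> Widgets
    {
        get { return _items; }
    }
}

public static class RenderOrder
{
    // Flattens an enumeration of collections into widget names, in order.
    public static List<string> Flatten(IEnumerable<IWidgetEnumeration> collections)
    {
        return collections.SelectMany(c => c.Widgets).Select(w => w.Name).ToList();
    }
}
```

With this in place, a single Label and a WidgetList can sit side by side in the same IWidgetEnumeration array, and Flatten walks both without needing a wrapper collection for the lone widget.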

Next: A Tutorial; Step-by-step.


Premultiplied alpha

Sorry I've been gone

It's been a while since my last post about OpenGL and OpenTK Fun. Resolving the challenge I described in that article unblocked real progress on my development, so I've been putting much more effort into my game since then and am now much further along. That's all good news, but now I need to both make progress on the game and try to keep this blog going too.

I'll start today with a discussion about OpenGL premultiplied alpha textures, but before that I want to send a quick thank you out to a new code syntax highlighter by Alex Gorbatchev. It's JavaScript based, so I no longer need to manipulate my code entries before posting them in these articles. Also a thank you to Carter Cole and his blog entry describing how to easily set up the syntax highlighting in Blogger.

So, what is "premultiplied alpha" and why do I care?

As for what it is, there are plenty of pages out there that can describe it much better than I can, so I'm not even going to try. I found a good succinct description on Martin Stone's blog, which points to a more detailed description on Tom Forsyth's blog. Please check those pages out if you want to learn more.

As for why I care, that I can describe in more detail. Initially I knew almost nothing about premultiplied alpha. I had seen the term a few times, for example when compiling my app there is some log message mentioning that it is precompiling my PNG textures, but I never tried to understand that more since everything was working fine. There are a couple of reasons I started to care more and look into it.

First, there was always something bothering me about a portion of my OpenGL code. From the beginning when I got the code working on both iOS and in Windows I found that I had to have a platform specific check for texture blending:

#if MONOTOUCH
if(colorAdjustment.Alpha != 1) // on iPhone this is needed to apply the color adjustment (on PC this is our normal setting)
#endif
    GL.BlendFunc(BlendingFactorSrc.SrcAlpha, BlendingFactorDest.OneMinusSrcAlpha);
#if MONOTOUCH
else
    GL.BlendFunc(BlendingFactorSrc.One, BlendingFactorDest.OneMinusSrcAlpha);
#endif

I put this code into place after trial and error. The #if MONOTOUCH section only compiles and runs on the MonoTouch framework, which means only on iOS. What I didn't understand was, given that OpenGL is supposed to be a consistent interface across platforms, why did I need to have this condition depending on the platform? All other code and image assets related to OpenGL textures and blending were the same between the two platforms, so why was this needed?

Well, the answer goes back to what I mentioned above about where I had previously heard about premultiplied alpha; the iOS compilation log message. What that message means is that my assumption that I'm using the same image assets (PNG images in my case) is not true. Although I have the same PNGs referenced in both the iOS project and Windows project, when the iOS version gets built the PNGs are adjusted to have alpha premultiplied.

So, why does that require the adjustment in GL.BlendFunc? Well, first we need to know what GL.BlendFunc does. The first parameter is how to use the incoming (source) colors when blending them with the existing (destination) pixels. The second parameter is how to adjust those destination pixels. There is ample documentation about this on many sites, so I won't go into all of the parameter options, but I will discuss the two that I was using. The first version, GL.BlendFunc(BlendingFactorSrc.SrcAlpha, BlendingFactorDest.OneMinusSrcAlpha); says two things:

  1. For the source color, take the alpha (or transparency) component and multiply it by each other component. For example, if the RGB color is (1, 0.8, 0.5) with an alpha of 0.5, then after the multiplication it would become (0.5, 0.4, 0.25). This is the same color, but darkened.
  2. For the destination color, take the alpha component of the source, subtract that from one, and multiply that by each component of the destination. In this case the alpha is 0.5, so subtracting that from 1 is also 0.5, which would be multiplied by the destination color. For example, if the existing destination RGB color was (0.2, 1, 1) then after multiplying it would become (0.1, 0.5, 0.5). Again, the same color, but darkened.

After completing the above calculations, the source and destination colors are blended by adding them together. That is (0.5, 0.4, 0.25) + (0.1, 0.5, 0.5) = (0.6, 0.9, 0.75). Looking at the two original colors you can see that this works and the resulting blended color is correct.

Ok then, what's up with the second version: GL.BlendFunc(BlendingFactorSrc.One, BlendingFactorDest.OneMinusSrcAlpha);? How does this change things? I won't go through all the steps in detail, but starting with the same two colors you would get a final calculation of (1, 0.8, 0.5) + (0.1, 0.5, 0.5) = (1.1, 1.3, 1.0), which ends up as pure white, since every component is at or above the maximum value of one and is clamped to one. Clearly that is too bright and doesn't work. So, why would you want to do that? Well, that's where premultiplied alpha comes in. The first version used BlendingFactorSrc.SrcAlpha, which multiplies the alpha by the color. And what do you think premultiplied alpha does? It did that exact same calculation when the texture asset was created (or built, in this case). This means that we don't need to do it again now while blending. Instead we use the color as is, which is what BlendingFactorSrc.One does.
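
The arithmetic above can be checked with a small sketch. This is plain C# that mimics the two blend settings on the CPU; it does not call OpenGL, and the BlendDemo class with its method and parameter names is made up for this illustration.

```csharp
using System;

public static class BlendDemo
{
    // One channel of the blend: src * srcFactor + dst * dstFactor, clamped to [0, 1].
    public static float BlendChannel(float src, float srcFactor, float dst, float dstFactor)
    {
        float result = src * srcFactor + dst * dstFactor;
        return Math.Min(1f, Math.Max(0f, result));
    }

    // useOneAsSourceFactor == false mimics (SrcAlpha, OneMinusSrcAlpha);
    // useOneAsSourceFactor == true mimics (One, OneMinusSrcAlpha).
    public static float[] Blend(float[] srcRgb, float srcAlpha, float[] dstRgb, bool useOneAsSourceFactor)
    {
        float srcFactor = useOneAsSourceFactor ? 1f : srcAlpha;
        float dstFactor = 1f - srcAlpha;
        var result = new float[3];
        for (int i = 0; i < 3; i++)
            result[i] = BlendChannel(srcRgb[i], srcFactor, dstRgb[i], dstFactor);
        return result;
    }

    // What the iOS build step does to the texture ahead of time.
    public static float[] Premultiply(float[] rgb, float alpha)
    {
        return new[] { rgb[0] * alpha, rgb[1] * alpha, rgb[2] * alpha };
    }
}
```

Using the colors from the text, Blend with a straight-alpha source of (1, 0.8, 0.5) and the SrcAlpha factor gives approximately (0.6, 0.9, 0.75). Passing the same source through Premultiply first and then blending with the One factor gives the same result, while using the One factor on the source that was not premultiplied clamps to (1, 1, 1).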

So, final question, why do this premultiplied alpha in the first place? I'll quote Tom Forsyth from his blog post (referenced above) for a pretty good explanation. For a much more in-depth discussion please read his whole post.

"Normal" alpha-blending munges together two physically separate effects - the amount of light this layer of rendering lets through from behind, and the amount of light this layer adds to the image. Instead, it keeps the two related - you can't have a surface that adds a bunch of light to a scene without it also masking off what is behind it. Physically, this makes no sense - just because you add a bunch of photons into a scene, doesn't mean all the other photons are blocked. Premultiplied alpha fixes this by keeping the two concepts separate - the blocking is done by the alpha channel, the addition of light is done by the colour channel. This is not just a neat concept, it's really useful in practice.

Doing premultiplied alpha on Windows

Ok, so now I understand what's going on, but how do I fix it for Windows since I don't have a build step modifying my PNG files? I did some research and it seems there are some tools that will convert a normal PNG to a premultiplied alpha PNG. Specifically, it seems GIMP will do this, although I didn't try it myself. I didn't really want to have to convert my existing assets each time I modified them, or complicate my Windows build process by using additional tools, so I made a code change to do the alpha multiplication at the time I load my PNGs. That's a somewhat wasteful operation, but it only happens once, and considering I don't have very many textures I felt it was a good and simple solution.

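Below is the code that does this. You'll notice some strange BGRA to RGBA conversion as well. This is because the Bitmap class hands back each 32-bit pixel with its bytes in blue, green, red, alpha order, which is how Windows stores them, so I swap them into red, green, blue, alpha order as I go.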

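// Requires: using System; using System.Drawing; using System.Drawing.Imaging; using System.Runtime.InteropServices;
public static Texture LoadTexture(string fileName, Func<byte[], int, int, Texture> textureCreator)
{
    if(textureCreator == null)
        throw new ArgumentNullException("textureCreator");

    using(Bitmap image = new Bitmap(fileName))
    {
        // The lock mode and pixel format are assumed here; the loop below
        // expects 4 bytes per pixel stored in B, G, R, A order.
        BitmapData bitmapData = image.LockBits(
            new Rectangle(0, 0, image.Width, image.Height),
            ImageLockMode.ReadOnly,
            PixelFormat.Format32bppArgb);
        IntPtr rawData = bitmapData.Scan0;

        Texture texture;
        try
        {
            int length = image.Width * image.Height * 4;
            byte[] data = new byte[length];
            Marshal.Copy(rawData, data, 0, length);

            for(int i = 0; i < length; i += 4)
            {
                // Read the pixel as B, G, R, A and write it back as premultiplied R, G, B, A.
                float alpha = (float)data[i + 3] / 255;
                byte r = data[i + 2];
                byte g = data[i + 1];
                byte b = data[i + 0];
                data[i + 0] = MultiplyByAlpha(alpha, r);
                data[i + 1] = MultiplyByAlpha(alpha, g);
                data[i + 2] = MultiplyByAlpha(alpha, b);
                // data[i + 3] already holds the alpha byte, so it is left untouched
                // (converting it to float and back could lose a step to truncation).
            }

            texture = textureCreator(data, image.Width, image.Height);
        }
        finally
        {
            // Always release the locked bits, even if texture creation throws.
            image.UnlockBits(bitmapData);
        }

        return texture;
    }
}

private static byte MultiplyByAlpha(float alpha, byte channel)
{
    return (byte)(channel * alpha);
}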
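
To see what the pixel loop in LoadTexture does to a single pixel, here is a minimal standalone sketch that works on a raw byte array, so it needs neither Bitmap nor OpenGL. The PixelDemo class and its method names are made up for this illustration, and it uses integer math with rounding rather than the float multiplication above, so individual results can differ by one from the code in this post.

```csharp
public static class PixelDemo
{
    // Converts one straight-alpha B, G, R, A pixel at the given offset
    // into a premultiplied R, G, B, A pixel, in place.
    public static void PremultiplyBgraToRgba(byte[] data, int offset)
    {
        byte a = data[offset + 3];
        byte r = data[offset + 2];
        byte g = data[offset + 1];
        byte b = data[offset + 0];
        data[offset + 0] = Scale(r, a);
        data[offset + 1] = Scale(g, a);
        data[offset + 2] = Scale(b, a);
        // The alpha byte at offset + 3 is left unchanged.
    }

    // Computes channel * alpha / 255 in integer math, rounded to nearest.
    public static byte Scale(byte channel, byte alpha)
    {
        return (byte)((channel * alpha + 127) / 255);
    }
}
```

For example, a half-transparent pixel stored as the bytes { 128, 204, 255, 128 } (blue, green, red, alpha) becomes { 128, 102, 64, 128 } (red, green, blue, alpha): each color channel is roughly halved and the red and blue bytes swap places.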

After putting this code in place, so that my textures turn into premultiplied alpha data as I load them, I was able to simplify my original platform dependent code into just this:

GL.BlendFunc(colorAdjustment.Alpha != 1 ? BlendingFactorSrc.SrcAlpha : BlendingFactorSrc.One, BlendingFactorDest.OneMinusSrcAlpha);

Removing that platform specific code is a very minor improvement, but more than that I'm happy to understand all of this quite a bit more.

Next: Simple Solutions.