Applied group theory
Imagine a simple image editor program that runs under strict resource constraints – on a mobile phone. (A friend of mine has been writing one). The user edits the picture in the preview mode, and after they are done, they click “apply”, and the program applies all the edits to the actual full-resolution image. The same approach is used in MS Photo Gallery (even though the Gallery is not resource-constrained, but rather resource-demanding).
Among other things, the app allows to perform the usual set of rotate/flip transforms. Namely, rotate by 90, 180, 270 degrees, flip horizontally and flip vertically. Obviously, if the user has performed a series of such transforms, you don’t want to apply them one-by-one. If they’ve made two 90 degree rotations, you want to apply them as one 180 degree rotation, and if they’ve done 2 horizontal flips you don’t want to do anything. So the problem is clear: how to simplify a series of rotations and flips? It’s obvious how to simplify rotations: you simply add the degrees modulo 360. Flips are even easier. But if you mix them, it becomes harder. Mirrors don’t commute with rotations. E.g. a 90 degree rotation followed by a vertical flip is not the same as a vertical flip followed by a 90 degree rotation.
My first thought was “No matter which group this is, we can represent these transforms by matrices”. Those would be really nice matrices that contain only 0s, 1s and –1s. You multiply them and obtain the resulting transformation. There are two problems with this. First, because of resource constrains, the program does not perform these transforms as coordinate transforms. It works by pushing arrays of bytes, or something like that. So knowing the matrix is not enough. We would need to know how to decompose that matrix into well-known fast transforms, namely rotations by multiples of Pi/2 and vertical/horizontal mirrors. And second, we don’t really need matrices to solve this problem.
Let’s see how many different transforms we can possibly get. The X axis can point in any of the 4 directions. For each position of the X axis, there are exactly 2 positions of the Y axis. Which gives us the total of 8 transformations. Yes, it’s a group of 8 elements, which is known as the group of symmetries of the square or D4. In addition to the 6 elements that I already mentioned, it also contains 2 diagonal flips. Those flips can be represented as a 90/270 degree rotate followed by a vertical flip. So all we need is to store the multiplication table for this group.
As a matter of fact, performance wise, the matrix approach is not worse that the multiplication table approach. I don’t really care to recommend one over the other in this particular case. What’s interesting is that this problem can be viewed as finding a representation of the group that has the least computational complexity, in some sense. A multiplication table of order n can be stored in n^2*log(n) bits. And that will allow to perform multiplication in O(1) time. In the particular case of the D4 group the table would require 192 bits. And the matrix representation would require to store 8 matrices each of which can be encoded in 8 bits. And multiplication will require some arithmetic (or bitwise manipulation), versus simple lookup.
I’m pretty sure that the problem of finding the “least complex” representation of the group has been studied, but I’ve never read anything on the subject.