# Drawing Blanks

Premature Optimization is a Prerequisite for Success

## Russian elections and the social conformity model

UPDATE: I found a bug in my calculations which resulted in discarded data points. I don’t know how big the impact is (I’d guess that the “Coleman Factors” below are inflated). Don’t take the graphs below at face value. I’ll update as soon as I can. Bugs have been fixed.

UPDATE: I’ve posted a more rigorous version of this here. Everything in this post still holds though.

I mentioned Stephen Coleman’s social conformity model in the previous post (http://mpra.ub.uni-muenchen.de/14304/). Apparently, prior Russian elections were not the only tests of that theory: I found another paper by Coleman in which he tests it on many other elections (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.4615). Correlations between party support and turnout have evidently been observed more than once or twice in honest (e.g. U.S.) elections. The coolest example is the US Presidential Election of 1916: it shows that below the maximum turnout the correlation changes sign. The same effect can be seen, although it is not obvious, in the 2011 Russian elections too, where for some reason many interpreted it as “another indication of fraud”.

Coleman’s tests consider only larger entities (such as states and districts), not individual precincts. I quickly tried applying his approach to the 2011 Russian data at the precinct level, separately for each of the three categories (C1, C2 and C3) into which I had divided all precincts.

Please note that the graphs below are not real fits: I picked the factors by hand. I don’t think that matters too much, though. Behold “Party Entropies vs. Turnout”, which according to Coleman can be modeled by the Turnout Entropy curve.

C3 (larger precincts). “Coleman Factor” CF=2.25 (I’ll explain later what it’s supposed to mean)

C2 (smaller precincts). CF=2.

C1 (ethnic outskirts). CF=2. This data, which otherwise looks terrible, fits the conformity model rather well.
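For anyone who wants to play along, here is a minimal sketch of the two quantities being compared. It assumes that “party entropy” means the Shannon entropy (in bits) of a precinct’s party vote shares, and that “turnout entropy” is the binary entropy of the voted / did-not-vote split. The function names and the sample numbers are made up for illustration; this is not the code behind the graphs.

```python
import math


def shannon_entropy_bits(shares):
    """Shannon entropy, in bits, of a list of shares that sum to 1."""
    # Zero shares contribute nothing, so they are skipped to avoid log2(0).
    return -sum(p * math.log2(p) for p in shares if p > 0)


def turnout_entropy_bits(turnout):
    """Binary entropy, in bits, of the voted / did-not-vote split."""
    return shannon_entropy_bits([turnout, 1.0 - turnout])


# Hypothetical precinct: 60% turnout and four parties with made-up shares.
party_shares = [0.5, 0.2, 0.2, 0.1]
print("party entropy:  ", shannon_entropy_bits(party_shares))
print("turnout entropy:", turnout_entropy_bits(0.6))
```

The binary entropy is symmetric around 50% turnout and peaks there at exactly 1 bit, so any curve built from it rises and then falls as turnout grows.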

Now, what does the CF mean? Coleman interprets it as the information content of the choice, i.e. the binary logarithm of the number of parties that people were really choosing from. The value of 2 in C1 goes completely against my intuition: I’d say people there were choosing from 2 parties at most. But I might have miscalculated the entropies or misinterpreted Coleman’s theory.
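Taking that interpretation literally, a factor converts back into a number of parties as 2 raised to the power CF. A quick sketch using the hand-picked factors from the graphs (the helper name `effective_choices` is mine, and the whole thing assumes I’m reading Coleman correctly):

```python
def effective_choices(cf):
    """Number of equally likely options whose binary logarithm equals cf."""
    return 2.0 ** cf


# Hand-picked factors from the graphs above, not fitted values.
for label, cf in [("C1", 2.0), ("C2", 2.0), ("C3", 2.25)]:
    print(label, round(effective_choices(cf), 2))
```

So CF=2 corresponds to 4 parties, and CF=2.25 to roughly 4.76.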

The relative values of the factors do look realistic. The opposition probably had much better support in the larger precincts, which leveled the ground. And for the general population it is probably true that people were choosing among 4 (out of a total of 7) parties.

And here is a combined plot with the 3 categories in different colors:

Need to take a closer look at C2: it’s an outlier and it has a cluster of outliers inside of it. I’m thinking it’s a mix of some very different things…

Conclusions: a) Playing with data is fun, b) Beware, playing with data is very addictive, c) No other conclusions.