Drawing Blanks

Premature Optimization is a Prerequisite for Success

Russian Elections: dissecting the data

At the end of the previous post I demonstrated that all the voting precincts can be separated into 3 categories with very distinct statistics. Now I’d like to give it another try. First, why 3 categories? Look at the plots below (I presented them in the previous post too) – turnout ratio by precinct size and United Russia vote ratio by precinct size:

[Figures: turnout ratio vs. precinct size; United Russia vote ratio vs. precinct size]

Three clusters are clearly visible: (1) the upper-left corner: small precincts with very high turnout and very high support for United Russia (UR); (2) small precincts with turnout of about 75% and about 50% of the vote for UR; (3) larger precincts with turnout around 50% and a UR vote of about 30%. The presence of these clusters introduces very strong correlations between precinct size and everything else. That makes it difficult to look at the vote distribution, and at the vote-turnout correlation that everyone is so excited about, in the general population as a whole. So here I’m breaking it down into 3 categories:

  1. C1 (14% of counted votes): the top-left corner, which is apparently very well correlated with geography. These are the ethnic republics: Bashkortostan, Dagestan, Ingushetia, Kabardino-Balkaria, Karachay-Cherkessia, Mordovia, North Ossetia, Tatarstan, Tyva, Chechnya. (This identification is approximate.)
  2. C2 (13% of counted votes): precincts with fewer than 800 registered voters outside of C1.
  3. C3 (72% of counted votes): precincts with 800 to 3000 registered voters outside of C1.

(The remaining 1% are very large precincts, mostly embassies in foreign countries; they are spread more or less uniformly across all variables.)
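The breakdown above can be sketched as a small classifier. This is only an illustration under assumptions: the argument names `region` and `registered_voters` are hypothetical (the post does not show the data layout), the region names use common English spellings, and treating the 3000 boundary as inclusive is a guess.

```python
# Approximate C1 region list from the post (the identification is approximate).
C1_REGIONS = frozenset({
    "Bashkortostan", "Dagestan", "Ingushetia", "Kabardino-Balkaria",
    "Karachay-Cherkessia", "Mordovia", "North Ossetia", "Tatarstan",
    "Tyva", "Chechnya",
})


def classify_precinct(region: str, registered_voters: int) -> str:
    """Return 'C1', 'C2', 'C3', or 'other' for a single precinct.

    Both parameter names are hypothetical; adapt them to the real data.
    """
    if region in C1_REGIONS:
        return "C1"  # geography decides first, regardless of size
    if registered_voters < 800:
        return "C2"  # small precincts outside C1
    if registered_voters <= 3000:  # assumption: upper bound is inclusive
        return "C3"  # mid-sized precincts outside C1
    return "other"  # the remaining very large precincts


print(classify_precinct("Tatarstan", 2500))  # C1
print(classify_precinct("Moscow", 500))      # C2
print(classify_precinct("Moscow", 2500))     # C3
print(classify_precinct("Moscow", 5000))     # other
```

Note that the region check comes first, so a small precinct in a C1 region never ends up in C2.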

Below are some graphs for each category with some funny comments. I’m not plotting vote-turnout correlations because that is what I’m planning to discuss in the next post.

C1

Vote distribution, turnout distribution:

[Figures: C1 vote distribution; C1 turnout distribution]

Vote by district code, vote vs. precinct size:

[Figures: C1 vote by district code; C1 vote vs. precinct size]

Well, this category is truly special. But it’s still inhomogeneous. The vote distribution has at least 4 modes and they are geographically clustered. The 75% peak in Tatarstan is terrible. I’m not going to look at this category any further.

C2

Vote distribution, turnout distribution:

[Figures: C2 vote distribution; C2 turnout distribution]

Woo-hoo, gaussians! No, just kidding, they are not. But they are smooth, aren’t they? ;) The peaks in the vote percentage are mostly due to small precincts. The peak at 100% turnout deserves a separate study.
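One possible arithmetic reason why small precincts produce peaks (an assumption offered for illustration, not something checked against the actual data): with few ballots, only a handful of exact ratios are possible, and simple fractions such as 1/2 or 1/3 recur at many different precinct sizes. The sketch below counts every possible (votes, ballots) pair once, which is not a realistic model of voting; it only shows how unevenly the attainable ratios are spread.

```python
from collections import Counter
from fractions import Fraction


def ratio_counts(max_ballots: int) -> Counter:
    """Count how often each exact ratio k/n occurs for n = 1..max_ballots.

    Each (k, n) pair is counted once, i.e. all outcomes are treated as
    equally likely. That is an illustrative simplification only.
    """
    counts = Counter()
    for n in range(1, max_ballots + 1):
        for k in range(n + 1):
            counts[Fraction(k, n)] += 1  # Fraction reduces k/n automatically
    return counts


counts = ratio_counts(50)
# Simple fractions are reachable at many sizes; "awkward" ones at very few.
for ratio in (Fraction(1, 2), Fraction(1, 3), Fraction(2, 3), Fraction(24, 49)):
    print(ratio, counts[ratio])
```

Here 1/2 is reachable at every even size up to 50, while 24/49 is reachable at exactly one size, so a histogram of ratios from small precincts is spiky even without anything suspicious going on.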

Vote by district code, vote vs. precinct size:

[Figures: C2 vote by district code; C2 vote vs. precinct size]

Inhomogeneity/Clustering is still there, so the “gaussian” is just an illusion. It’s still a mix of a few distributions with close modes and large deviations. And the vote-size correlation is virtually gone compared to the general population.

C3

Vote distribution, turnout distribution:

[Figures: C3 vote distribution; C3 turnout distribution]

A 100% turnout again, at larger precincts! I’ll have to look into this…

Vote by district code, vote vs. precinct size:

[Figures: C3 vote by district code; C3 vote vs. precinct size]

Look at those northern lights. Clustering within clustering. Almost like a fractal. Maybe it is fractal? Weibull distribution anyone? It looks like Weibull. But I know nothing about Weibull except it may look like this and has to do with fractal subdivision processes. I’ll need to look into this too…

As to the vote-size correlation – size still matters. But at least the turnout-size correlation is gone.
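A check like "the turnout-size correlation is gone" can be made numeric by computing a correlation coefficient inside each category. The sketch below uses the Pearson coefficient, which is my choice for illustration; the post does not say which measure, if any, was used. The tuple layout `(category, size, turnout, ur_share)` is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / math.sqrt(var_x * var_y)


def correlations_by_category(precincts):
    """Group (category, size, turnout, ur_share) tuples and correlate
    precinct size with turnout and with UR vote share per category.
    The tuple layout is a hypothetical one chosen for this sketch."""
    groups = {}
    for category, size, turnout, ur_share in precincts:
        sizes, turnouts, shares = groups.setdefault(category, ([], [], []))
        sizes.append(size)
        turnouts.append(turnout)
        shares.append(ur_share)
    return {
        category: {
            "turnout_size": pearson(sizes, turnouts),
            "vote_size": pearson(sizes, shares),
        }
        for category, (sizes, turnouts, shares) in groups.items()
    }
```

Pearson only captures linear dependence, so a value near zero within a category does not rule out a nonlinear relation; the scatter plots remain the better guide.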

Now that I have the more or less “smooth” samples C2 and C3, I’m going to move on to the vote-turnout correlation issue.

Written by bbzippo

01/04/2012 at 5:25 am
