Democratic societies are built around the principle of free and fair elections, in which each citizen’s vote counts equally. National elections can be regarded as large-scale social experiments, where people are grouped into a usually large number of electoral districts and vote according to their preferences. The large number of samples implies certain statistical consequences for the polling results, which can be used to identify election irregularities. Using a suitable data collapse, we find that vote distributions of elections with alleged fraud show a kurtosis a hundred times larger than that of normal elections. As an example we show that reported irregularities in the 2011 Duma election are indeed well explained by systematic ballot stuffing, and we develop a parametric model quantifying the extent to which fraudulent mechanisms are present. We show that if specific statistical properties are present in an election, the results do not represent the will of the people. We formulate a parametric test detecting these statistical properties in election results. For demonstration the model is also applied to election outcomes of several other countries.
Free and fair elections are the cornerstone of every democratic society. A central characteristic of free and fair elections is that each citizen’s vote counts equally. However, Joseph Stalin already believed that “The people who cast the votes decide nothing. The people who count them decide everything.” How can one distinguish whether an election outcome represents the will of the people or the will of the counters?
Elections are fascinating, large-scale social experiments. A country is segmented into a usually large number of electoral districts. Each district represents a standardized experiment where each citizen articulates his/her political preference via a ballot. Despite differences in e.g. income levels, religions and ethnicities across the populations in these districts, outcomes of these experiments have been shown to follow certain universal statistical laws [2, 3]. Huge deviations from these expected distributions have been reported for the votes for United Russia, the winning party in the 2011 Duma election [4, 5].
In general, using an appropriate re-scaling of election data, the distributions of votes and turnout are approximately Gaussian. Let $W_i$ be the number of votes for the winning party and $N_i$ the number of voters in electoral district $i$; then the logarithmic vote rate is $\nu_i = \log\frac{N_i - W_i}{W_i}$. In figure 2 we show the distribution of $\nu_i$ over all electoral districts. To first order the data from different countries collapse to a Gaussian. The data for Russia and Uganda clearly fall out of line. Skewness and kurtosis are listed for each data set in table SII, confirming these observations quantitatively.
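The data collapse described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the rate is taken as the logarithm of the ratio of votes not cast for the winner to votes cast for the winner, districts where this ratio is undefined are dropped, and kurtosis is computed as the plain fourth standardized moment (a Gaussian gives 3), which may differ from the convention used in table SII.

```python
import numpy as np

def log_vote_rate(votes_winner, electorate):
    """Return nu_i = log((N_i - W_i) / W_i) for districts with 0 < W_i < N_i.

    Districts with W_i = 0 or W_i >= N_i are dropped because the logarithm
    is undefined there (an assumption of this sketch).
    """
    w = np.asarray(votes_winner, dtype=float)
    n = np.asarray(electorate, dtype=float)
    valid = (w > 0) & (w < n)
    return np.log((n[valid] - w[valid]) / w[valid])

def skewness_and_kurtosis(x):
    """Third and fourth standardized moments (non-excess kurtosis)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**3)), float(np.mean(z**4))
```

Applied to district-level counts, `skewness_and_kurtosis(log_vote_rate(W, N))` gives the two shape statistics whose magnitude is compared across countries.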
Most strikingly, the kurtosis of the distributions for Russia (2003, 2007 and 2011) and Uganda deviates by two orders of magnitude from that of every other country. The only reasonable conclusion is that the voting results in Russia and Uganda are driven by mechanisms or processes different from those in other countries. However, such distributions reveal only part of the story, and a different representation of the data helps to gain a deeper understanding. Figure 1 shows a 2-d histogram of the number of electoral districts for a given fraction of voter turnout (x-axis) and for the percentage of votes for the winning party (y-axis). Results are shown for recent parliamentary elections in Austria, Finland, Russia, Spain, Switzerland, and the UK, and presidential elections in the USA and Uganda. Data were obtained from the official election homepages of the respective countries; for more details and more election results, see SOM. These figures can be interpreted as fingerprints of several processes and mechanisms leading to the overall election results. For Russia and Uganda the shape of these fingerprints is immediately seen to differ from the other countries. In particular, there is a large number of districts (thousands) with 100% turnout and at the same time 100% of votes for the winning party.
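A fingerprint of this kind is a two-dimensional histogram over districts. The following is a minimal sketch assuming per-district arrays of valid votes, electorate size and votes for the winner; the function name, the default of 100 bins per axis and the clipping of values to the unit interval are choices made for this illustration only.

```python
import numpy as np

def election_fingerprint(valid_votes, electorate, votes_winner, bins=100):
    """Count districts per (turnout, winner vote share) cell.

    Returns the 2-d count matrix (rows: turnout, columns: vote share)
    and the common bin edges on [0, 1].
    """
    v = np.asarray(valid_votes, dtype=float)
    n = np.asarray(electorate, dtype=float)
    w = np.asarray(votes_winner, dtype=float)
    ok = (n > 0) & (v > 0)  # skip empty districts
    turnout = np.clip(v[ok] / n[ok], 0.0, 1.0)
    share = np.clip(w[ok] / v[ok], 0.0, 1.0)
    edges = np.linspace(0.0, 1.0, bins + 1)
    hist, _, _ = np.histogram2d(turnout, share, bins=[edges, edges])
    return hist, edges
```

A second cluster of districts at complete turnout and complete support for the winner would show up as a large count in the last cell, `hist[-1, -1]`.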
The shape of these irregularities can be understood by assuming the presence of ballot stuffing. This means that bundles of ballots with votes for one party are stuffed into the urns. Videos purportedly documenting these practices are openly available on online platforms [6–8]. In one case the urn is already filled with ballots before the election starts; in other cases members of the election commission are caught filling out ballots. In yet another case the pens in the polling stations are shown to be erasable. Are these incidents non-representative exceptions or the rule?
We develop a parametric model to quantify the extent of ballot stuffing for a given
party to explain the election fingerprints in figure 1. The distributions for
and extreme fraud. Incremental fraud means that with a given rate ballots for one party are added to the urn and votes for other parties are replaced. This occurs within a fraction fi of electoral districts. In the election fingerprints in figure 1 these districts are shifted to the upper right. Extreme fraud corresponds to reporting nearly all votes for a single party with an almost complete voter turnout. This happens in a fraction fe of districts, which form a second cluster near 100% turnout and votes for the incumbent party.
For simplicity, the model assumes that within each electoral district turnout and voter preferences follow a Gaussian distribution with the mean and standard deviation taken from the actual sample, see figure S2. With probability fi (fe) the incremental (extreme) fraud mechanism is then applied. Note that more detailed assumptions about possible mechanisms leading to large-scale heterogeneities in the data, such as city/country differences in turnout (UK) or coast–non-coast differences (USA) (see SOM), will affect the estimate of fi. Figure 3 compares the observed and modelled fingerprint plots for the winning parties in Russia, Uganda and Switzerland. Model results are shown for fi = fe = 0 (fair elections) and for best fits to the data (see SOM) for fi and fe. To describe the smearing from the main peak to the upper right corner, an incremental fraud probability around fi = 0.64 is needed for the case of United Russia. This means fraud in about 64% of the districts. In the second peak, around 100% turnout, there are roughly 3,000 districts with 100% of votes for United Russia, representing an electorate of more than two million people. Best fits yield fe = 0.05, i.e. five percent of all electoral districts experience extreme fraud. A more detailed comparison of the model performance for the Russian parliamentary elections of 2003, 2007 and 2011 is found in figure S3. The fraud parameters for the Uganda data in figure 3 are fi = 0.45 and fe = 0.01.
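The two fraud mechanisms can be illustrated with a small simulation. This is a simplified sketch under stated assumptions, not the fitted model from the SOM: the function name and the parameter `sigma_inc` (with its default value) are hypothetical, each fraudulent district converts one fraction x of both its non-voters and its opposition votes into votes for the winner, and x is drawn from a half-Gaussian near 0 for incremental fraud and near 1 for extreme fraud. The default `sigma_ext = 0.075` follows the extreme fraud width quoted in the SOM.

```python
import numpy as np

def simulate_fraud_model(n_districts, a_mean, a_std, v_mean, v_std,
                         f_i, f_e, sigma_inc=0.1, sigma_ext=0.075, seed=0):
    """Simulate (turnout, winner share) per district with ballot stuffing.

    Requires f_i + f_e <= 1. All quantities are fractions of the electorate.
    """
    rng = np.random.default_rng(seed)
    # Fair baseline: Gaussian turnout and vote share, clipped to [0, 1].
    turnout = np.clip(rng.normal(a_mean, a_std, n_districts), 0.0, 1.0)
    share = np.clip(rng.normal(v_mean, v_std, n_districts), 0.0, 1.0)
    winner = turnout * share
    others = turnout * (1.0 - share)
    nonvoters = 1.0 - turnout

    # Assign each district to extreme fraud, incremental fraud, or neither.
    u = rng.random(n_districts)
    extreme = u < f_e
    incremental = (u >= f_e) & (u < f_e + f_i)

    # Fraction x of non-voters and opposition votes moved to the winner.
    x = np.zeros(n_districts)
    k_inc, k_ext = int(incremental.sum()), int(extreme.sum())
    x[incremental] = np.clip(np.abs(rng.normal(0.0, sigma_inc, k_inc)), 0.0, 1.0)
    x[extreme] = np.clip(1.0 - np.abs(rng.normal(0.0, sigma_ext, k_ext)), 0.0, 1.0)

    stuffed = x * nonvoters   # ballots added to the urn
    replaced = x * others     # opposition votes replaced
    new_turnout = np.clip(turnout + stuffed, 0.0, 1.0)
    new_winner = winner + stuffed + replaced
    new_share = np.divide(new_winner, new_turnout,
                          out=np.zeros_like(new_winner), where=new_turnout > 0)
    return new_turnout, np.clip(new_share, 0.0, 1.0), incremental, extreme
```

With `f_i = f_e = 0` the output is the fair baseline. Raising `f_i` smears the main cluster toward the upper right, and raising `f_e` creates a second cluster near complete turnout and complete support.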
The scale of election irregularities can be visualized with the cumulative number of votes as a function of turnout, figure 4. For each turnout level the total number of votes from districts with this or lower turnout is shown. Each curve corresponds to the respective election winner in a different country. Normally these cumulative distributions level off and form a plateau once the party’s maximal vote count is reached. Again this is not the case for Russia and Uganda. Both show a boost phase of increased extreme fraud toward the right end of the distribution (red circles). Russia never even shows a tendency to form a plateau.
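The cumulative curve can be computed by sorting districts by turnout and accumulating the winner's votes. The sketch below assumes per-district turnout fractions and absolute vote counts; both function names and the `threshold` parameter are hypothetical helpers for this illustration.

```python
import numpy as np

def cumulative_votes_by_turnout(turnout, votes_winner):
    """Return turnout sorted ascending and the running total of winner votes."""
    t = np.asarray(turnout, dtype=float)
    w = np.asarray(votes_winner, dtype=float)
    order = np.argsort(t, kind="stable")
    return t[order], np.cumsum(w[order])

def vote_fraction_above(turnout, votes_winner, threshold):
    """Fraction of the winner's votes coming from districts above a turnout level."""
    t = np.asarray(turnout, dtype=float)
    w = np.asarray(votes_winner, dtype=float)
    return float(w[t > threshold].sum() / w.sum())
```

A curve that plateaus corresponds to a small value of `vote_fraction_above` for thresholds close to complete turnout, while a boost near the right end makes that fraction noticeably larger.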
It is imperative to emphasize that the shape of the fingerprints in figure 1 will deviate from pure 2-d Gaussian distributions due to non-fraudulent mechanisms, such as heterogeneities in the population or voter mobilization, see SOM. However, these can under no circumstances explain the mode of extreme fraud. A bad forgery is the ultimate insult.¹
It can be said with near certainty that an election does not represent the will of the people if a substantial fraction (fe) of districts reports 100% turnout with almost all votes for a single party, and/or if any significant deviations from the sigmoid form in the cumulative distribution of votes versus turnout are observed. Another indicator of systematic fraudulent or irregular voting behaviour is a kurtosis of the logarithmic vote rate distribution of the order of several hundred. Should such signals be detected, it is tempting to invoke G.B. Shaw, who held that “democracy is a form of government that substitutes election by the incompetent many for appointment by the corrupt few.”
FIG. 1. Election fingerprints: 2-d histograms of the number of electoral districts for a given voter turnout (x-axis) and percentage of votes (y-axis) for the winning party (or candidate) in recent elections from eight different countries (from left to right, top to bottom: Austria, Finland, Russia, Spain, Switzerland, Uganda, UK and USA). Colour represents the number of electoral districts. Districts usually cluster around a given turnout and voting level. In Uganda and Russia these clusters are ‘smeared out’ to the upper right region of the plots, reaching a second peak at 100% turnout and 100% of votes (red circles). In Finland the main cluster is smeared out in two directions (indicative of voter mobilization due to controversies surrounding the True Finns). In the UK the fingerprint shows two clusters stemming from rural and urban areas (see SOM).
FIG. 3. Comparison of observed and modelled 2-d histograms for (top to bottom) Russia, Uganda and Switzerland. The left column shows the actual election fingerprints, the middle column shows a fit with the fraud model. The right column shows the expected model outcome of fair elections (i.e. absence of fraudulent mechanisms, fi = fe = 0). For Switzerland the fair and fitted models are almost the same. The results for Russia and Uganda can only be explained by the model assuming a large number of fraudulent districts.
FIG. 4. The ballot stuffing mechanism can be visualized by considering the cumulative number of votes as a function of turnout. Each country’s election winner is represented by a curve which typically takes the shape of a sigmoid function reaching a plateau. In contrast to the other countries, Russia and Uganda do not tend to develop this plateau but instead show a pronounced increase (boost) close to complete turnout.
Both irregularities are indicative of the two ballot stuffing modes being present.
SUPPORTING ONLINE MATERIAL
The data

Descriptive statistics and official sources of the election results are shown in table SI. The raw data will be made available for download at http://www.complex-systems.meduniwien.ac.at/. Each data set reports election results of parliamentary (Austria, Finland, Russia, Spain, Switzerland and UK) or presidential (Uganda, USA) elections at district level. In the rare circumstances where electoral districts report more valid ballots than registered voters, we work with a turnout of 100%. With the exception of the US data, each country reports the number of registered voters and valid ballots for each party and district. For the US there is no exact data on the voting-eligible population at district level; it was estimated to be the same as the population above 18 years, available at http://census.gov. Fingerprints for the 2000 US presidential elections are shown in figure S1 for both candidates, for districts from the entire USA and from Florida only. There are no irregularities discernible.
A country is separated into $n$ electoral districts $i$, each having an electorate of $N_i$ people and in total $V_i$ valid votes. The fraction of valid votes for the winning party in district $i$ is denoted $v_i$. The average turnout over all districts, $\bar{a}$, is given by $\bar{a} = \frac{1}{n}\sum_i V_i/N_i$ with standard deviation $s_a$; the mean fraction of votes for the winning party is $\bar{v} = \frac{1}{n}\sum_i v_i$ with standard deviation $s_v$. The mean values $\bar{a}$ and $\bar{v}$ are typically close to, but not identical to, the values which maximize the empirical distribution function of turnout and votes over all districts. Let $v$ be the fraction of votes where the empirical distribution function assumes its (first local) maximum (rounded to whole percentages), see figure S2. Similarly, $a$ is the turnout where the empirical distribution function of turnouts $a_i$ takes its (first local) maximum. The distributions for turnout and votes are extremely skewed to the right for Uganda and Russia, which also inflates the standard deviations in these countries, see table SI. To account for this a ‘left-sided’ (‘right-sided’) mean deviation $\sigma_v^L$ ($\sigma_v^R$) from $v$ is introduced. $\sigma_v^R$ can be regarded as the incremental fraud width, a measurable parameter quantifying how intense the vote stuffing is. This contributes to the ‘smearing out’ of the main peaks in the election fingerprints, see figure

$\sigma_v^R = \sqrt{\langle (v_i - v)^2 \rangle_{v_i > v}}$ . (2)
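The peak position and the one-sided widths can be estimated from the per-district vote fractions as sketched below. Two assumptions of this illustration should be noted: the global maximum of the whole-percent histogram is used in place of the first local maximum, and the left-sided width is taken to mirror equation (2) with the average restricted to $v_i < v$. Both function names are hypothetical.

```python
import numpy as np

def peak_in_whole_percent(fractions):
    """Most populated whole-percent bin of values in [0, 1], as a fraction.

    Uses the global maximum of the histogram (a simplification of the
    'first local maximum' described in the text).
    """
    pct = np.rint(np.asarray(fractions, dtype=float) * 100).astype(int)
    counts = np.bincount(np.clip(pct, 0, 100), minlength=101)
    return counts.argmax() / 100.0

def one_sided_widths(fractions, centre):
    """Root mean squared deviation from centre, separately below and above it."""
    x = np.asarray(fractions, dtype=float)
    left = x[x < centre]
    right = x[x > centre]
    sigma_left = np.sqrt(np.mean((left - centre) ** 2)) if left.size else 0.0
    sigma_right = np.sqrt(np.mean((right - centre) ** 2)) if right.size else 0.0
    return float(sigma_left), float(sigma_right)
```

For a distribution that is strongly skewed to the right, the right-sided width returned by `one_sided_widths` is much larger than the left-sided one, which is the asymmetry the incremental fraud width is meant to capture.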
Similarly the extreme fraud width σx can be estimated, i.e. the width of the peak around 100% votes. We found that σx = 0.075 describes all encountered vote distribution
FIG. S1. Turnout against percentage of votes for Bush (left column) and Gore (right column) in the 2000 US presidential elections. Results are shown for all districts in the USA (top row) and for districts from Florida (bottom row). There are no traces of fraudulent mechanisms discernible in the fingerprints.