I'm trying to count the number of first, second, and third choice votes for each candidate in an election (specifically the Stack Overflow 2014 Moderator Election). I downloaded the data file and, based on my rudimentary interpretation of the file format, wrote a PHP script to count the votes:
<?php
$lines = file("stackoverflow-com-2014-election-results.blt");
unset($lines[0]);
$ballots = 0;
$first = array();
$second = array();
$third = array();
for ($i = 1;; $i++) {
    $line = explode(" ", trim($lines[$i]));
    if ($line[0] != 1) break;
    $ballots++;
    @$first[$line[1]]++;
    @$second[$line[2]]++;
    @$third[$line[3]]++;
}
$names = array();
for ($i++; $i < count($lines); $i++) {
    $names[count($names) + 1] = trim(trim($lines[$i]), '"');
}
printf("%20s%8s%8s%8s%8s\n", "Name", "1st", "2nd", "3rd", "Total");
print(str_repeat("-", 52) . "\n");
foreach ($names as $id => $name) {
    printf("%20s%8s%8s%8s%8s\n", $name,
        $first[$id], $second[$id], $third[$id],
        $first[$id] + $second[$id] + $third[$id]);
}
print(str_repeat("-", 52) . "\n");
printf("Ballots: %d\n", $ballots);
When I run it from the command line, it prints this table:
                Name     1st     2nd     3rd   Total
----------------------------------------------------
                Undo    1358    1425    1814    4597
            bluefeet    3352    3148    2287    8787
          0x7fffffff    1932    2147    2159    6238
            Bohemian    5678    2935    2415   11028
        Jon Clements    1531    1527    1618    4676
            Doorknob    1165    1720    1753    4638
         Raghav Sood    1358    1565    1571    4494
      Siddharth Rout    1732    1872    1866    5470
                Matt    1381    1988    2009    5378
              meagar    1903    2382    2881    7166
----------------------------------------------------
Ballots: 21571
My problem is that I can't get this to match what OpenSTV reports when I run it on the same file. The "count of first choices" figures are all slightly different:
Ballot file contains 21571 non-empty ballots.
Counting votes for Stack Overflow Moderator Election 2014 using Meek STV.
10 candidates running for 3 seats.
R|Undo |bluefeet |0x7fffffff |Bohemian |Jon Clements
| | | | |
|--------------+--------------+--------------+--------------+--------------
|Doorknob |Raghav Sood |Siddharth Rout|Matt |meagar
| | | | |
|--------------+--------------+--------------+--------------+--------------
|Exhausted |Surplus |Threshold
| | |
=============================================================================
1| 1379.000000| 3372.000000| 1951.000000| 5707.000000| 1545.000000
| 1181.000000| 1375.000000| 1749.000000| 1389.000000| 1923.000000
| 0.000000| 314.249999| 5392.750001
|--------------------------------------------------------------------------
| Count of first choices. Candidate Bohemian has reached the threshold and
| is elected.
=============================================================================
[...]
What am I doing wrong? Or what is OpenSTV doing differently?
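A quick sanity check that may help narrow this down: if every non-empty ballot contributes exactly one first choice, the first-choice column should add up to the ballot count. The sketch below just sums the two columns shown above; the variable names are made up for illustration and the numbers are copied from the two outputs.

```php
<?php
// First-choice counts copied from the two outputs above, in candidate order.
$scriptFirst  = array(1358, 3352, 1932, 5678, 1531, 1165, 1358, 1732, 1381, 1903);
$openStvFirst = array(1379, 3372, 1951, 5707, 1545, 1181, 1375, 1749, 1389, 1923);
$ballots = 21571; // reported by both the script and OpenSTV

$scriptTotal  = array_sum($scriptFirst);
$openStvTotal = array_sum($openStvFirst);

printf("Script first-choice total:  %d\n", $scriptTotal);
printf("OpenSTV first-choice total: %d\n", $openStvTotal);
printf("Ballots without a counted first choice: %d\n", $ballots - $scriptTotal);
```

OpenSTV's column sums to the full 21571 ballots, while the script's column sums to 21390, so 181 ballots apparently have no candidate in the position the script treats as the first choice.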
Update: My script was broken because it didn't account for rows that have a second or third choice set without the earlier choices being set. I'm guessing this happened when voters deselected an earlier choice: after selecting two candidates, a voter who deselects the first one should have the one remaining candidate treated as their first choice.
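To show that rule in isolation, here is a minimal sketch. The function `compactChoices` and the sample ballot lines are hypothetical; in particular, it assumes an empty slot is written as `0` on the line, which is a guess at the encoding rather than something taken from the file format documentation.

```php
<?php
// Hypothetical helper: return a ballot's candidate ids in effective rank
// order, with empty slots removed so later choices move up.
function compactChoices(string $ballotLine, int $numChoices): array
{
    // Field 0 is the ballot weight; fields 1..$numChoices are the choice slots.
    $fields = explode(" ", trim($ballotLine));
    $choices = array();
    for ($j = 1; $j <= $numChoices; $j++) {
        // A missing field or a 0 is treated as "no candidate in this slot".
        if (isset($fields[$j]) && (int) $fields[$j] !== 0) {
            $choices[] = (int) $fields[$j];
        }
    }
    return $choices;
}

// Assumed sample lines: weight, up to three choice slots, trailing 0.
$full       = compactChoices("1 4 2 7 0", 3); // all three slots filled
$firstEmpty = compactChoices("1 0 2 7 0", 3); // first slot empty
$onlyThird  = compactChoices("1 0 0 7 0", 3); // only the third slot filled

echo implode(",", $full), "\n";
echo implode(",", $firstEmpty), "\n";
echo implode(",", $onlyThird), "\n";
```

With this rule, candidate 2 in the second sample line counts as a first choice and candidate 7 as a second choice, instead of being tallied under the slot numbers they were written in.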
Fixed version:
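<?php
$lines = file("stackoverflow-com-2014-election-results.blt");

// Header line: the number of candidates, then a second number that this
// script uses as the number of ranked choices per ballot. In the .blt format
// that second field is the number of seats; both are 3 for this election.
$line = explode(" ", trim($lines[0]));
$numCandidates = (int) $line[0];
$numChoices = (int) $line[1];

// $choiceVotes[$rank][$candidateId] = ballots ranking that candidate at $rank.
$choiceVotes = array_fill(1, $numChoices, array_fill(1, $numCandidates, 0));
$totalBallots = 0;

// Ballot lines follow the header; a line starting with 0 ends the ballot list.
for ($i = 1;; $i++) {
    $line = explode(" ", trim($lines[$i]));
    if ($line[0] == 0) break;
    $totalBallots++;
    // $j walks the slots on the line; $k is the next effective rank. Slots
    // that are 0 (or missing) are skipped, so later choices move up.
    for ($j = 1, $k = 1; $j <= $numChoices; $j++) {
        if (isset($line[$j]) && $line[$j] != 0) $choiceVotes[$k++][$line[$j]]++;
    }
}

// Candidate names follow the terminating 0 line, one quoted name per line.
$names = array();
for ($j = 1; $j <= $numCandidates; $j++) {
    $names[$j] = trim(trim($lines[$j + $i]), '"');
}

$rowFormat = "%20s" . str_repeat("%8s", $numChoices) . "%8s\n";
$separator = str_repeat("-", 20 + (8 * $numChoices) + 8) . "\n";

// Header row. gmdate('S', ...) gives the English ordinal suffix (st, nd, rd, th)
// for the day of the month, and $i * 86400 - 1 falls on day $i of January 1970.
$row = array("Name");
for ($i = 1; $i <= $numChoices; $i++) $row[] = $i . gmdate('S', $i * 86400 - 1);
$row[] = "Total";
vprintf($rowFormat, $row);
print $separator;

// One row per candidate: votes at each rank, then the sum across ranks.
foreach ($names as $id => $name) {
    $row = array($name);
    $candidateTotal = 0;
    for ($i = 1; $i <= $numChoices; $i++) {
        $votes = $choiceVotes[$i][$id];
        $row[] = $votes;
        $candidateTotal += $votes;
    }
    $row[] = $candidateTotal;
    vprintf($rowFormat, $row);
}
print $separator;
printf("Ballots: %d\n", $totalBallots);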