The drama in trying to convert election PDFs to Spreadsheets

markessien | 716 points

Nice read. It's important to note

1. The 2020 protesters did not start the vandalism of property; the government infiltrated the protests, burning cars and maiming people.

2. The Obidient movement encompassed multiple sub-movements, of which part of #EndSARS was one. A vast majority of Peter Obi's supporters were not #EndSARS activists.

3. Elections in Nigeria are fraught with treacherous behavior, so everyone suspects everything. It's important to be very careful with your communication. There is a lot of desperation in the land, so if you are in a position of information leverage, the responsible thing is to handle the privilege with care and transparency.

OoTheNigerian | 3 months ago

First of all, what a fantastic and inspiring read.

But, I'm left greatly confused -- the article never states whether this changed the result.

It says that halfway through counting Obi was in the lead, but says nothing about the result once counting finished.

And when I look at the spreadsheet, the last row (#3380) appears to be the totals, which lists:

  APC     LP     PDP     NNPP
  149014  85748  329030  8305

Which shows LP (Obi) in third place, just like the official results.

So what point is the article trying to make at the end of the day? Or have I misunderstood the numbers?

crazygringo | 3 months ago

Checking one at random:

https://docs.google.com/spreadsheets/d/1HhV9iJxXTU9liAZPIDoM...

...shows 0s in the first row for all candidate parties. But the corresponding photo shows votes for all three:

https://inec-cvr-cache.s3.eu-west-1.amazonaws.com/cached/res...

I hope it's not a mistake and that there's some arcane law/technicality to explain it.

edit: another mistake on row 21, LP should get 25 but it was credited to NNPP:

https://docs.inecelectionresults.net/elections_prod/1292/sta...

djoldman | 3 months ago

So the bug where the first voting sheet shown to a user was from the same 10% of the photos turned out to be a feature, serving as a CAPTCHA of sorts to weed out the bad actors from the good.

If memory serves, some CAPTCHA techniques include showing two numbers to transcribe, where one’s value is already known. If that number is transcribed incorrectly, then the other number’s result isn’t used, and the CAPTCHA fails. Perhaps a similar technique may have also helped here?
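The control-value idea above can be sketched as follows. This is only a minimal illustration, assuming each volunteer transcribes one sheet whose value is already known and one that is not; the `Submission` type, the field names, and the sample numbers are all hypothetical and not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    user: str
    control_answer: int  # transcription of the sheet whose value is already known
    unknown_answer: int  # transcription of the sheet we actually need


def accepted_answers(submissions, control_truth):
    """Keep unknown-sheet answers only from users who got the control sheet right."""
    return [s.unknown_answer for s in submissions if s.control_answer == control_truth]


# Made-up sample data: two careful users and one bad actor.
submissions = [
    Submission("alice", 120, 85),
    Submission("bot-1", 0, 9999),
    Submission("bob", 120, 85),
]
kept = accepted_answers(submissions, control_truth=120)
print(kept)
```

A real system would probably also require agreement between several accepted answers before trusting a value, since a user can pass the control and still mistype the unknown sheet.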

MontagFTB | 3 months ago

Oh, and Mark didn't mention that Bola Ahmed Tinubu was indicted for heroin charges in the US in 2003, forfeited $460k & is just too old to run a democracy this size.

Atiku Abubakar (second candidate) was a former VP and the president he served under (Obasanjo) still insists the dude remains a monument to corruption.

There's been a coordinated campaign at all levels to rig this election massively and we saw voter intimidation, manipulation in broad daylight, and the acquiescence of foreign governments to it all.

churchill | 3 months ago

Wow what a cliffhanger, it sounds like they have to deal with the courts now. I hope we get an update

https://www.msn.com/en-us/news/world/opposition-files-petiti...

kevviiinn | 3 months ago

Is the access to the original photos open? It might be fit for a good Kaggle competition, although maybe a little too late for this current election.

mtrovo | 3 months ago

Incredible story.

Some more background: https://ng.usembassy.gov/nigerias-2023-elections/

davedx | 3 months ago

This would have been a good use for HN-style shadow banning. Especially if they didn't publish the current tally, the original easy-to-detect bots might never have realized you were on to them.

dec0dedab0de | 3 months ago

I still don't understand how we ended up with PDF as a sort of standard for archiving data. PDF is already pretty bad for things like manuals, but for things like spreadsheets we basically collect the data, then destroy all the structure by putting it into PDF, and later on painstakingly try to restore the data from the PDF, which is often almost impossible to do accurately.

It just shows that bad solutions often win.

rqtwteye | 3 months ago

This might be a sensitive question but I wonder if something like this would work in the United States? With all of the fears of election interference why not trust but verify?

redman25 | 3 months ago

This is some compelling writing. I know this has real life implications for real people so I hope it's not in poor taste to say it would make a good movie.

harvey9 | 3 months ago

More background. OP is an impressive entrepreneur! Massive kudos. https://markessien.com/projects/hotels-ng/

davedx | 3 months ago

I tried something like this with the Kenyan election, but our setup was to use OCR (Google Cloud) -> text -> parse -> SQLite.

We started late so the results were out when we finished but I think it'll be a good idea to develop software that can parse the PDF results and display them faster than the electoral bodies can. In Kenya, and Nigeria, the delays cause a lot of anxiety
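The parse and SQLite stages of a pipeline like that might look roughly like this. It is a sketch under stated assumptions, not the commenter's code: the OCR step is taken as already done, and the line format ("unit_id APC LP PDP NNPP"), the table layout, and the sample rows are all invented for illustration.

```python
import sqlite3

# Hypothetical OCR output: one line per polling unit, "unit_id APC LP PDP NNPP".
# The second line contains a typical OCR slip (letter O instead of zero).
OCR_TEXT = """\
PU-001 12 40 7 1
PU-002 3O 25 9 0
PU-003 5 61 14 2
"""

PARTIES = ("APC", "LP", "PDP", "NNPP")


def parse_line(line):
    """Return (unit_id, [votes...]) or None if the line does not parse cleanly."""
    parts = line.split()
    if len(parts) != 1 + len(PARTIES):
        return None
    if not all(p.isascii() and p.isdigit() for p in parts[1:]):
        return None  # leave suspicious lines for manual review
    return parts[0], [int(p) for p in parts[1:]]


def load(text, conn):
    """Insert clean rows into SQLite and return the lines that were rejected."""
    conn.execute(
        "CREATE TABLE results "
        "(unit_id TEXT PRIMARY KEY, APC INT, LP INT, PDP INT, NNPP INT)"
    )
    rejected = []
    for line in text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            rejected.append(line)
            continue
        unit_id, votes = parsed
        conn.execute("INSERT INTO results VALUES (?, ?, ?, ?, ?)", (unit_id, *votes))
    conn.commit()
    return rejected


conn = sqlite3.connect(":memory:")
rejected = load(OCR_TEXT, conn)
totals = conn.execute(
    "SELECT SUM(APC), SUM(LP), SUM(PDP), SUM(NNPP) FROM results"
).fetchone()
print(totals, rejected)
```

Rejecting lines that do not parse, rather than guessing at them, keeps OCR errors out of the totals and gives a short list for a human to check.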

tr33house | 3 months ago

Silly, you don't miscount the actual votes, you brainwash the population and pervert the process until they vote the way you want them to, like in the advanced first world democracies.

hoseja | 3 months ago

This was thrilling.

Sometimes, one person's bug is another person's feature :)

mattlutze | 3 months ago

Aren't things like this the reason that the UN provides election observers?

By spot-checking that just a random 100 votes are correctly tallied, you can be pretty sure the outcome of the election is legit in a > 10M voter country.
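A rough way to put numbers on that claim: if a fraction f of tallies is wrong and the 100 checks are drawn independently and uniformly at random (both are assumptions, and real audits are rarely that clean), the chance that every check happens to land on a correct tally is (1 - f)^100. The function name below is made up for this sketch.

```python
def miss_probability(bad_fraction, sample_size):
    """Chance that a random sample contains no bad tally at all."""
    return (1 - bad_fraction) ** sample_size


for f in (0.01, 0.05, 0.10):
    print(f"{f:.0%} bad tallies: miss chance {miss_probability(f, 100):.4f}")
```

Under these assumptions, 100 checks would very likely catch manipulation affecting 5% or more of tallies, but would miss a 1% problem roughly a third of the time, which can matter in a close race.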

londons_explore | 3 months ago

I've done stuff like this semi manually. Use pdftotext to get the text tables out of the pdf, eyeball it and massage with emacs keyboard macros, and in some cases python scripts. It's not that big a deal but it is somewhat ad hoc.

I know that OCR software is able to read stuff like magazine articles and figure out column layout, embedded charts, etc. It's weird if there is nothing that does that with a PDF. Maybe I'll look around or see if I can hack up something.
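For PDFs that contain real text (not scanned photos), the semi-manual workflow described above can be partly scripted. The sketch below assumes poppler's `pdftotext` is installed and uses its `-layout` option to preserve columns; the helper names, the two-or-more-spaces splitting rule, and the sample table are assumptions for illustration, and real result sheets will need more massaging.

```python
import re
import subprocess


def pdf_to_text(pdf_path):
    """Run pdftotext with -layout and return the extracted text ("-" means stdout)."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def parse_table(text):
    """Split layout-preserved lines into cells on runs of two or more spaces."""
    rows = []
    for line in text.splitlines():
        cells = re.split(r"\s{2,}", line.strip())
        if len(cells) > 1:  # skip blank lines and single-cell noise
            rows.append(cells)
    return rows


# Made-up sample of what "pdftotext -layout" output might look like.
SAMPLE = """\
Ward      APC     LP      PDP
Ward A    120     340     95

Ward B    210     180     60
"""
rows = parse_table(SAMPLE)
print(rows)
```

Splitting on runs of two or more spaces keeps multi-word cells such as "Ward A" intact, but it breaks when a column is empty or when columns sit close together, which is where the ad hoc eyeballing comes back in.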

throwaway81523 | 3 months ago

That is a great job - well done from a grateful Nigerian.

hardlianotion | 3 months ago

Elupee 75, to be frank, you did a great job and I am proud of someone from my country pulling this off, but the bitter truth is President-Elect Bola Ahmed Tinubu won this election. Peter Obi's youth support is predominantly in the south and the Christian-majority parts of the country; he clearly lacks support in the Muslim north, where I am from. I voted for Kwankwaso though.

mmmuhd | 3 months ago

>> We had a brainstorming meeting, and decided to try a new approach. We would simply ask the Obidients to help us do the conversion. If hundreds of Obidients did the transcription, it would go fast.

What would guarantee that the Obidients would not, in turn, try to inflate the score of the Labour candidate?

YeGoblynQueenne | 3 months ago

Wow, this was a fantastic read!

I have no idea what’s going on in Nigeria, but I hope the truth (whatever it is) will prevail!

seventytwo | 3 months ago

This is a great example of why electronic voting is important and can help secure democracy.

nivenkos | 3 months ago

> Then ominously, on the 20th of October of 2020 some people drove there in unmarked cars and removed all the Cameras installed at the tollgate.

They at least captured some photos of the equipment. I wonder if anyone communicated with the individuals.

jxramos | 3 months ago

Striking reminder of how big the world is that while I had heard of #EndSARS, I hadn't realised the scale of the political violence in Nigeria nor that it had its own Bloody Sunday-scale massacre.

pjc50 | 3 months ago

The votes surprise me... In many regions one party gets 90+% of the vote.

Assuming the numbers are correct, then it suggests that most people are easily swayed by their local peers.

Is that common in say the USA?

londons_explore | 3 months ago

What were the final result numbers from the transcription?

blntechie | 3 months ago

Fantastic story. What an excellent example of democratization from technology. And also a perfect example of how the blade cuts both ways. Digital warriors battling it out in real time and the stakes are enormous. Great respect for Mark and his ingenuity and adaptive responses!!!!

thread_id | 3 months ago

I'm impressed by the courage of the protesters here, and the tenacity of the youth voters.

I hope they get a clear answer and a fair count, and whether they win this time or not, a real shot at cracking up their corrupt, two-party system.

pxc | 3 months ago

Is it true that the USA does not have an open data law that makes everybody publish in CSV?

neves | 3 months ago

PDF is a very unfortunate format. It is proprietary, it is paper-oriented, and almost its single goal is to keep a precise printing layout. But for the last 30 years the world hasn't come up with anything that could compete.

SergeAx | 3 months ago

Fantastic story! Did the results get used in a claim?

orf | 3 months ago

I hope for (but do not expect) a positive outcome

vincheezel | 3 months ago

The context should be dated to 2020, not 2023. Edit: it has now been corrected, no need to downvote.

Great story! Looking forward to some follow up

jgtrosh | 3 months ago

Wow. Wild story. Thanks for sharing. Cool twist that a bug ended up identifying the bad guys.

dejongh | 3 months ago

What an exceptional story. You are a legend.

clipper_janosch | 3 months ago

What a scam by the ruling political party

prhrb | 3 months ago

The people who cast the votes don't decide an election, the people who count the votes do. - Stalin.

roschdal | 3 months ago

Not providing CSV is at the level of criminal negligence.

snvzz | 3 months ago

-

churchill | 3 months ago

[flagged]

favaq | 3 months ago