The Voting Information Project and Ballot Information Project had to consolidate epic amounts of data: address points, precinct information, localities, etc. A few days ago, I showed Nat some of the code I used to dedupe this data throughout the election. He noticed some potential for performance improvements, and after a few days of treating this as a small side project, we have run a series of benchmarks on space usage and code efficiency, dug into the C implementation of Python's set, and