Saturday, November 17, 2012

The Unmitigated Disaster Known As Project ORCA

 
 


=============================================


When the Nerds Go Marching In
A GREAT article about Narwhal, the Obama campaign's electronic 'ground game': how they put it together, tested it to failure multiple times, and made it work. Compared to Team Mitten's Orca (which FAILED miserably), this is a great example of how things SHOULD be done. EVERY person with any interest in being involved in a future campaign SHOULD READ THIS ARTICLE!

Remember that Orca hadn't really been tested until election day? Here's what the Narwhal team did 17 days before the election: "We worked through every possible disaster situation," Reed said. "We did three actual all-day sessions of destroying everything we had built."

Money quote: "We knew what to do," Reed maintained, no matter what the scenario was. "We had a runbook that said if this happens, you do this, this, and this. They did not do that with Orca."

No, they didn't, and it showed. Yet another Team Romney FAIL!
============================

The Obama campaign's technologists were tense and tired. It was game day and everything was going wrong.
Josh Thayer, the lead engineer of Narwhal, had just been informed that they'd lost another one of the services powering their software. That was bad: Narwhal was the code name for the data platform that underpinned the campaign and let it track voters and volunteers. If it broke, so would everything else.
They were talking with people at Amazon Web Services, but all they knew was that they had packet loss. Earlier that day, they lost their databases, their East Coast servers, and their memcache clusters. Thayer was ready to kill Nick Hatch, a DevOps engineer who was the official bearer of bad news. Another of their vendors, PalominoDB, was fixing databases, but needed to rebuild the replicas. It was going to take time, Hatch said. They didn't have time.
They'd been working 14-hour days, six or seven days a week, trying to reelect the president, and now everything had been broken at just the wrong time. It was like someone had written a Murphy's Law algorithm and deployed it at scale.
And that was the point. "Game day" was October 21. The election was still 17 days away, and this was a live action role playing (LARPing!) exercise that the campaign's chief technology officer, Harper Reed, was inflicting on his team. "We worked through every possible disaster situation," Reed said. "We did three actual all-day sessions of destroying everything we had built."

