Designing a code-review tool, Part 1

We've just rolled out a new software tool for managing our code review process. Code review is a pretty central part of how we try to maintain a high level of quality and safety for our critical software systems, and so a code review management tool is an essential and long-overdue piece of infrastructure for us.

The new system is meant to facilitate the basic code review process we've been using, and at the same time make it more flexible and scalable. Before this tool, our approach to reading code was pretty simple. We had a small set of reviewers who were responsible for reading every line of a handful of risk-critical systems. If a given codebase had never been reviewed before, then the code would be read from scratch. If what was being reviewed was a new release, then instead of reading from scratch, everyone would read diffs from the last reviewed checkpoint.

All of this was tracked with simple, manually updated log files. The only tool we really had was a program for generating PDF diffs. This system worked reasonably well when we were small, but it didn't scale as the amount of code review and the number of people involved grew.
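
To make the old workflow concrete, here is a minimal sketch of the bookkeeping those log files implied, written in Python purely for illustration. The log layout, the function names, and the reviewer, system, and revision names are all hypothetical; this is not how our logs were actually formatted.

```python
from typing import Dict, Optional, Tuple

# Hypothetical review log: for each (reviewer, system) pair, the last
# revision that reviewer finished reading. The layout is an assumption.
ReviewLog = Dict[Tuple[str, str], str]


def next_review_task(
    log: ReviewLog, reviewer: str, system: str, release: str
) -> Optional[str]:
    """Describe what a reviewer must read to catch up to `release`."""
    last_reviewed = log.get((reviewer, system))
    if last_reviewed is None:
        # Never reviewed before: read the whole codebase from scratch.
        return f"read {system} from scratch at {release}"
    if last_reviewed == release:
        # Already caught up; nothing to read.
        return None
    # Otherwise read only the diff from the last reviewed checkpoint.
    return f"read diff of {system} from {last_reviewed} to {release}"


def record_review(
    log: ReviewLog, reviewer: str, system: str, release: str
) -> None:
    """Mark `release` as the reviewer's new checkpoint for `system`."""
    log[(reviewer, system)] = release
```

Keeping one checkpoint per reviewer per system is the simplest thing that could work, and it hints at why the manual version stopped scaling: every new reviewer and every new system adds another entry that someone has to update by hand.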

I'll talk in a later post about the design we ended up settling on, but for now, I'll just go over some of what we wanted to achieve with our new tool.

  • Lightweight: We had adopted a practice of doing big-bang reviews. We'd work on a system for a while, and then when we were basically finished, we'd initiate a round of code review. More recently, it's become clearer to us that more frequent but smaller rounds of code review are preferable, for a variety of reasons. The tools need to make catching up on code review lightweight enough to be something that people would be willing to do every few days.
  • Granular: Our initial approach, which made sense when the codebase was small, was to have a handful of people review the entirety of the relevant systems. As our systems have grown larger, we've needed to switch to an approach where many different people are involved, with different people assigned to different subsystems. The code review tools need to support granular assignment of code review without making the interface too complex to use.
  • Hackable: We wanted the resulting system to store its data in a way that was easy to hack, when necessary. We knew that there would be times when we wanted to edit history to fix some obscure problem or other, and we also wanted to be able to maintain a clear and complete history of what we've done.
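
As a rough illustration of what "granular" assignment could mean, here is a small sketch, again in Python and again hypothetical. It assumes reviewers are assigned by path prefix and that the longest matching prefix wins; the table, the paths, and the names are invented for the example and say nothing about the design we actually chose.

```python
from typing import Dict, List, Set

# Hypothetical assignment table: path prefix -> reviewers responsible for it.
ASSIGNMENTS: Dict[str, Set[str]] = {
    "": {"alice"},              # fallback for anything not matched below
    "risk/": {"bob", "carol"},
    "risk/limits/": {"dave"},
}


def reviewers_for(path: str, assignments: Dict[str, Set[str]]) -> Set[str]:
    """Return the reviewers for the longest prefix that matches `path`."""
    matching = [prefix for prefix in assignments if path.startswith(prefix)]
    if not matching:
        return set()
    return assignments[max(matching, key=len)]


def split_by_reviewer(
    changed_files: List[str], assignments: Dict[str, Set[str]]
) -> Dict[str, List[str]]:
    """Group the changed files so each reviewer sees only their own share."""
    todo: Dict[str, List[str]] = {}
    for path in changed_files:
        for reviewer in sorted(reviewers_for(path, assignments)):
            todo.setdefault(reviewer, []).append(path)
    return todo
```

Calling `split_by_reviewer(["risk/limits/check.ml", "README"], ASSIGNMENTS)` would hand the first file to dave and the second to alice. A plain table like this is also easy to inspect and edit by hand, which fits the "hackable" goal.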

Next time, I'll talk a bit more about where these requirements led us.

Comments

Build vs Buy

Did your team consider integrating an existing piece of technology rather than starting from scratch? I'm not sure whether existing solutions like ReviewBoard would meet your team's exact needs, but an existing Open Source solution seems like it might be an attractive option to consider if your budget for developing new code is limited.

Build vs Buy

We did look at Review Board, and I read a little of what Guido wrote about Mondrian and Rietveld, and listened to his talk. But there were a few reasons that those other solutions didn't seem like a good fit for us. First of all, none of them seemed to have first-class support for hg, which is what we use for version control. Support for hg is more than just a surface issue: a key thing that we wanted to get right was doing code review in the branchy world that a DVCS encourages, as opposed to the more linear histories that show up in a system like Subversion.

Beyond that, there just wasn't that much that these systems gave us that was hard to build on our own. We didn't really need a web interface, and our model for tracking code review is different, to my eye, from what was natively supported in systems like Review Board. I suspect that much of the work that we did would have needed to be added to any other system that we used, so the end result is that we'd have had to write extensions to a complicated existing system in Python, rather than writing our own simple system in OCaml.

Another small disincentive to using these systems is that they're all web-based, and we're generally happier on the command line...
