One of the key decisions to make when designing a code review system
is choosing the basic unit of code review. One approach common in
many open-source projects is to be patch-centric, i.e. to make the
reviewable unit a single patch. In a patch-centric world, code review
is complete when every patch that went into the system has been read.
We instead decided to build tools that were diff-centric. In a
diff-centric world, reviews aren't done on single patches, but instead
between a pairs of points in history. Code review is complete when a path
of diffs has been completed, starting from a fully read revision, and ending in the revision to be approved.
Both patch-centric and diff-centric approaches are reasonable, but
we settled on diff-centrism for a couple of reasons:
- Merges complicate patch-based review. If you're going to do
code-review at a patch level, one question you need to answer is:
how do you review merges? In Mercurial, for instance, merges are
represented by changesets that have two parents, and it's not quite
clear how to read the resulting diffs -- you could read the diffs
from both parents, but this is largely a recapitulation of the
changesets being merged. One might be tempted to simply not review
merge nodes, since they typically don't contain material changes.
But merges aren't always trivial, so you ignore them at your peril.
- Reading many small patches reduces the signal-to-noise ratio.
People make coding mistakes. If you read every little patch that
someone has committed, you will end up reading the outcome of
decisions that were later reversed. Diff-centric code review allows you to
skip over some of that noise. If a patch is
committed that must later be backed-out, then someone who is reading patches will
end up reading both. In a diff-centric world, a reviewer who
reads a diff that crosses over both the original patch and the backout
won't have to review either.
Comments
xxdiff
Hi Yaron,
From 1998 to 2002, I worked in an environment where all code produced had to be submitted to a merge queue, and then reviewed by at least two independent developers before being merged into the main branch. We had an automated system to submit the patches (via ClearCase) and some scripts to review the changes.
I felt I needed a better tool to review diffs (in particular, I really wanted horizontal highlighting), so I wrote "xxdiff" as open source, which makes it much easier to read through patches, and integrated it into our the process at work. It has a lot of hidden features for merge reviewing and making comments on changes. Since then I've written a Python package to accompany it, that helps in creating small scripts around it to automate merge-reviews processes and other tasks that require interactive confirmation and diff-hunk selection.
Have a look, you might find it helpful in setting up your new process. Today it looks like an old geezer unix thing, but it still works great (I still use it daily). http://furius.ca/xxdiff/