Every startup wants to be quick. First-mover advantage is viewed as key to success, so engineering teams skip best practices at the outset in an attempt to be more agile. This may work the first time, for the prototype, and maybe a second time, for an alpha version. Right around this point, most startups get mired in quicksand: code quality becomes a major issue as small changes start causing problems.
Moving quickly while maintaining high-quality code is a tricky balancing act; it's as much art as science. It's especially important if you're writing software for large distributed systems: you must get the processes right at the start. Otherwise, you'll end up with a software product that's like a house where you paint the living room and the chimney falls off.
We want to be fast and nimble like every other startup, but we also build software for large distributed systems. Code quality is hugely important when you're writing software that supports thousand-plus-node production clusters at some of the largest companies. It is extremely difficult to maintain velocity while developing code for large-scale systems, at least without the right process. At Pepperdata, we're constantly making changes to the code.
When Pepperdata started out, we each had 15+ years of experience at big and small software companies. We've all seen great products fail thanks to software bloat. Engineers waste hours deciphering complex, confusing source code. Changes are made with no one understanding their effects. Production crashes. More and more time is wasted on blind QA. People write workarounds. Code gets even more confusing. It's a dead end.
How do you avoid it? We decided to adopt a few best practices and continue to follow them even as our engineering team has quadrupled in size; we are very happy with the result. The magic is “review, test, and refactor. Now repeat.”
Every change is reviewed.
We take code reviews seriously. Even one-line changes to comment strings get reviewed. We allow a change to be checked in only after reviewers approve it. Reviewers demand that the change be as clean and as readable as possible. Code review guarantees that at least two people have read and understood the change, which significantly increases the chance that the rest of the team can understand it too. Over time, the difference in code quality is like night and day.
We use Review Board hosted on RBCommons, with our Mercurial code repository hosted on Bitbucket.
Changes must have unit tests.
Unit tests are an integral part of a code change. Reviewers will reject a change if it lacks unit tests. Each public method of a class should have a unit test exercising it. Does the method have multiple code paths? Then each path should have a test. Error paths should be tested too. We don't like surprises hidden in our code. With unit tests, all the obvious bugs get shaken out.
We use JUnit and Mockito for our Java code, PyUnit for Python, and Mocha for JavaScript. Our Jenkins builds generate test coverage reports using JaCoCo, Coverage.py, and Blanket.
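As a toy illustration of this style in PyUnit, here is a made-up helper (`parse_capacity` is not Pepperdata code) with one test per code path, including the error paths:

```python
import unittest


def parse_capacity(value):
    """Parse a capacity string like "512m" or "4g" into bytes.

    Hypothetical helper, used here only to illustrate the testing style.
    """
    units = {"m": 1024 ** 2, "g": 1024 ** 3}
    if not value:
        raise ValueError("empty capacity string")
    suffix = value[-1].lower()
    if suffix not in units:
        raise ValueError("unknown unit: %r" % suffix)
    return int(value[:-1]) * units[suffix]


class ParseCapacityTest(unittest.TestCase):
    # One test per code path, plus one per error path.
    def test_megabytes(self):
        self.assertEqual(parse_capacity("512m"), 512 * 1024 ** 2)

    def test_gigabytes(self):
        self.assertEqual(parse_capacity("4g"), 4 * 1024 ** 3)

    def test_empty_string_rejected(self):
        with self.assertRaises(ValueError):
            parse_capacity("")

    def test_unknown_unit_rejected(self):
        with self.assertRaises(ValueError):
            parse_capacity("100x")


if __name__ == "__main__":
    unittest.main()
```

A reviewer scanning this file sees the method's contract at a glance, which is exactly what makes later refactoring safe.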
Coverage reports are generated even before a change is checked in, and they are posted to the review thread so that reviewers can consult them while reviewing the change. Here's an example of a build robot posting coverage and other information in the review thread:
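A minimal sketch of the bot's job: turn per-module coverage numbers into a plain-text comment. The module names, numbers, and URL are made up, and actually posting the comment (e.g. through Review Board's REST API) is omitted.

```python
def format_coverage_comment(build_url, coverage):
    """Render per-module line coverage as a plain-text review comment.

    `coverage` maps module name to line-coverage percentage; the values
    here are illustrative, not real Pepperdata data.
    """
    lines = ["Build: %s" % build_url, "", "Line coverage:"]
    for module, pct in sorted(coverage.items()):
        lines.append("  %-20s %5.1f%%" % (module, pct))
    return "\n".join(lines)


comment = format_coverage_comment(
    "https://jenkins.example.com/job/review-1234/",
    {"supervisor": 87.5, "collector": 92.1},
)
print(comment)
```

Putting the numbers directly in the review thread means reviewers never have to leave the review to judge whether the new tests actually cover the change.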
Happy refactoring!
At Pepperdata, we encourage people to keep the first version simple. Just focus on the basic functionality, rather than over-designing for the future. We suggest refactoring later. Why? Because we really can refactor later.
Code review leads to readable, clean code. Unit tests act as a specification of detailed behavior. Now you can make a sweeping refactoring change with a clear understanding. If you're right, the unit tests will concur; if not, they'll point to your mistakes, and you get to correct them right away.
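A toy illustration of that loop, with both versions made up for this sketch: the tests pin down the behavior, so a rewrite either preserves it or fails immediately.

```python
def moving_average_v1(values, window):
    """First version: straightforward, recomputes each window's sum."""
    out = []
    for i in range(len(values) - window + 1):
        out.append(sum(values[i:i + window]) / window)
    return out


def moving_average_v2(values, window):
    """Refactored version: keep a running sum, O(n) instead of O(n * window)."""
    out = []
    total = sum(values[:window])
    if len(values) >= window:
        out.append(total / window)
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        out.append(total / window)
    return out


# The same tests validate both versions; if the refactor changed
# behavior, they would point straight at the mistake.
for impl in (moving_average_v1, moving_average_v2):
    assert impl([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]
    assert impl([5], 1) == [5.0]
    assert impl([], 3) == []
```

The second version is faster and arguably cleaner, and because the tests already covered the edge cases (single element, empty input), the rewrite took a clear conscience rather than courage.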
Now, you can be fast up front and continue to be fast all the time!
Lessons learned:
Sounds easy, right? Not really. This process takes faith and team effort. Not a lot of companies get it right. We're happy that we have.
Some lessons we’ve learned:
- Don’t rush an individual change; take the time to make sure that the code is clean and to add meaningful unit tests. As the legendary UCLA basketball coach John Wooden said, “Be quick, but don’t hurry.”
- Make sure that reviews have short turnaround times. At Pepperdata, most changes start receiving comments within 90 minutes of submission.
- We keep review threads open to everyone. In our daily stand-up, we also share what each person is working on. This allows people to jump on relevant reviews even if they are not the designated reviewer. It also helps reviews be transparent and fair. We also encourage group discussions about changes as needed.
- When a new person joins the team, make sure she has a chance to see the benefits of the process. We let the new person write a few simple changes and send them for review. Almost always, she is pleasantly surprised by how easy it is to read and change our code. And most review comments are about making her own changes just as readable. Seeing the benefits right away, she adopts the process on her own.
At first, doing things right might seem to slow everyone down. And everyone needs to multitask a bit. But as a team, we move really fast because we don't get blocked by unreadable code.
What else?
We have other help. We put together a pretty good toolchain. We automate many things, so engineers don't get bored with repetitive work. We invest a lot in writing integration tests and release tests. (And that code has its own unit tests, which we code review. Of course!) We'll talk about some of the specifics in a future post.