Software Testing @ Facebook

This topic has always been of interest to testers who want to understand how some of the best companies perform testing in their organizations. Based on my research and what's available on the internet, here are some very interesting snippets. I will try to keep this updated as and when I hear more, and I also plan a series of posts covering the top few companies in the technology space.
We shall start with Facebook here:


  • Places huge emphasis on individual engineers making sure their changes are legit
  • Strong shame culture around making irresponsible changes to the site, the apps, etc. (see What are the roots of 'clowning' or 'clowny behavior' or 'clowntown' at Facebook? and, to a lesser but tone-setting extent, When people who work at Facebook say "clowntown", what do they mean?). If you do something really bad (e.g. take the site down, kill the egress of the site in a significant way), you may get a photo of yourself posted to an internal group wearing a clown nose.
  • Puts huge emphasis on "dogfooding" changes to the site for up to a week before general users see them. Every FB employee uses the site differently, which leads to surprisingly rich test coverage on its own.
  • Running a laundry list of automated PHP and JavaScript (via Jasmine) unit tests based on what stuff you're making changes to in the codebase.
  • Lint process that runs against all the changes an engineer is making. The lint process flags anti-patterns, known performance killers, bad style, and a lot more. Every change is linted, whether it's CSS, JS, or PHP. This prevents entire classes of bugs by looking for common bug causes like type coercion issues, for instance. It also helps prevent performance issues like box-shadow use in mobile browsers, which is a pretty easy way to kill the performance of big pages.
  • WebDriver is used to run site behavior tests like being able to post a status update or like a post. These tests help us make sure that changes that affect "glue code", which is pretty hard to unit test, don't cause major issues on the site.
  • Engineers can also use a metrics gathering framework that measures the performance impact of their changes prior to committing their changes to the code base. This framework (which is crazy bad ass btw) allows an engineer to understand what effects their changes have in terms of request latency, memcache time, processor time, render time, etc.
  • There is also a swath of testing done manually by groups of Facebook employees who follow test protocols. The results (or I should say issues) uncovered by this manual testing are aggregated and delivered to the teams responsible for them as part of a constant feedback/iteration loop.
  • Overall, the priorities are speed of testing, criticality (yes it's not a word meh meh meh) of what we test, and integrating testing into every place where test results might be affected or might guide decision making.
  • How about a war story instead? Read more at the links below.
  • Automation@ Facebook
    • For our PHP code, we have a suite of a few thousand test classes using the PHPUnit framework. They range in complexity from simple true unit tests to large-scale integration tests that hit our production backend services. The PHPUnit tests are run both by developers as part of their workflow and continuously by an automated test runner on dedicated hardware. Our developer tools automatically use code coverage data to run tests that cover the outstanding edits in a developer sandbox, and a report of test results is automatically included in our code review tool when a patch is submitted for review.
    • For browser-based testing of our Web code, we use the Watir framework. We have Watir tests covering a range of the site's functionality, particularly focused on privacy—there are tons of "user X posts item Y and it should/shouldn't be visible to user Z" tests at the browser level. (Those privacy rules are, of course, also tested at a lower level, but the privacy implementation being rock-solid is a critical priority and warrants redundant test coverage.)
    • In addition to the fully automated Watir tests, we have semi-automated tests that use Watir so humans can avoid the drudgery of filling out form fields and pressing buttons to get through UI flows, but can still examine what's going on and validate that things look reasonable.
    • We're starting to use JSSpec for unit-testing JavaScript code, though that's still in its early stages at this point.
    • For backend services, we use a variety of test frameworks depending on the specifics of the services. Projects that we release as open source use open-source frameworks like Boost's test classes or JUnit. Projects that will never be released to the outside world can use those, or can use an internally-developed C++ test framework that integrates tightly with our build system. A few projects use project-specific test harnesses. Most of the backend services are tied into a continuous integration / build system that constantly runs the test suites against the latest source code and reports the results into the results database and the notification system.
    • HipHop has a similar continuous-integration system with the added twist that it not only runs its own unit tests, but also runs all the PHPUnit tests. These results are compared with the results from the same PHP code base run under the plain PHP interpreter to detect any differences in behavior.
  • Our test infrastructure records results in a database and sends out email notifications on failure with developer-tunable sensitivity (e.g., you can choose to not get a notification unless a test fails continuously for some amount of time, or to be notified the instant a single failure happens.) The user interface for our test result browser is integrated with our bug/task tracking system, making it really easy to associate test failures with open tasks.
  • A significant fraction of tests are "push-blocking"—that is, a test failure is potential grounds for holding up a release (this is at the discretion of the release engineer who is pushing the code in question out to production, but that person is fully empowered to stop the presses if need be.) Blocking a push is taken very seriously since we pride ourselves on our fast release turnaround.
  • This YouTube video might also be interesting for you.
    Most of it is about (large-scale) testing.
    "Tools for Continuous Integration at Google Scale"

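The Automation section mentions developer tools that use code coverage data to run only the tests covering a developer's outstanding edits. Here is a minimal Python sketch of that idea. It is not Facebook's tooling: the coverage map, the file paths and the `select_tests` helper are all hypothetical, and it assumes per-test coverage has already been recorded as a mapping from test name to the files that test executed.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical coverage data: test name -> source files it executed on its last run.
COVERAGE: Dict[str, Set[str]] = {
    "PrivacyVisibilityTest": {"lib/privacy.php", "lib/feed.php"},
    "StatusUpdateTest": {"lib/feed.php", "lib/composer.php"},
    "PhotoUploadTest": {"lib/photos.php"},
}


def select_tests(changed_files: Iterable[str],
                 coverage: Dict[str, Set[str]]) -> List[str]:
    """Return, in sorted order, the tests whose coverage touches a changed file."""
    changed = set(changed_files)
    # A test is selected when its covered files intersect the edited files.
    return sorted(name for name, files in coverage.items() if files & changed)


# Editing lib/feed.php selects the two tests that exercise it.
print(select_tests(["lib/feed.php"], COVERAGE))
# -> ['PrivacyVisibilityTest', 'StatusUpdateTest']
```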

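The HipHop setup described above runs the same PHPUnit tests under two runtimes and compares the results to detect differences in behavior. A rough Python sketch of that comparison step follows. The result format (test name mapped to an outcome string) and the `diff_results` helper are assumptions made for illustration, not the actual system.

```python
from typing import Dict, List, Tuple


def diff_results(reference: Dict[str, str],
                 candidate: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """List (test, reference_outcome, candidate_outcome) for every disagreement.

    A test that appears in only one run is reported with the outcome "missing"
    on the other side, since that is also a behavioral difference.
    """
    mismatches = []
    for test in sorted(set(reference) | set(candidate)):
        ref = reference.get(test, "missing")
        cand = candidate.get(test, "missing")
        if ref != cand:
            mismatches.append((test, ref, cand))
    return mismatches


# Hypothetical outcomes from the plain interpreter (reference) and the
# compiled runtime (candidate).
interpreter_run = {"FeedTest": "pass", "PrivacyTest": "pass", "LegacyTest": "fail"}
compiled_run = {"FeedTest": "pass", "PrivacyTest": "fail", "LegacyTest": "fail"}

print(diff_results(interpreter_run, compiled_run))
# -> [('PrivacyTest', 'pass', 'fail')]
```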

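The "developer-tunable sensitivity" for failure notifications can be pictured as a small policy check: notify on the very first failure, or only once a test has been failing continuously for some amount of time. The sketch below is one way to express that in Python. The `NotificationPolicy` class, its `min_failing_seconds` field and the history format are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class NotificationPolicy:
    # Hypothetical setting: how long (in seconds) a test must have been
    # failing continuously before an email goes out. 0 means "notify on
    # the first failure".
    min_failing_seconds: float = 0.0


def should_notify(history: List[Tuple[float, bool]],
                  policy: NotificationPolicy) -> bool:
    """Decide whether to notify, given chronological (timestamp, passed) runs."""
    if not history or history[-1][1]:
        return False  # no runs yet, or the latest run passed
    # Walk backwards to find where the current failure streak began.
    streak_start = history[-1][0]
    for timestamp, passed in reversed(history):
        if passed:
            break
        streak_start = timestamp
    return history[-1][0] - streak_start >= policy.min_failing_seconds


# One pass, then failures at t=60, 120 and 180: the streak is 120 seconds long.
runs = [(0.0, True), (60.0, False), (120.0, False), (180.0, False)]
print(should_notify(runs, NotificationPolicy()))                         # -> True
print(should_notify(runs, NotificationPolicy(min_failing_seconds=300)))  # -> False
```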
Happy Testing until the next Testing@ post.

