State of web compatibility test automation at Mozilla

Originally published April 16th, 2015.

When testing the compatibility of web sites and browsers, there’s lots of potential for automation. We can automate some of the work related to:

  • Discovery of problems
  • Triage of bug reports dealing with problems
  • Regression testing to check if former issues have re-surfaced when a website is updated

We can for example automatically identify certain problems that will cause a site to get classified as failed/incompatible. These include at least:

  • -webkit- prefixed flexbox and background gradient properties without standard equivalents, in a selector that applies to any of the pages we visit
  • Redirects to different sites based on User-Agent
  • JavaScript errors thrown in a browser we care about that do not occur with a different engine
  • For mobile content: dependency on Flash or other plugins without equivalents (e.g. an OBJECT without a VIDEO tag)
  • If a page has a video, it must play without errors
  • WAP served with a WAP-specific MIME type
  • Invalid XHTML causing parse errors
  • Custom-written tests with a plug-in architecture, to check for e.g. buggy versions of specific scripts
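The first item in the list above can be sketched with a simple scan. This is a hypothetical illustration, not code from any of the prototypes: it flags a CSS rule body that uses -webkit- prefixed flexbox or gradient syntax without a standard equivalent in the same rule.

```python
import re

# Matches -webkit- prefixed flexbox display values and gradient functions.
PREFIXED = re.compile(
    r"display\s*:\s*-webkit-(box|flex)|-webkit-(linear|radial)-gradient"
)

def rule_needs_unprefixed(rule_body: str) -> bool:
    """Return True if a CSS rule body uses -webkit- flexbox/gradients
    but lacks the standard, unprefixed equivalent."""
    if not PREFIXED.search(rule_body):
        return False
    has_std_flex = re.search(r"display\s*:\s*flex", rule_body)
    # Lookbehind excludes the prefixed form, which is preceded by a hyphen.
    has_std_gradient = re.search(r"(?<!-)(linear|radial)-gradient", rule_body)
    return not (has_std_flex or has_std_gradient)
```

A real checker would parse stylesheets properly and only consider selectors that actually match elements on the visited pages, as the list above says.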

The same approach can be used to compare the behaviour of new versions and current releases, to have greater assurance that the update will not break the web and pinpoint risky regressions. Basically, we can use the Web (or at least a given subset of it) as our browser engine test suite.
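The version-comparison idea boils down to a set difference per URL. A minimal sketch, assuming each run produces a map from URL to the list of issues found there:

```python
def regressions(release_results: dict, candidate_results: dict) -> dict:
    """Map URL -> set of issues found by the candidate build
    that the current release did not exhibit."""
    out = {}
    for url, candidate_issues in candidate_results.items():
        new = set(candidate_issues) - set(release_results.get(url, ()))
        if new:
            out[url] = new
    return out
```

Issues that disappear in the candidate build are ignored here; only new failures are flagged as risky regressions.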

Current implementations (AKA prototypes)

Site compat tester extension


  • Ran as a Firefox extension
  • No longer developed
  • JSON-based description of tests (but with function expressions)
  • Regression testing only
  • Crude WAP detection



  • Runs tests with SlimerJS
  • Supports several test types for regression testing
    • WAP
    • Mixed content
    • Custom tests, bug-specific
  • Reads webcomptest JSON format.
  • Can track console errors.
  • Batch mode for regression tests (not as good for exploratory tests)
  • Can run both SlimerJS and PhantomJS (but not fully tested across both yet)
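The webcomptest JSON format is not documented in this post, so the shape below is a guess for illustration only: each test names a URL and a list of declarative checks to run against the served markup (the original format apparently also allowed function expressions).

```python
import json
import re

# Guessed test descriptor; field names are invented for this sketch.
test_json = """
{
  "url": "http://example.com/",
  "title": "example.com serves HTML, not WAP",
  "steps": [
    {"type": "regexp-absent", "pattern": "<wml>"},
    {"type": "regexp-present", "pattern": "<!DOCTYPE html>"}
  ]
}
"""

def run_test(test: dict, body: str) -> bool:
    """Apply each step to the page body; every step must pass."""
    for step in test["steps"]:
        found = re.search(step["pattern"], body, re.IGNORECASE)
        if step["type"] == "regexp-present" and not found:
            return False
        if step["type"] == "regexp-absent" and found:
            return False
    return True
```

In the real runner, SlimerJS or PhantomJS would fetch the page (optionally with a spoofed UA) and hand the body to checks like these.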


  • URL player based on Mozilla Marionette
  • Can spoof, load URLs, click, submit forms…
  • Generates, compares, splices screenshots that can be reviewed (example review).
  • Records differences between code sent to different UAs, generates webcomptest JSON format.
  • Generated tests are reviewed and used with slimertester.js for regression testing.
  • Limited support for interactivity with sites: has code for automated login to services.


  • Based on Mozilla Marionette, can control browser on device (e.g. Flame phone).
  • Sets up web server that accepts commands.
  • Used to sync browsing actions between laptop and device - the device loads the same URL, and clicks and scrolling are reproduced automatically (this depends on some JS injected into the page to monitor your actions and send commands to the Python script, plus a proxy to forward the commands if the script doesn’t have cross-origin privileges).
  • Can check for problems, e.g. be told to check if a given element exists
  • Struggles with frames/iframes.
  • No recording of the results (yet)
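The sync mechanism described above is essentially a tiny command protocol. Here is a hypothetical sketch of the dispatch side; the command names and fields are invented for illustration, and `browser` stands in for whatever Marionette-backed object drives the device browser:

```python
import json

def handle_command(raw: str, browser) -> str:
    """Parse one JSON command POSTed by the injected page JS and
    replay it in the device browser. Returns a JSON status reply."""
    cmd = json.loads(raw)
    action = cmd["action"]
    if action == "load":
        browser.load(cmd["url"])           # device loads the same URL
    elif action == "click":
        browser.click(cmd["selector"])     # replay a click on the same element
    elif action == "scroll":
        browser.scroll(cmd["x"], cmd["y"])
    else:
        return json.dumps({"ok": False, "error": "unknown action"})
    return json.dumps({"ok": True})
```

The real script wraps this in a small web server; frames/iframes are where this approach reportedly struggles, since the injected monitor only sees its own document.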


  • Based on Mozilla Marionette, helps with bug triage
  • Accepts bug search URL as input. Goes through each bug, launches URLs automatically on device.
  • Interacts with bug trackers - can generate comments and add screenshots.
  • Finds contact points.
  • Does HTTP header checking.
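The HTTP header checking mentioned above amounts to fetching the same URL with two User-Agent strings and diffing the interesting response headers. A minimal sketch (the set of watched headers is a plausible guess, not taken from the tool):

```python
# Headers where UA-dependent differences usually signal sniffing problems.
WATCHED = ("content-type", "location", "vary")

def header_diff(headers_a: dict, headers_b: dict) -> dict:
    """Return {header: (value_a, value_b)} for watched headers that differ
    between the responses served to two different User-Agents."""
    a = {k.lower(): v for k, v in headers_a.items()}
    b = {k.lower(): v for k, v in headers_b.items()}
    return {
        h: (a.get(h), b.get(h))
        for h in WATCHED
        if a.get(h) != b.get(h)
    }
```

A differing `Location` header, for example, is the UA-based redirect case from the problem list at the top of this post.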

Compatipede 1


  • Headless browser, based on GTK-Webkit. Runs only on *nix.
  • Batch operation over many URLs.
  • Plugin architecture makes it easy to add new “tests”.
  • Logic for finding CSS issues.
  • Resource scan feature tests included CSS and JS against a given regexp.
  • Somewhat “trigger-happy” in classifying sites as compat failures.
  • Logs results to MongoDB
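The plugin architecture that makes it "easy to add new tests" could look something like the registry below. This is a sketch in the spirit of Compatipede 1, not its actual API; the plugin name, page fields, and example check are all invented:

```python
PLUGINS = {}

def plugin(name):
    """Decorator registering a test function under a name."""
    def register(fn):
        PLUGINS[name] = fn
        return fn
    return register

@plugin("flash-dependency")
def flash_dependency(page):
    # Flag mobile pages that embed Flash without a VIDEO fallback.
    html = page.get("html", "")
    if "application/x-shockwave-flash" in html and "<video" not in html:
        return ["depends on Flash, no <video> fallback"]
    return []

def run_plugins(page):
    """Run every registered plugin; keep only non-empty findings."""
    return {name: fn(page) for name, fn in PLUGINS.items() if fn(page)}
```

Adding a new check is then just writing one decorated function; the batch runner calls `run_plugins` per URL and logs the findings (to MongoDB, in Compatipede 1's case).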

Compatipede 2

Repository: none yet

Compatipede 2 is under development. Based on SlimerJS and PhantomJS, it simplifies comparisons across browser engines - not just spoofing as another browser but actually rendering pages with that engine.


Repository: Script for identifying CSS issues and suggesting fixes. CSS logic here is probably more refined than in Compatipede 1 (and written in JS, whereas the logic in Compatipede 1 is in Python). Should be compared / reviewed - has received some feedback and bug fixes from Daniel Holbert.


The goal is to develop a service that includes many of the best features from those prototypes.

Primary features

  • Run minimum two distinct browser engines, default to Gecko and WebKit (prototyped in Compatipede 2).
  • Define what binaries to use for both engines, enabling comparisons of Gecko.current and (unknown).
  • Set UA string (affecting both navigator.userAgent and HTTP) (, Compatipede 1, Compatipede 2, slimertester.js)
    • Set UA string separately per engine
  • Run explorative tests from a list of URLs including
    • comparisons of HTTP headers / redirects (from Compatipede 1)
    • analysis of applied CSS (from Compatipede 1, css-fixme.htm)
    • logging and analysis of JS errors (rudimentary support in slimertester.js, no comparisons)
      • Ideally both those logged to the console and those caught by the page in try..catch.
  • Run regression tests described in the JSON(-like) format used by and slimertester.js.
  • Take screenshots (, Compatipede 2)
  • Enable easily adding new tests or statistics through a “plugin” architecture (Compatipede 1)
  • Resource scan (Compatipede 1)
  • Logging results to database (Compatipede 1)

Secondary features

  • Log existence of OBJECT, EMBED, AUDIO and VIDEO tags (none, but trivial via plugin APIs like Compatipede 1)
  • Discover WAP and XHTML MIME types and flag sites that send these to one UA but not another
  • Log in to sites automatically (
  • Screenshot comparison, flagging those with greater differences (
  • Write JSON files that can be used for regression testing. (
  • Look for contact points on web sites, e.g. direct links to “contact us” forms (
  • Bug search mode - give a link to bug tracker, it will scan all URLs in those bugs (
  • Tagging bugs automatically - for example to set “serversniff” and “contactready” in whiteboard when HTTP redirects differ (None)
  • Suggesting bug comments - for human review/cut’n’paste? ( - to some extent)
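The WAP/XHTML MIME-type flagging in the secondary list could be as simple as this sketch. The list of legacy mobile types and the per-UA input shape are assumptions for illustration:

```python
# Content types that suggest a legacy mobile (WAP/XHTML) code path.
LEGACY_MOBILE_TYPES = (
    "text/vnd.wap.wml",
    "application/vnd.wap.xhtml+xml",
    "application/xhtml+xml",
)

def flags_for(content_type_by_ua: dict) -> list:
    """content_type_by_ua maps UA name -> Content-Type received.
    Return the UAs served a legacy type, but only when other UAs were not -
    uniform behaviour is not a sniffing problem."""
    legacy = {
        ua: ct for ua, ct in content_type_by_ua.items()
        if any(ct.startswith(t) for t in LEGACY_MOBILE_TYPES)
    }
    if legacy and len(legacy) < len(content_type_by_ua):
        return sorted(legacy)
    return []
```

A flagged UA here would be a candidate for the automatic "serversniff" whiteboard tagging mentioned above.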

Given that most of these features already exist in various scripts that are useful prototypes for the final “Mozilla Compatipede” (or whatever we end up calling the project), it doesn’t seem overly ambitious to pull them together, refine them and create a really useful tool. However, there’s one more piece of the puzzle to consider - and it’s one we haven’t gotten right so far. Let’s call it…

Data usability

Some of our past efforts (with Compatipede 1 as perhaps the best example) failed because it’s easy to generate a lot of data, but hard to present parts of it in a way that’s useful and a context that’s relevant. Compatipede 1 can scan thousands of sites and generate megabytes of statistics. Our next-gen service will be even better at generating data. To make this useful, we need to pay considerable attention to the data presentation and data usability problems.

We should develop a service that, based on a couple of inputs like a host name and an optional User-Agent / browser name, returns known information (test results/statistics, links to screenshots). More importantly, we should develop an extension that will modify bug tracker issue pages, vet the information carefully, and present the most relevant parts of it (differences between engines, screenshots, contact URLs) right there in the bug. We use the bug trackers all the time - having carefully selected, relevant information presented right there for cut-and-paste into comments and analysis is going to make the data on sites with known bugs most useful. (I have written an experimental extension earlier, something more powerful and polished than that would work.)

Secondly, we need a tool similar to (likely based on) the screenshot review UI, but including all the information that indicates there is a problem on a web site we don’t have bugs for. Reviewers will mark each difference as “not a problem”, link it to an existing bug, or report a new issue.

(I used the surprisingly nice StackEdit markdown editor to draft this article. It deserves a link.)

