Michael Stapelberg’s Debian Blog
Debian Code Search: improving client-side latency (2016-08-08)
A while ago, it occurred to me that querying Debian Code Search seemed slow, which surprised me because I previously spent quite some effort on making it faster, see Debian Code Search Instant and Taming the latency tail for the most recent substantial architecture overhaul and related optimizations.
Upon taking a closer look, I realized that while performing the search query on the server side was pretty fast, the perceived slowness was due to the client side being slow. By “being slow”, I mean that it took a long time until something was drawn on the screen (high latency) and that what was happening on the screen was janky (stuttering, not smooth).
After this bit of original investigation, I opened GitHub issue #69 to track the work on making Debian Code Search faster. In that issue, I captured how Chrome’s network inspector visualizes the work necessary to render the page:
A couple of quick wins
There are a couple of little fixes and improvements on which I’m not going to spend too much time on, but which I list for completeness anyway just in case they come in handy for a similar project of yours:
- Remove css @import, unnecessary because we minify the two stylesheets together already.
- Compress assets using the gzip-compatible Zöpfli algorithm (saves 3KB).
- Run SVG files through the
svgooptimizer (saves 8KB).
- Use highly compressed/post-processed woff/woff2 webfont versions (gains such as 57KB→14KB for Roboto-Bold).
- Enabled HTTP/2 in nginx — not a win by itself, but avoids further round trips in combination with the EventSource API (see below).
All CSS which is required for the initial page rendering is now inlined in the responses, allowing the browser to immediately render a response without requiring any additional round trips.
All non-essential CSS has been moved into a separate CSS file which is loaded asynchronously. This is done using a pattern like
<link rel="preload" href="foo.css" as="style" onload="this.rel='stylesheet'"> , see also filamentgroup/loadCSS .
We switched from WebSockets to the EventSource API because the former is not compatible with HTTP/2, whereas the latter is. This removes a round trip and some custom code for WebSocket reconnecting, because EventSource does that for you.
The progress bar animation used to animate the
background-position property. It turns out that browsers can only animate the
opacity properties smoothly, because such animations can be off-loaded to the GPU. Hence, we have re-implemented the progress bar animation using the
The biggest win for improving client-side latency from the Chrome address bar was introducing Service Workers (see commit 7f31aef402cb782056e290a797f224171f4af270 ). Our Service Worker caches static assets and a placeholder results page. The placeholder page is presented immediately when you start a search (e.g. from the address bar), making the first response immediate, i.e. rendered within 100ms. Having assets and the result page out of the way, the first round trip is used for actually doing the search, removing all unnecessary overhead.
With all of these improvements in place, rendering latency goes down from half a second to well under 100 ms, and this is what the Chrome network inspector looks like: