Saturday, June 28, 2014

#LatencyTipOfTheDay: Median Server Response Time: The number that 99.9999999999% of page views can be worse than.

The math is simple:

The median server response time (MSRT) is measured per request. Pages have many requests.

% of page views that will see at least one response worse than the MSRT = (1 - (0.5 ^ N)) * 100%.

Where N is the number of [resource requests / objects / HTTP GETs] per page.

Plug in 42, and you get:

(1 - (0.5 ^ 42)) * 100% = 99.999999999977%
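For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same formula using the standard library's decimal module. The function name `pct_worse_than_median` and the `digits` parameter are made up for illustration; the only assumption is the one the formula already makes, that each request independently has a 50% chance of landing above the median.

```python
from decimal import Decimal, localcontext


def pct_worse_than_median(n_requests: int, digits: int = 200) -> Decimal:
    """Percent of page views seeing at least one response worse than the median.

    Implements (1 - 0.5 ^ N) * 100, assuming every request independently has
    a 50% chance of being slower than the median. `digits` is the working
    precision, set high enough that the result is exact for a few hundred
    requests per page.
    """
    if n_requests < 1:
        raise ValueError("a page needs at least one request")
    with localcontext() as ctx:
        ctx.prec = digits
        return (1 - Decimal("0.5") ** n_requests) * 100


if __name__ == "__main__":
    for n in (1, 5, 42, 100):
        print(n, pct_worse_than_median(n))
```

Because the arithmetic is done in decimal rather than binary floating point, the result does not round up to a flat 100% as N grows.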

Why 42?

Well, beyond that being the obvious answer, it also happens to be the average number of resource requests per page across top sites according to Google's 4-year-old stats, collected across billions of web pages at the time [1]. Things have changed since then, and the current number is much higher, but with anything over 50 resource requests per page, which is pretty much everything these days, both my calculator and Excel overflow with too many 9s and say it's basically 100%. Since I figured 12 9s makes the point well enough, I didn't bother trying a big decimal calculator to compute the infinitesimal chance of someone actually seeing the median or better server response times for a page load in 2014...

[1] Sreeram Ramachandran: "Web metrics: Size and number of resources", May 2010.

Discussion Note:

It's been noted by a few people that this calculation assumes there is no strong time correlation between bad or good results, which is absolutely true. The calculation is valid if every request has an even chance of experiencing a larger-than-median result regardless of what previous requests have seen. A strong time correlation would decrease the number of pages that see worse-than-median results (down to a theoretical 50% in "every response in hour 1 was faster than every response in hour 2" situations). Similarly, a strong time anti-correlation would increase the number of pages that see a worse-than-median result, up to a theoretical 100% (e.g. when every two consecutive response times lie on opposite sides of the median).

So in reality, if there is some time correlation involved, my number of 9's may be exaggerated. Instead of 99.9999999999% of page views experiencing a response time worse than the median server response time, maybe it's "only" 99.9% of page views that are really that bad. ;-)
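To get a feel for how much correlation matters, here is a rough Python sketch under a deliberately simple assumption that is mine, not something measured: each request lands on the same side of the median as the previous request on that page with probability `stickiness` (a hypothetical knob). A value of 0.5 is the independent case, 1.0 is perfect correlation, and 0.0 is perfect anti-correlation. Under that toy model the chance that a page sees no worse-than-median response is 0.5 * stickiness ^ (N - 1), and a small simulation cross-checks the closed form.

```python
import random


def fraction_worse_than_median(n_requests: int, stickiness: float = 0.5) -> float:
    """Fraction of pages seeing at least one worse-than-median response.

    Toy model (an assumption, not a measurement): the first request is above
    the median with probability 0.5, and each later request stays on the same
    side of the median as the previous one with probability `stickiness`.
    """
    if n_requests < 1:
        raise ValueError("a page needs at least one request")
    return 1.0 - 0.5 * stickiness ** (n_requests - 1)


def simulate(n_requests: int, stickiness: float,
             pages: int = 100_000, seed: int = 1) -> float:
    """Monte Carlo estimate of the same fraction under the same toy model."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(pages):
        worse = rng.random() < 0.5           # first request: coin flip
        seen_worse = worse
        for _ in range(n_requests - 1):
            if rng.random() >= stickiness:   # switch sides of the median
                worse = not worse
            seen_worse = seen_worse or worse
        hits += seen_worse
    return hits / pages


if __name__ == "__main__":
    for s in (0.0, 0.5, 0.9, 0.99, 1.0):
        print(s, fraction_worse_than_median(42, s), simulate(42, s, pages=20_000))
```

With `stickiness` at 0.5 this reduces to the original 1 - 0.5 ^ N, and the two extremes reproduce the theoretical 50% and 100% bounds described above.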

Without establishing actual time-correlation or anti-correlation information, the best you can do is act on the basic information at hand. And the only thing we know about the median in most systems (on its own, with no other measured information) is that the chance of seeing a number above it is 50%.
