Drawing Blanks

Premature Optimization is a Prerequisite for Success

Distribution of GPS accuracy

The power of accurate observation is commonly called cynicism by those who have not got it.

George Bernard Shaw.

When we measure something, we expect the errors to be normally distributed, because we expect the causes of the errors to be independent and to have an additive effect on the measurement.

When we measure the two coordinates of a position, we may naively assume that we are determining the latitude independently from the longitude, and expect a normal distribution for each coordinate. The resulting distribution of the distance from the true position would be the distribution of $r=\sqrt{x^2+y^2}$, where $x$ and $y$ are normal. This is called a “2D-normal” or “circular gaussian”; the distribution of the distance $r$ itself is known as the Rayleigh distribution. Here’s the shape of this distribution (mean=0, sd=1 for each coordinate):

It is instructive to see that the most likely distance from the true position is not zero: the distribution peaks at about 1 standard deviation (per coordinate) away from it.
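To get a feel for this without the plot, here is a minimal simulation sketch. It assumes exactly the naive model above (two independent normal coordinates with mean 0 and sd 1); the function names, sample size, and bin width are my own choices for illustration, not anything a GPS receiver does.

```python
import math
import random
import statistics


def simulate_distances(n, sd=1.0, seed=0):
    """Draw n positions with independent normal x and y errors and
    return each position's distance from the true position (the origin)."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(0.0, sd), rng.gauss(0.0, sd)) for _ in range(n)]


def histogram_mode(values, bin_width=0.1):
    """Return the centre of the most populated histogram bin."""
    counts = {}
    for v in values:
        b = int(v / bin_width)
        counts[b] = counts.get(b, 0) + 1
    best = max(counts, key=counts.get)
    return (best + 0.5) * bin_width


distances = simulate_distances(200_000)
mode = histogram_mode(distances)
mean = statistics.fmean(distances)
print(f"peak of the histogram near {mode:.2f} sd, mean distance {mean:.2f} sd")
```

The histogram peak should land close to 1 sd, while the mean distance is somewhat larger because of the long right tail.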

Even though the assumption that the coordinates are determined independently is naïve, this distribution is actually observed experimentally: http://users.erols.com/dlwilson/gpsacc.htm

Now, what about the “accuracy” value that your GPS receiver reports? It cannot know your true position, so it can only report an estimated error. How is it estimated? It uses the Dilution Of Precision (DOP) as the main component of accuracy. If you read the definition of DOP you’ll see that it’s in fact the volume of a 3D body. If we assume (naively) that the linear dimensions of that body are normally distributed, we should expect the DOP distribution to be log-normal. We expect the log-normal distribution to arise whenever the causes have a multiplicative effect on the outcome. Here’s the shape of the log-normal distribution:
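The “multiplicative causes give log-normal” idea can be checked with a small sketch. This is a generic illustration, not a model of DOP: the number of factors (30) and their range (uniform between 0.5 and 1.5) are arbitrary assumptions of mine. Taking the logarithm turns the product into a sum, so the logs should look roughly normal (symmetric), while the raw products are strongly right-skewed.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = math.sqrt(statistics.fmean((v - m) ** 2 for v in values))
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


def multiplicative_outcomes(n, factors=30, seed=0):
    """Each outcome is the product of several independent positive factors.
    The factor distribution here is an arbitrary assumption for illustration."""
    rng = random.Random(seed)
    return [
        math.prod(rng.uniform(0.5, 1.5) for _ in range(factors))
        for _ in range(n)
    ]


raw = multiplicative_outcomes(20_000)
logs = [math.log(v) for v in raw]

print("mean / median of raw outcomes:", statistics.fmean(raw) / statistics.median(raw))
print("skewness of log(outcomes):    ", skewness(logs))
```

A mean well above the median is the signature of the long right tail; the near-zero skewness of the logs is what you would expect if the outcomes are close to log-normal.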

And here is the actual distribution of GPS accuracy reported by cellphone GPS receivers (20,000 observations):

Does this look closer to the 2D-gauss or to the log-normal?

There is a simple tool for comparing distributions that is called a Q-Q plot (quantile-quantile plot). Let’s see how the QQ plots look…

The 2D-gauss is not a good fit…

And the log-normal is much, much better!
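If you want to repeat this comparison on your own data, here is a rough sketch of the same idea in numbers instead of pictures. It is not the code behind the plots above. It sorts the samples, pairs them with theoretical quantiles of each candidate distribution, and reports how close each Q-Q plot is to a straight line (the correlation coefficient). Two assumptions of mine: the comparison is done on a log scale so that the long right tail does not dominate, and the input is synthetic log-normal data standing in for real accuracy readings, which I cannot reproduce here.

```python
import math
import random
from statistics import NormalDist, fmean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def qq_correlations(samples):
    """Return (r_2d_gauss, r_lognormal): straightness of the Q-Q plot of
    log(samples) against each candidate distribution's log-quantiles."""
    logs = sorted(math.log(s) for s in samples)
    n = len(logs)
    probs = [(i + 0.5) / n for i in range(n)]  # plotting positions in (0, 1)
    std_normal = NormalDist()
    # log of a log-normal variable is normal; location and scale do not
    # affect the correlation, so standard normal quantiles are enough.
    lognormal_q = [std_normal.inv_cdf(p) for p in probs]
    # Distance for the 2D-gauss has quantile sqrt(-2 ln(1 - p)) (times a scale).
    gauss2d_q = [0.5 * math.log(-2.0 * math.log(1.0 - p)) for p in probs]
    return pearson(gauss2d_q, logs), pearson(lognormal_q, logs)


# Synthetic stand-in for reported accuracy values (an assumption, not real data).
rng = random.Random(1)
synthetic = [rng.lognormvariate(math.log(10.0), 0.5) for _ in range(5000)]
r_gauss2d, r_lognormal = qq_correlations(synthetic)
print(f"2D-gauss fit: r = {r_gauss2d:.4f}, log-normal fit: r = {r_lognormal:.4f}")
```

The closer r is to 1, the straighter the Q-Q plot, so the candidate with the higher r is the better fit.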

So the GPS accuracy reported by the cell phone receiver is distributed close to log-normally. Basically, in most cases, the accuracy value is the dilution of precision multiplied by a constant (the estimated range accuracy for the receiver, typically 6 meters or so).

Disclaimer: this analysis is very inaccurate. The measurements in my sample include some taken when the GPS was forced to yield a fix before it had finished acquiring satellites. That resulted in outliers with very low accuracy values.