October 11, 2007

Comparing open source to Google Maps

This seems to me to be a significant question that needs to be answered (perhaps on a continual basis):

Can open source, crowdsourced maps such as OpenStreetMap compete with official corporate efforts such as Google Maps?

Let’s go back a little while and remember the stages that another, archetypal, example of crowdsourcing has gone through: Wikipedia. In terms of sheer content Wikipedia cannot be challenged. Here is a rough approximation of what it would take to print it out:

(This estimate is already a year old. The Encyclopedia Britannica is what, 12 vols? 24 vols? See update below.)

But coverage is only one part of it; many people worried about the quality of the entries. So Nature went in and compared Wikipedia to the EB by asking a number of scientists to read entries in both and count the errors. Their conclusion? The error rates were similar (yes, EB has errors!). There’s a summary here on Wikipedia itself. EB protested to the extent of taking out a half-page ad in a British newspaper. Nature released more methodological details and re-examined their data. The result? They stood by their findings.

(The original article can be found here, subscription req’d.)

Charlie Sun recently had an insightful blog entry about all this as it applies to geospatial data. He concluded that the only thing preventing open source from competing is infrastructure (not data nor coverage).

The OpenStreetMap people have also done comparisons, and the first thing that leaps out is that, like the Encyclopedia Britannica, Google Maps contains errors. To wit (where each marker indicates an error):

It is also possible to compare OSM directly with GM:

(h/t: These last two images are taken from an interesting post at Mapperz.)

Why compare OSM to GM? For the same reason Wikipedia was compared to EB: it is a standard that people recognize and trust. If OSM turns out to have the same error rate as GM (this is still an open question), it would show that spatial data gathered through crowdsourcing, by guys riding around on bikes, is equivalent to data gathered by commercial companies hiring vans to drive around those same areas.

In both cases it is advisable to know where your data came from!

Update: I knew this image reminded me of something. The Pioneer plaque:


