Saturday, January 7, 2017

Accuracy vs. Precision in the M.A.P.



"You're traveling through another dimension, 
a dimension not only of sight and sound but of mind. 
A journey into a wondrous land whose boundaries 
are that of imagination. That's the signpost up 
ahead - your next stop, the Twilight Zone."
Rod Serling


In a recent post I talked a little about how precise ideal lat/lon pairs are.  One thing I said was that a measurement to one one-millionth of a degree resolves to about four inches in latitude and longitude (4.4 in. in latitude equals 11.176 cm).  Any system that provides measurements like that is said to have a precision of one one-millionth of a degree.  In that discussion I ignored the difference between accuracy and precision.  The lat/lon pairs I provide from Google Earth have a precision of one one-millionth of a degree.  But there’s the additional question of Google’s accuracy.  Imagine that Google Earth was a giant machine for producing lat/lon pairs.  How accurate are those pairs when they come out of the machine? How close do these pairs come to an idealized model of the earth?  Does the accuracy of Google's numbers match their precision of representation? No.  It appears that it doesn't.

This is the same as asking how well Google has fitted its source photography to an ideal representation of the earth.  Do the photographs match the ideal grid of the earth itself?  This is a question worth asking because many operations have to be performed on the aerial photography before it’s presented on-line.

For a brief tour of orthorectification issues see this. In this study the authors checked how well Google's lat/lon pairs matched up with lat/lon pairs from a verified data set. The whole article is worth reading. Their conclusion?

"Using accurate field and photogrammetric measurements (extracted from a cadastral database) as the reference dataset and comparing them against well-defined and inferred locations (CPs) in GE’s medium and high resolution imagery, the estimated horizontal positional accuracy of GE’s imagery over rural areas (5.0 m RMSEr) was found to meet the horizontal accuracy requirements of the ASPRS (1990) for the production of “Class 1” 1:20,000 maps. "[1]

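The RMSEr is the root-mean-square of the radial (horizontal) errors and, loosely, it plays the role of a standard deviation.  So you could, as a rule of thumb, think of 5 m. as the standard deviation of Google's positioning error.  This would mean that roughly two-thirds of the time the Google lat/lon pair is within 5 m. of the actual position of the sought-after object and roughly one-third of the time it's further away than 5 m.  (The familiar 68% figure belongs to a one-dimensional normal error; for a radial error with a circular normal distribution the share inside one RMSEr is nearer 63%.)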
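For readers who want to see what goes into a figure like 5.0 m RMSEr, here is a minimal sketch of the usual radial RMSE calculation. The offsets below are made up for illustration; they are not taken from the study, and the function name `rmse_r` is my own.

```python
import math

def rmse_r(offsets):
    """Radial RMSE (meters) for a list of (east, north) offsets in meters."""
    if not offsets:
        raise ValueError("need at least one offset")
    # Sum of squared radial errors: dx^2 + dy^2 for each check point.
    total = sum(dx * dx + dy * dy for dx, dy in offsets)
    return math.sqrt(total / len(offsets))

# Hypothetical offsets between image positions and reference positions.
sample_offsets = [(3.0, 4.0), (-3.0, 4.0), (0.0, -5.0), (5.0, 0.0)]
print(round(rmse_r(sample_offsets), 2))
```

Every made-up check point here is exactly 5 m from its reference position, so the result is 5.0; real data would of course be a mix of smaller and larger misses.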

But the authors also add some cautions:

"However, the results also suggest that this accuracy requirement might not be met for rural areas if coordinates are extracted only from GE’s medium resolution imagery or from imagery collected before 2008. Furthermore, despite the results presented here, GE’s imagery should be used with caution due to the presence of large georegistration errors in both GE’s medium and high resolution imagery."[2]

In other words we are being warned against actual positioning and alignment errors in Google Earth's images. This can be easily seen if you pick a specific feature on an image and then drop a marker on it for each of Google's available images at that location. Let's look at an example. Here we have a church in Messenia called the Panagia (37.033444°, 21.737154°). If you look into the field across the road you see a circular field feature (I think that it's a well).


I brought up the 'Show historical imagery' slider and marked that well on each layer. The result was this:

Positions of a field feature based on four different available images in Google Earth
Here we see that the field feature (along with everything else) appears to drift and to be associated with different lat/lon pairs depending on the date of the image. The radius of the circle which includes them all is 11.32 meters. So, there's some surprising drift in Google's image alignments. Not fatal but something to take account of.
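If you record the marker positions yourself, the amount of drift can be checked with a few lines of code. This is only a sketch: the four marker coordinates are invented (they are not the ones in the image above), the earth is treated as a sphere, and the function names are my own. It reports the largest marker-to-marker distance, which is a simpler quantity than the radius of an enclosing circle.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6371000.0  # mean radius; spherical approximation (assumption)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def max_spread_m(markers):
    """Largest distance in meters between any two markers."""
    return max(haversine_m(*p, *q) for p, q in combinations(markers, 2))

# Hypothetical positions of one feature as marked on four image dates.
markers = [
    (37.033300, 21.737500),
    (37.033350, 21.737560),
    (37.033260, 21.737610),
    (37.033390, 21.737480),
]
spread = max_spread_m(markers)
print(f"largest marker-to-marker distance: {spread:.1f} m")
```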


And just to emphasize what's going on I also show just the image from the May 20, 2003 plate:

Here the entire image has 'drifted' under the markers (which are fixed) until the
5/20/2003 marker is over the field feature.  Notice the displacement of the 'Panagia' label
which should be over the right-side building on the upper left.  This label is displaced nearly
22 m. from where it started.



So there are several potential sources of error in my DB lat/lon.  The first is the degree of Google’s fidelity to an underlying model of the earth's surface; the second consists of Google's alignment of its images. I should just wrap this up by saying that even though GE provides measurements with a precision of 10^(-6) or one one-millionth of a degree (i.e. about 4 inches) the accuracy it provides is, perhaps, a little better than 10^(-4) or one ten-thousandth of a degree which, measured east-west at 37 degrees north latitude, is about 353 inches (8.96 meters). And this does not take into account imagery offsets.
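The conversions from fractions of a degree to distance on the ground can be reproduced with a short calculation. This sketch assumes a spherical earth with a mean radius of 6,371 km, so its east-west figure at 37 degrees north comes out close to, but not exactly at, the 8.96 meters quoted above. The function name is my own.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean radius; spherical approximation (assumption)
METERS_PER_INCH = 0.0254

def degree_step_m(step_deg, lat_deg):
    """Return (north-south, east-west) ground distance in meters for a step in degrees."""
    per_degree = math.pi * EARTH_RADIUS_M / 180.0
    ns = step_deg * per_degree
    # A degree of longitude shrinks with the cosine of the latitude.
    ew = ns * math.cos(math.radians(lat_deg))
    return ns, ew

for step in (1e-6, 1e-4):
    ns, ew = degree_step_m(step, 37.0)
    print(f"{step:g} deg at 37 N: "
          f"{ns:.3f} m N-S ({ns / METERS_PER_INCH:.1f} in), "
          f"{ew:.3f} m E-W ({ew / METERS_PER_INCH:.1f} in)")
```

The point of printing both directions is that a step in longitude covers less ground than the same step in latitude everywhere except at the equator.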

The third kind of error is the error I introduce when I choose a lat/lon pair to represent a gazetteer entry.

For my general concept of my own 'Introduced' error let's say that we were looking for the field feature mentioned above (the well) and I had only this (entirely made-up) written description as to its whereabouts:

"The church of the Panagia is a kilometer or so to the northeast of the town of Myrsinochori. About 20 or 30 m. to the east of the driveway leading to the church there is a field feature which consists of a stone circle. It is about 10 m. south of the road ..."

Now, given that I could find the Church of the Panagia at all (it is 1200+ m. in a straight line from the northeast edge of Myrsinochori to the church and nearer 1500 m. by road) I would proceed to follow the directions and mark my field feature in good faith like this (I'm pretending that I can't actually see the feature in GE):



In this map the 'H Line' is 25 m. in length (halfway between 'twenty or thirty' meters).  The 'V Line' is 10 m. in length.  After having drawn both those lines (and, I emphasize, based on the written description) I would put a marker at the S end of the vertical line and advertise the lat/lon pair of that push pin as the location of the sought-for feature.  If I had really proceeded like this I might very well feel that my mark was within ten meters of the real feature.  And I would associate my lat/lon pair (at the yellow push-pin) with an introduced error term of ten meters.  In this case I drew a circle with a ten meter radius centered at the push pin.  This circle does, indeed, touch the field feature I'm trying to mark. 
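The same construction can be done numerically instead of by drawing lines. The sketch below shifts a starting point 25 m east and 10 m south. The true starting point would be where the driveway meets the road, which I have not recorded here, so the church coordinates given earlier stand in for it purely as an illustration; the function name and the flat-earth approximation are also my own assumptions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean radius; spherical approximation (assumption)

def offset_latlon(lat_deg, lon_deg, east_m, north_m):
    """Shift a lat/lon point by small east/north distances in meters.

    Uses a local flat-earth approximation, reasonable for tens of meters.
    """
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(
        east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return lat_deg + dlat, lon_deg + dlon

# Stand-in starting point: the church coordinates, NOT the real driveway junction.
start = (37.033444, 21.737154)
lat, lon = offset_latlon(*start, east_m=25.0, north_m=-10.0)
print(f"{lat:.6f}, {lon:.6f}")
```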

This introduced error term is intended to reflect how well I think I’ve located the object of interest.  I have defined this error term as the radius of a circle, in meters, centered exactly at my lat/lon pair and which covers some part of the sought-for feature.  For example, if the feature can actually be seen in GE then I mark the feature and set the error term to zero.  In that case the "real" error is reduced simply to Google’s accuracy at that point. Given that the features are of various types a non-zero introduced error radius has several possible meanings.  If my introduced error term is ten then finding yourself exactly at my lat/lon pair means that you are distant from the object by, at most, 10 meters plus Google’s error.  If Google’s error term is 5 meters then, in the worst case, you are fifteen meters away from the goal.  At best the two errors would offset and you would be 5 meters from your goal.
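The arithmetic of the last few sentences can be written down in a couple of lines. This sketch treats both error terms as fixed displacements that either line up (worst case) or point in opposite directions (best case), which is the simplification used in the paragraph above; the function name is my own.

```python
def distance_bounds(introduced_m, imagery_m):
    """Best- and worst-case distance (meters) from a marker to the true feature."""
    worst = introduced_m + imagery_m      # errors point the same way
    best = abs(introduced_m - imagery_m)  # errors offset each other
    return best, worst

print(distance_bounds(10.0, 5.0))  # (5.0, 15.0)
```

With an introduced error of zero (the feature is visible in GE) both bounds collapse to Google's own error.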

If I can’t see the feature but the description is constrained in some way, a cave opening or a narrow hill- or ridge-top, then I set the introduced error radius to 10, 20, or perhaps even fifty meters.  When the directions available to me are imprecise but I know generally where the feature “ought to be” then I’ll set the error radius to one hundred or two hundred meters. A feature on a "hill-side" would be the classic example.  Sometimes directions to a small site or find are described as being in a certain town.  In such a case, and with no additional info, I will put a marker on the town but set the error term to ‘N’ or ‘unknown’.   I hope it's clear, from the foregoing, that these introduced error radii are subjective only.  They are merely my opinion about how well I did after taking everything into account.  Of course, they’re not fixed in stone, either.  If I rethink an area or if I receive more accurate information from someone who’s been there then the error term can be driven to zero.  In that sense they’re simply a progress report of accuracy; ultimately my introduced error radii should all be driven to 0.

Remember that even small introduced error radii can specify very large areas.  An error radius of ten meters describes a circle with an area of 314 square meters or about 3380 square feet which is about half the size of the average house lot.  In my DB that’s the best non-zero case.  A 20 meter radius specifies a circle with an area of about 1257 sq m. or roughly 13,500 square feet.  This is about the size of two average house lots.  A 30 meter radius specifies a circle of about 2827 sq m.  A fifty m radius a circle of about 7854 sq m.  If the error radius is 100 m then you should imagine a circle with the radius of a football field.  On rocky and rugged terrain (not unknown in Greece) such an object is still lost.  On flat terrain (the golf course at Pylos comes to mind) such a radius might be feasible.  Much larger than that and you should consider that the object is still not found in any useful field sense and you’ll want to do additional research before going out to the field.
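The areas in this paragraph are just the area of a circle, pi times the radius squared, for each error radius. A short sketch for checking them or for trying other radii (the conversion factor to square feet is rounded and the function name is my own):

```python
import math

SQFT_PER_SQM = 10.7639  # square feet in one square meter (rounded)

def circle_area_sqm(radius_m):
    """Area in square meters of a circle with the given radius in meters."""
    return math.pi * radius_m ** 2

for r in (10, 20, 30, 50, 100):
    area = circle_area_sqm(r)
    print(f"radius {r:>3} m: {area:8.0f} sq m, {area * SQFT_PER_SQM:8.0f} sq ft")
```

Because the area grows with the square of the radius, doubling an error radius quadruples the ground you would have to search.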

Finding something successfully also depends on what you’re looking for.  There’s a huge difference between looking for a plainly visible hilltop fort on the one hand and, on the other, some area where, long ago, some researcher found a single sherd.  In the first case you may have sloppy and inaccurate directions but that makes no difference because you can see the feature from a kilometer away.  In the second case you may have directions that are accurate and precise; you may reach the exact spot and stand exactly in Richard Hope Simpson’s footprints and still not be confident that you have found the right place because, on the day you’re there, no sherds are visible.  In that case the error term takes on the subtle meaning of extent.  It indicates over how much area I think a reported sherd scatter should extend.  An introduced error term can also be interpreted as a degree of confidence. It can designate the area where I'm most confident of finding the feature but, granted, the desired feature may still be outside the circle.

This raises the question about what my lat/lon pairs are intended to facilitate, anyway.  What are they for?  First of all I hope that they can be of some assistance to students who are reading about the Mycenaean sites and have no prior familiarity with where those sites are.  I hope, also, that this DB can be of help to researchers that are planning to go into the field.  But it’s more than that.  My very strong feeling is that, in the field, and no matter what you find, whether it’s a worn, barely recognizable sherd or a palace complex, Datum One is where the object was found, exactly.    Why is location so important when generations of archaeologists have supposed it to be unimportant?  Location is important because only that can relate your specific find to everything else.  For example, how far is it on the average from a BA habitation to a water source?  What’s the standard deviation of that distance?  What’s the average elevation of a BA settlement, tholos, chamber tomb?  Is the average habitation above or below the average BA cemetery?  What’s the average distance from a habitation to its associated cemetery (when such an association can be determined)?  How many BA habitations do we know that were within 100 m of the ocean?  500 m?  1000 m?  Did the Mycenaeans live in the mountains?  What proportion of BA habitations were obviously maritime in orientation or were not so oriented?  How many habitations with a LHIIIB2 burn layer are there and how are they distributed, exactly?   How about some accurate and useful maps of all those variables?  

All of the foregoing questions are quantitative questions/problems/techniques and none of them can be answered without accurate locations, and not only that but accurate locations for every object site in the field of study.  In this respect, at least, every sherd is the equal in significance to every megaron.

Here's a practical example.  Earlier this year Dr. Michael Galaty sent me the URL for an article that he and his colleagues had written about Mycenaean civilization's place in the World System.  The article is here.  It is a very interesting article; part of its method is to calculate slopes around various Mycenaean locations in Messenia and in the Argolid.  To obtain the slope for a particular place you divide the change in altitude by the horizontal distance over which the altitude is measured.  Slope is really just the tangent of the angle of inclination; the lower the number the smoother the landscape; the higher the number, the steeper the landscape and when the slope approaches infinity you're dealing with a cliff or something like that.  Part of Dr. Galaty's intention was to show that Messenia and the Argolid differ with respect to the generalized concept of slope in their respective landscapes.  I only bring up his article in order to point out that he and his colleagues had to determine, one by one, the exact positions of the Mycenaean sites in which they were interested.  As he says:

"It was a difficult and time-consuming process to identify sites with the accuracy demanded by Geographic Information Systems (GIS), although Google Earth and Hope Simpson and Dickinson’s Gazetteer were indispensable resources in this regard. As a result, only some of the more important sites are included, and they may not be precisely located in our GIS. Though we did not visit each of them with a Global Positioning System (GPS), we are confident that our GIS database is accurate enough and our results meaningful."  (emphasis is mine)

I intend no criticism of this very useful article.  My feeling is just that it's too bad that Dr. Galaty and his colleagues did not have access to a large accurate database of Mycenaean find spots and, consequently, had to perform a lot of work to create the DB they needed. If they're having this kind of difficulty then everyone in the field must be having the same difficulty.
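For anyone who wants to try the slope measure mentioned above once accurate positions are in hand, here is a minimal sketch of rise over run. It is a generic illustration, not the method used in Dr. Galaty's article, and the function names are my own.

```python
import math

def slope(rise_m, run_m):
    """Slope as change in altitude over horizontal distance; infinite for a vertical face."""
    if run_m == 0:
        return math.inf
    return rise_m / run_m

def slope_angle_deg(rise_m, run_m):
    """Angle of inclination in degrees; its tangent is the slope."""
    return math.degrees(math.atan2(rise_m, run_m))

# A 10 m climb over 100 m of horizontal distance.
print(slope(10.0, 100.0), round(slope_angle_deg(10.0, 100.0), 1))
```

Both inputs depend on knowing where you are: an error of ten or twenty meters in position can move a site from a terrace onto the hillside next to it and change the slope you calculate.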

Mycenology is a science.  Experiments in science have to be repeatable.  The definition of repeatable also includes, at a minimum, 'locatable'.

Let's get Mycenology out of the Twilight Zone.



~~~


If you like these posts then please follow me on Twitter (Squinchpix) or on Google+   (Robert Consoli)

Anyone who wants a copy of my Mycenaean DB or an importable file to Google Earth with some 1400+ Mycenaean find-spots accurately located just leave a comment here or send me an e-mail at bobconsoli (at) gmail.com

By the way, I've just learned that Hope-Simpson and Dickinson's Gazetteer (1979) to which I've never had access (over $100.00 most places) is for sale, brand-new, by the publishers (Astrom Editions) for about 32 euro.  With shipping it should be around $40.00.  

Update: January 2, 2017:  I've just found that Hope-Simpson/Dickinson Gazetteer is available online through Scribd.  Scribd is a subscription service and I don't subscribe but I was still able to have access to the entire document for some reason.  Maybe you will too.

Notes

[1] Paredes-Hernandez et al. [2013], p. 598.
[2] Idem.
[3] Galaty [2012], 450, 'Landscapes'.

Bibliography

Galaty [2012]: Galaty, Michael L. and William A. Parkinson, Daniel J. Pullen, Rebecca M. Seifried. "Mycenaean-scapes: Geography, Political Economy, and the Eastern Mediterranean World-System", in Physis. L'Environnement Naturel et la Relation Homme-Milieu dans le Monde Égéen Protohistorique, pp. 449-454 and Plates CXXXVII to CXLI. In Actes de la 14e Rencontre égéenne internationale, Paris, Institut National d'Histoire de l'Art (INHA), 11-14 décembre 2012. Edd. Gilles Touchais, Robert Laffineur et Françoise Rougemont. 2012. Online here.

Paredes-Hernandez et al. [2013]: Paredes-Hernandez, Cutberto and Wilver Enrique Salinas-Castillo, Francisco Guevara-Cortina, Xicotencatl Martinez-Becerra, "Horizontal Positional Accuracy of Google Earth's Imagery over Rural Areas: A Study Case in Tamaulipas, Mexico", Boletim de Ciências Geodésicas, vol.19 no.4 Curitiba Oct./Dec. 2013. Online here.
