Our MapScore paper is now in press at Transactions in GIS! From the abstract:
The MapScore project described here provides a way to evaluate probability maps using actual historical searches. In this work we generated probability maps based on the statistical Euclidean distance tables from ISRID data (Koester, 2008), and compared them to Doke's (2012) watershed model. Watershed boundaries follow high terrain and may better reflect actual barriers to travel. We also created a third model using the joint distribution of Euclidean distance and watershed features. On a metric where random maps score 0 and perfect maps score 1, the ISRID Distance Ring model scored 0.78 (95% CI: 0.74–0.82, on 376 cases). The simple Watershed model by itself was clearly inferior at 0.61, but the Combined model was slightly better at 0.81 (95% CI: 0.77–0.84).
We compared the familiar distance-ring model from Koester (2008) with a novel model counting the number of watersheds (ridge lines) crossed by the subject, and with a combined model in which the statistics are recalculated using both distance and the number of watersheds crossed.
The familiar distance-ring model did much better than the watershed model alone, though we admit it is not a fair fight: as any Incident Commander knows, the distance-ring model has different statistics for each lost-person type. In contrast, the watershed model has only been learned for hikers, though we apply it to all cases.
However, although watersheds by themselves are much worse, the combined model slightly outperforms the distance-ring model. Yes, the effect is statistically significant, but it is not very large: about 3 absolute percentage points, or about 13% of the possible gain (the distance-ring model's 0.78 leaves only 0.22 of headroom to a perfect score). On the other hand, four such gains would close more than 50% of that gap.
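As a quick sanity check on that arithmetic, here is a short Python sketch using the rounded scores from the abstract (the exact percentages in the paper may differ slightly because they use unrounded values):

```python
# Scores from the abstract (rounded).
random_score, perfect_score = 0.0, 1.0
distance_ring = 0.78
combined = 0.81

headroom = perfect_score - distance_ring   # 0.22 of possible gain remains
gain = combined - distance_ring            # 0.03 absolute improvement
frac = gain / headroom                     # ~0.136, i.e. roughly 13-14% of the possible gain

# Four gains of this size, applied additively, close over half the gap:
four_gains = distance_ring + 4 * gain                 # ~0.90
frac_four = (four_gains - distance_ring) / headroom   # ~0.55
```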
But the point of the paper is not to argue for any of these simple models, but to present MapScore as a method for comparing models by testing the probability maps they generate on actual historical cases from ISRID, and to provide baseline numbers to beat.
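To make the 0-to-1 scale concrete, here is one toy rating with those endpoints. This is an illustration only, not necessarily the formula used in the paper: it takes the probability mass a map ranks above the cell where the subject was actually found (counting tied cells at half weight) and rescales it so a perfect map scores 1 while a uniform "know-nothing" map scores about 0 on a large grid.

```python
def map_score(probs, find_idx):
    """Toy 0-to-1 rating (an illustration, not the paper's formula).

    Mass ranked strictly above the find cell, plus half of any tied
    mass, is rescaled so 0.5 'mass above' -> 0 and 0 'mass above' -> 1.
    """
    p_find = probs[find_idx]
    above = sum(p for p in probs if p > p_find)           # mass ranked above the find cell
    tied = sum(p for p in probs if p == p_find) - p_find  # other cells tied with it
    mass_above = above + 0.5 * tied
    return 1.0 - 2.0 * mass_above

# A perfect map puts all its mass on the cell where the subject was found:
perfect = [1.0, 0.0, 0.0, 0.0]
score_perfect = map_score(perfect, 0)      # -> 1.0

# A uniform map over n cells scores 1/n, i.e. about 0 on a large grid:
uniform = [1.0 / 1000] * 1000
score_uniform = map_score(uniform, 42)     # -> ~0.001
```

Any scoring rule with these endpoints lets models be compared on historical cases: generate each model's map for a case, score it against the actual find location, and average over cases, which is the comparison MapScore automates.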
Credits and Copies
The citation is:
Sava, E., Twardy, C., Koester, R., & Sonwalkar, M. (2015). Evaluating Lost Person Behavior Models. Transactions in GIS. In press.
Most of the analysis and ArcGIS wrangling was done by GIS student Elena Sava during her 4th year undergraduate and 1st year graduate studies in Geography and Geoinformation Systems at Mason. Elena is now working on her Ph.D. at Penn State. Dr. Mukul Sonwalkar helped with advanced ArcGIS scripting and final editing. Bob Koester (nearly Dr.) provided the ISRID data, guided the analyses, and helped with the final writing. I advised throughout.
The MapScore website was created by Nathan Jones and has been maintained and updated by Nick Clark, Jonathan Lee, and Hardhik Nadella. Original funding for MapScore came from an NSF REU grant through Brigham Young University.
Charles Twardy started the SARBayes project at Monash University in 2000. Work at Monash included SORAL, the Australian Lost Person Behavior Study, AGM-SAR, and Probability Mapper. At George Mason University, he added the MapScore project and related work. More generally, he works on evidence and inference with a special interest in causal models, Bayesian networks, and Bayesian search theory, especially the analysis and prediction of lost person behavior.
From 2011-2015, Charles led the DAGGRE & SciCast combinatorial prediction market projects at George Mason University, and has recently joined NTVI Federal as a data scientist supporting the Defense Suicide Prevention Office.
Charles received a Dual Ph.D. in History & Philosophy of Science and Cognitive Science from Indiana University, followed by a postdoc in machine learning at Monash.
3 thoughts on “Forthcoming MapScore Paper!”
See also the earlier post about subsequent work on the Lognormal distance model.
The Conditional Lognormal model clearly outperformed all of these. Note that all the scores in the Tr.GIS article are higher than those in the Lognormal blog post. The reason is that the tests in the article are based on N=376 cases, but for the Lognormal work, we added 48 more cases, for N=424. The new cases were generally harder.
Sava, E., Twardy, C., Koester, R., & Sonwalkar, M. (2016). Evaluating Lost Person Behavior Models. Transactions in GIS, 20(1), 38–53. http://doi.org/10.1111/tgis.12143
The paper is now officially published in the online edition: http://onlinelibrary.wiley.com/doi/10.1111/tgis.12143/full.