Incoherence & Mattson

The following figure is from a recent paper I co-authored*:

Figure from Karvetski et al. 2014 showing we get more accuracy by ignoring incoherent estimates than by simple unweighted averages.  (The unfortunately abbreviated 'BS' means 'Brier Score'. Lower is better, with 0 being perfect.)

What implications does it have for making subjective "consensus" probability maps at the start of a search?

David Mandel has just blogged a summary of our recent Decision Analysis paper**.  In laboratory conditions on general knowledge questions, we found:

  • People are often incoherent: their probabilities don't add to 100%.
  • We get an 18% gain in accuracy if we coherentize their estimates.
  • But we get a 30% gain in accuracy if we also assign more weight to coherent estimates.

Suppose this applies to making subjective probability maps -- and we don't know that it does.  Recall the original "Mattson" consensus asks everyone to put probabilities in each region.  People are bad at this.  Often their numbers don't add to 100%, so they correct by getting to about 80% and then just dividing up the rest.  So we have invented decision aids to help them.  The Proportional method lets people use whatever numbers they like, then normalizes them.  The O'Connor method uses verbal cues (A = "very likely" ... I = "very unlikely"), then gives each letter a number 9..1, and applies the Proportional method.  Etc.
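As a rough sketch of what those two aids do with the numbers, here is a minimal Python version. The function names, the region labels, and the evenly spaced A..I = 9..1 mapping are assumptions made for illustration, not a reference implementation of either method.

```python
# Hypothetical helper names; region labels are made up for the example.

def proportional(scores):
    """Proportional method: take any non-negative scores and normalize them."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one region needs a positive score")
    return {region: score / total for region, score in scores.items()}

# O'Connor letters A..I mapped to 9..1 (assumed evenly spaced).
O_CONNOR_VALUES = {letter: 9 - i for i, letter in enumerate("ABCDEFGHI")}

def o_connor(letters):
    """O'Connor method: turn verbal-cue letters into numbers, then normalize."""
    scores = {region: O_CONNOR_VALUES[letter] for region, letter in letters.items()}
    return proportional(scores)

# One searcher's ratings for three regions.
print(proportional({"Region 1": 10, "Region 2": 30, "Region 3": 60}))
print(o_connor({"Region 1": "A", "Region 2": "E", "Region 3": "I"}))
```

Either way the output always adds to 100%, so any incoherence in the raw input is gone by the time anyone looks at the map.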

I tend to fall in the "use whatever method you like best" camp.  Usually that means Proportional or O'Connor.  But if our result applies to making subjective probability maps, it would suggest:

  • The O'Connor & Proportional methods will beat Mattson, simply by forcing coherence.  No surprise there.
  • But coherence-weighted Mattson might do better still.  Our "decision aids" might be hiding carelessness, incapacity, or neglect which we would do better to recognize and ignore.
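To make "coherence-weighted Mattson" concrete, here is a minimal Python sketch. It assumes the regions are mutually exclusive and exhaustive, so "coherent" just means non-negative and summing to 1, and coherentizing becomes a least-squares projection onto that set. The exponential weight and the `penalty` value are illustrative assumptions, not necessarily the weighting used in the paper, and all function names are hypothetical.

```python
import math

def coherentize(estimates):
    """Closest coherent probabilities (least squared deviation) to `estimates`."""
    running = 0.0
    shift = 0.0
    # Standard sort-based projection onto the probability simplex.
    for k, value in enumerate(sorted(estimates, reverse=True), start=1):
        running += value
        candidate = (running - 1.0) / k
        if value - candidate > 0:
            shift = candidate
    return [max(e - shift, 0.0) for e in estimates]

def incoherence(estimates):
    """Squared distance between the raw estimates and their coherentized version."""
    fixed = coherentize(estimates)
    return sum((e - f) ** 2 for e, f in zip(estimates, fixed))

def coherence_weighted_consensus(raw_maps, penalty=10.0):
    """Average coherentized maps, down-weighting the less coherent searchers.

    raw_maps: one list of region probabilities per searcher, same region order.
    penalty:  assumed tuning knob; 0 gives a plain unweighted average.
    """
    weights = [math.exp(-penalty * incoherence(m)) for m in raw_maps]
    fixed_maps = [coherentize(m) for m in raw_maps]
    total_weight = sum(weights)
    return [
        sum(w * m[r] for w, m in zip(weights, fixed_maps)) / total_weight
        for r in range(len(raw_maps[0]))
    ]

# Searcher 1 is coherent; searcher 2's raw numbers add to 150%.
maps = [[0.5, 0.3, 0.2], [0.6, 0.6, 0.3]]
print(coherence_weighted_consensus(maps, penalty=0.0))  # unweighted
print(coherence_weighted_consensus(maps))               # coherence-weighted
```

Note that in this simple case coherentizing subtracts the same amount from every region rather than rescaling, so it is not identical to the Proportional method's normalization. The key point is that the raw Mattson numbers have to be kept: once a decision aid has forced them to add to 100%, there is nothing left to weight on.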

I think it's time for a follow-up study.  I might call Ken Hill, whose earlier study formally established that subjective maps are subject to some standard probability biases.

              

* The heavy lifting was done by postdocs Chris Karvetski & Ken Olson, with excellent design & writing input from David Mandel.

** Karvetski, Christopher W. and Olson, Kenneth C. and Mandel, David R. and Twardy, Charles R., Probabilistic Coherence Weighting for Optimizing Expert Forecasts (July 26, 2013). Decision Analysis, 10(4), 305-326. Available at SSRN: http://ssrn.com/abstract=2411649


5 Responses to Incoherence & Mattson

  1. David Mandel says:

    What's the probability that normalization of probabilities comes up in two blog posts in the same day AND that I comment on both AND that that day is today? One:
    http://blog.massimofuggetta.com/2014/04/02/wimbledons-winner/

  2. Don Ferguson says:

    "coherentize" = normalize ?

    • ctwardy says:

      It's a generalization of normalization to any related set of probabilities: find the coherent set of probabilities that is the least different in terms of squared deviation from the ones provided to us.

  3. Don Ferguson says:

    I prefer the Proportional method. O'Connor is OK, but it doesn't offer the resolution you can get with the Proportional method. As you stated, people are not good at appropriately distributing a fixed total of probability (Mattson), particularly if it must be divided over many regions. I am speaking anecdotally here, but if someone were asked to divide the probability over two or three decisions (regions), they could probably do a good job. Asked to divide it over 7 or more, I think it gets difficult to maintain coherence. I imagine someone has done a study to determine the optimal number of hypotheses (regions in the case of SAR) for maintaining coherence. Of course it will vary by situation and individual, but it would be interesting to see if there is a statistically preferred number of "decisions" for maintaining coherence.

    In the business world and the realm of incident command, practitioners talk about "span of control". Recommended values are all over the board (http://www.economist.com/node/14301444). Within the context of Incident Command, the recommended value is 3 - 7, with 5 being optimal (http://training.fema.gov/EMIWeb/is/ICSResource/assets/reviewMaterials.pdf). Maybe this would also apply to maintaining coherence in decision-making over regions. If a reviewer were asked to limit decisions to 5 as opposed to 12, would they be able to maintain coherence (as defined in the blog posting)?

    With regard to SAR Regions of Probability, in situations where there are a large number of regions, we may want to compartmentalize the regions so the reviewer is asked to review no more than 5 at a time.

    • ctwardy says:

      That's if we want them to be coherent. The radical idea here is that we might want to allow people to be incoherent. If worse forecasters are also less coherent (or lose coherence faster), let them show their hand, and correct for that by giving them less weight.

      Mind you, this could be a really bad idea. I wouldn't try it in the field just yet.
