In Tests We Trust. Should We? (Part 3)

19 May

Over the last couple of days, we’ve visited an imaginary doctor, been diagnosed with an imaginary disease, and explored what it means to get a positive test result from a test that is 90% accurate.

Today we’re going to back up a little and ask another question. How did Dr. Filomena know that the test she used had an accuracy rate of 90%? How did she know that the incidence rate of Burgdorfer syndrome in the population is 1%? What were those numbers based on?

To determine the accuracy rate of Filomena’s test, we’d have to have a sample population in which we know with a high level of confidence which members have the disease and which do not. Then we’d be able to test everyone in the sample, compare their test result to their known health status, and determine how accurate the test is.

The larger the sample, the more precisely we can determine the test’s accuracy rate. The smaller the sample, the less confidence we can place in our accuracy rate determination, whatever it turns out to be. However, the larger the sample, the more difficult it will be to develop confidence in our knowledge about who has the disease and who does not.
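To put rough numbers on the sample-size point, here is a minimal sketch in Python. It uses the textbook normal approximation for a proportion, and it assumes the very thing we are about to question: that we know the true health status of everyone in the sample. The function name and the sample sizes are made up for illustration.

```python
import math

def margin_of_error(accuracy, n, z=1.96):
    """Approximate 95% margin of error for an accuracy rate measured on n people.

    Uses the normal approximation to the binomial, which is only a rough
    guide for small samples or for rates very close to 0% or 100%.
    """
    standard_error = math.sqrt(accuracy * (1 - accuracy) / n)
    return z * standard_error

# Hypothetical sample sizes, all with a measured accuracy of 90%.
for n in (100, 1_000, 10_000):
    moe = margin_of_error(0.90, n)
    print(f"n = {n:>6}: 90% +/- {moe * 100:.2f} percentage points")
```

With 100 people the measured 90% comes with a margin of roughly 6 percentage points either way. Each hundredfold increase in sample size shrinks that margin by a factor of ten.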

To get that sample population, we have to use some other test (call it Edwin’s test) to determine which members of the sample have the disease and which do not. But how do we know the accuracy rate of Edwin’s test?

Well, we need a sample population in which we know, with a high level of confidence, who has the disease and who does not. Oh, wait, that’s what we’re trying to get with Edwin’s test. So we need Drury’s test to calibrate Edwin’s test so we can use Edwin’s test to calibrate Filomena’s test. Do you see the problem? We have an infinite regress. We can’t calibrate any test with precision without already having a precisely calibrated and trustworthy test.

But there’s even more trouble. We need to know what the average incidence of the disease is in the general population. To get that, we have to have a well-calibrated, trustworthy test (WCTT) that we can apply to a representative sample of the population. We’ve already seen the difficulty with assuming we can create such a test.

Even if we had a WCTT for a given disease, its accuracy rate would not be 100%. So whatever estimate we developed for the disease incidence using the WCTT and a representative sample of the population, it would not be perfectly accurate.
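A small Python sketch shows how far off a naive count of positive results can be. It assumes that "90% accurate" means the test is right 90% of the time for sick people and for healthy people alike, and that the true incidence is 1%. Both function names are hypothetical. The correction in the second function is the standard one for an imperfect test (often called the Rogan-Gladen estimator), and it only works if we already know the test's accuracy.

```python
def apparent_rate(true_rate, sensitivity, specificity):
    """Fraction of the sample that tests positive, sick or not."""
    true_positives = true_rate * sensitivity
    false_positives = (1 - true_rate) * (1 - specificity)
    return true_positives + false_positives

def corrected_rate(apparent, sensitivity, specificity):
    """Back out the true rate from the apparent rate.

    Assumes sensitivity and specificity are known exactly, which is the
    very thing the calibration problem says we cannot have.
    """
    return (apparent + specificity - 1) / (sensitivity + specificity - 1)

# Assumed values: 1% true incidence, test right 90% of the time either way.
observed = apparent_rate(0.01, sensitivity=0.90, specificity=0.90)
print(f"Share testing positive: {observed:.1%}")

recovered = corrected_rate(observed, sensitivity=0.90, specificity=0.90)
print(f"Incidence after correction: {recovered:.1%}")
```

Under these assumptions about one person in nine tests positive even though only one in a hundred is sick. The correction recovers the 1% figure, but any error in the assumed accuracy flows straight into the corrected incidence.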

The upshot is that both the 90% accuracy rate Dr. Filomena gave us for the test and the 1% incidence rate she gave us for the disease in the population can only be estimates. These numbers contain some amount of uncertainty. It may be possible to refine them over time to minimize the uncertainty, but it can never be completely eliminated.
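To see why the fuzziness matters, here is a rough Python sketch of how the chance of actually being sick after a positive result shifts as the two estimates move. The ranges tried (85% to 95% accuracy, 0.5% to 2% incidence) are invented for illustration and do not come from any real study. As before, the sketch assumes the accuracy figure applies equally to sick and healthy people.

```python
def chance_sick_given_positive(accuracy, incidence):
    """Bayes' rule: P(sick | positive test).

    Assumes the test is correct with probability `accuracy` for both
    sick and healthy people.
    """
    true_positives = incidence * accuracy
    false_positives = (1 - incidence) * (1 - accuracy)
    return true_positives / (true_positives + false_positives)

# Hypothetical ranges around the 90% and 1% point estimates.
for accuracy in (0.85, 0.90, 0.95):
    for incidence in (0.005, 0.01, 0.02):
        p = chance_sick_given_positive(accuracy, incidence)
        print(f"accuracy {accuracy:.0%}, incidence {incidence:.1%}: "
              f"chance of being sick {p:.1%}")
```

With the point estimates of 90% and 1%, the answer is about 8%. Across the invented ranges it runs from under 3% to nearly 28%, so modest uncertainty in the inputs turns into a wide spread in the answer the patient cares about.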

Any time you hear that such and such a disease affects such and such a percentage of the population, realize that the numbers are fuzzy. They are estimates, not counts. Some may be relatively precise, with minimal uncertainty, while others may contain a large amount of uncertainty. Usually the news media don’t report the uncertainty. We have been trained to read and accept numbers uncritically as hard-edged, precise values. Scientific numbers almost always contain some uncertainty. Some of them contain a great deal. Take them with a grain of salt.

Remember the statement that has often been attributed to Mark Twain: “There are three kinds of lies: lies, damned lies, and statistics.”

Posted on 2011/05/19 in statistics

