The headlines yesterday reported that police facial recognition tools were ‘staggeringly inaccurate’. The idea is that cameras scan faces in a crowd – for instance, at a football match – and match faces to those on a police database, thereby helping the police to identify known offenders. On the face of it, the numbers do sound extraordinary, since over 91% of those identified as being a known suspect turn out to be innocent members of the public – but should we be ‘staggered’ to find out that the technology performs so badly? The maths suggests the opposite – it was almost certain that the numbers would be this poor.
The maths is not difficult. Let’s assume that the cameras are going to be used at a football match attended by 50 000 fans, and that every fan goes through the recognition software. Let’s also assume, for a moment, that this is a particularly crime-ridden football match and 10% of those attending are on the police database – so there would be 5 000 offenders in the crowd.
I’ve not been able to find out how good the technology is on a case-by-case basis, but let’s assume it is accurate 90% of the time – because that sounds like quite a good test and makes my maths easier in the process! That means it will correctly identify 9 out of 10 of the criminals, and also correctly clear 9 out of 10 of the innocent bystanders.
So for 5 000 criminals it will identify 9 out of 10, which is 4 500 and miss 500 – not too bad.
For 45 000 innocent bystanders it will correctly identify 9 out of 10, which is 40 500, but it will mistakenly identify 4 500 as potential criminals.
Put these together and we have 4 500 criminals and 4 500 innocent bystanders that the police think are criminals – a 50% success rate, or a 50% failure rate, depending on which way you look at it.
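If you want to check the sums yourself, they can be written out in a few lines of Python (remember that the 90% accuracy figure is my assumption, not a published one):

```python
crowd = 50_000
criminals = crowd // 10        # 10% of fans on the database -> 5,000
innocents = crowd - criminals  # 45,000

accuracy = 0.9                 # assumed hit rate, for criminals and innocents alike

true_positives = round(criminals * accuracy)         # 4,500 criminals correctly flagged
false_positives = round(innocents * (1 - accuracy))  # 4,500 innocents wrongly flagged

failure_rate = false_positives / (true_positives + false_positives)
print(f"{failure_rate:.0%} of those flagged are innocent")  # 50%
```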
Now look what happens if we have a more realistic number of criminals in the crowd – surely even the worst football clubs don’t have 5 000 criminals at every match! What if only 1 in 1000 fans are criminals? Maybe I’m naive, but the idea that 50 known criminals are in every football match still seems like quite a high number to me. We can do the same maths.
Out of the 50 criminals, the police will identify 45 and miss 5.
Out of the 49 950 innocent bystanders, the police will correctly identify 90%, which is 44 955, but will incorrectly identify 10%, which would be 4 995.
4 995 is a similar number to the first example, but it is now a far higher proportion of the total, since there are only 45 correctly identified criminals – a 99% failure rate. All of a sudden, the 92% failure rate reported for the police seems more understandable!
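Running the same sums again with the more realistic 1-in-1000 prevalence (still assuming a 90%-accurate test) shows how dramatically things change:

```python
crowd = 50_000
criminals = crowd // 1_000     # 1 in 1,000 fans -> 50
innocents = crowd - criminals  # 49,950

accuracy = 0.9                 # assumed, as before

true_positives = round(criminals * accuracy)         # 45 criminals correctly flagged
false_positives = round(innocents * (1 - accuracy))  # 4,995 innocents wrongly flagged

failure_rate = false_positives / (true_positives + false_positives)
print(f"{failure_rate:.1%} of those flagged are innocent")  # 99.1%
```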
The big problem is that when you are looking for something rare, the number of false positive results (innocent bystanders being thought to be criminals in this case) starts to massively outweigh the number of true positives unless your test is fabulously good. In fact, if only 1 in 1000 of the crowd are criminals, facial recognition software would have to get it right not 90% of the time, but over 99.9% of the time just to get a success rate of 50%. If the police have any mathematicians on board, they should know this.
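That 99.9% figure isn’t plucked from the air – under my simplifying assumption that one accuracy figure applies to criminals and innocents alike, a little algebra shows the accuracy needed for half of all flags to be genuine is exactly 1 minus the prevalence:

```python
# For a true positive to be as likely as a false positive, we need
#   prevalence * accuracy = (1 - prevalence) * (1 - accuracy)
# which rearranges to: accuracy = 1 - prevalence.
def accuracy_needed_for_even_odds(prevalence):
    """Accuracy at which half of all positive results are genuine."""
    return 1 - prevalence

print(accuracy_needed_for_even_odds(1 / 1_000))  # 0.999 -> right 99.9% of the time
```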
So what does this mean for health? In health we often do tests where we are trying to find something important when the chances are much more likely that nothing serious is going on. The obvious example is screening – when we test patients who have no symptoms in order to make sure that if there is a problem then we find it early enough to deal with it – such as mammograms in the breast screening programme, PSA testing for prostate cancer, or a treadmill test to look for heart disease. Other examples are when doctors and nurses arrange tests routinely – such as an ECG and a chest X-ray for most people who attend A&E with any hint of chest pain – or ‘just in case’ tests to make sure we are not missing anything. These tests may all be of value, of course, but the lesson we must learn is that when the chance of a true positive result is low, false positives can become a huge problem.
What really matters then is to consider the possible consequences of a false positive – it might be something fairly minor, like having to be recalled for a repeat blood test, or it could involve unnecessary procedures like a biopsy, or an invasive angiogram to check someone’s heart is OK. Then there is the anxiety that can be caused, and the waste of health resources spent separating false positives from true positives.
These are difficult, complex issues, but if we can learn something from the problem the police are facing, we might get better, as both healthcare professionals and patients, at asking these sorts of questions before we arrange any test:
- Do we need to do this test?
- How likely is the condition that we are testing for?
- How likely is it that the test will result in a false positive?
- What are the consequences of a false positive?
- What might happen if I don’t do the test?
Challenging stuff!