If I asked you to name the top five things that come to mind when you hear the word “future”, I am almost certain that artificial intelligence (AI) would be among them. We are supposedly on the precipice of a paradigm-shifting event that journalists, management consultancy firms, and data scientists believe heralds a new dawn in healthcare.
Market valuations and forecasts paint a similarly optimistic picture: the AI healthcare market is projected to surge from $4.9 billion in 2020 to $45.2 billion in 2026, a compound annual growth rate of 44.9%¹. Unsurprisingly, the number of AI healthcare startups has also boomed, with 367 new startups receiving an estimated $4 billion in funding last year, according to Accenture.
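That growth figure, at least, is internally consistent. A compound annual growth rate is simple arithmetic, and a few lines of Python (using only the numbers quoted above) show the forecast’s own maths checks out:

```python
# CAGR = (end_value / start_value) ** (1 / years) - 1
start, end, years = 4.9, 45.2, 6  # $bn, 2020 -> 2026, from the forecast above

cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # ~44.8%, in line with the quoted 44.9%
```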
However, I am far less certain about AI’s future in healthcare. The hype surrounding AI has reached such a fever pitch that talk of “robo-surgeons” and “radiology replacements” has dominated public discourse in recent years, distracting us from evaluating the evidence behind these claims.
This article will demonstrate how the limitations of AI in healthcare have been brazenly glossed over by exaggerated claims, overpromising, and unrealistic assumptions about scalability. It will argue that we must significantly temper our expectations of AI’s ability to transform tomorrow’s healthcare.
1. Data collection
Machine learning algorithms require large, comprehensive sets of labeled data to classify inputs and predict outcomes with a high degree of accuracy. One way of obtaining data at this scale would be through electronic health records that integrate health and social care throughout the patient journey.
In the UK, an investigation into medical record-keeping in hospitals by the Institute of Global Health Innovation (IGHI) found that 23% of NHS trusts were completely reliant on paper records³. That means roughly 35 trusts around the country hold millions of patient records that remain locked away and inaccessible.
Of the 77% of trusts that did use some form of electronic record-keeping, researchers discovered that systems were unable to record and share information effectively. On 11 million occasions, patients attended a hospital that could not access their full medical information from a previous visit².
Fragmentation of systems remains a significant barrier to the implementation of AI. The critical information captured along the patient journey is, at best, inconsistent and, at worst, unreliable. Over 4 million patients who attended two or more NHS trusts between 2017 and 2018 encountered hospitals whose electronic record systems were completely incompatible³.
Promises that Electronic Health Records (EHRs) would create and sustain a wealth of data have clearly failed to materialize. Extracting quality data remains the bottleneck that severely inhibits AI integration with healthcare. Furthermore, the work required to identify, organise, and decode the relevant data for real-life clinical practice makes AI seem a reality only in a far, far distant future.
NPfIT, a government initiative that aimed to digitalise all medical records in the UK, failed so spectacularly that it was dismantled in 2011, having been described as “the worst contracting fiasco in history”. It cost the government approximately £10 billion.
Wearable technology and healthcare apps
It is true that wearable devices and applications may offer a potential solution to quality data collection and interpretation. One study conducted by researchers at Stanford showed that the Apple Watch was able to detect irregular heart rhythms in wearers with a positive predictive value of 71%.
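It is worth pausing on what that figure means. Positive predictive value is the proportion of positive alerts that turn out to be true; the counts below are invented purely to illustrate the calculation, not taken from the Stanford study:

```python
# PPV = true positives / (true positives + false positives)
# Hypothetical counts for illustration only.
true_positives = 71   # alerts later confirmed as genuine irregular rhythms
false_positives = 29  # alerts that turned out to be false alarms

ppv = true_positives / (true_positives + false_positives)
print(f"PPV: {ppv:.0%}")  # 71% -- roughly 3 in 10 alerts are false alarms
```

Even a respectable PPV leaves roughly three in ten alerts as false alarms, a non-trivial burden of anxiety and follow-up testing once scaled to millions of wearers.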
However, the main caveat to implementing this technology is data privacy and security. Healthcare data obtained outside the doctor-patient relationship is akin to the Wild West: an open frontier poised for exploitation by bad actors.
Take the US Federal Trade Commission, which tested 12 apps and two wearable devices. It found that the data collected was transmitted to no fewer than 76 undisclosed third parties⁴. Perhaps more frightening is how period-tracking apps have shared the most intimate details of people’s lives, including when they had sex, with Facebook. This data sharing was carried out without the informed, explicit consent of over 5 million app users⁵.
2. Algorithm bias
AI systems are often assumed to be free of prejudice, discrimination, and bias. Unfortunately, a growing number of peer-reviewed studies have put that argument to bed. Consider the decision-making algorithm widely used by US hospitals to determine which patients receive additional medical care: the algorithm affected the care of approximately 2 million patients, and researchers found it routinely discriminated against black people.
This study examined the data sets used in the algorithm and showed that “the care provided to black people cost an average of US $1,800 less per year than the care given to a white person with the same number of chronic health problems”.
The bias arose because the algorithm used healthcare costs as a proxy for healthcare needs, disregarding the fact that unequal access to care means less is spent on black patients than on white patients⁶.
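The mechanism is straightforward to reproduce. The sketch below builds a synthetic population (all numbers and group labels are invented for illustration; this is not the study’s data or code) in which two groups are equally ill but one incurs lower costs, then “refers” the top 10% of patients by cost, as a cost-trained risk score effectively would:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Synthetic population: two groups with an identical illness burden ("need").
group = rng.integers(0, 2, n)          # 0 = group A, 1 = group B
need = rng.gamma(2.0, 2.0, n)          # true chronic-condition burden

# Unequal access: group B incurs lower cost at the same level of need.
spend_per_condition = np.where(group == 1, 1200.0, 1800.0)
cost = need * spend_per_condition + rng.normal(0, 500, n)

# Proxy-based policy: refer the top 10% of patients *by cost* to extra care.
threshold = np.quantile(cost, 0.90)
referred = cost >= threshold

for g, name in ((0, "group A"), (1, "group B")):
    mask = group == g
    print(f"{name}: referral rate = {referred[mask].mean():.1%}, "
          f"mean need of those referred = {need[mask & referred].mean():.1f}")
```

Despite identical underlying need, group B is referred less often, and those who are referred must be considerably sicker to cross the cost threshold, which is precisely the pattern the Science study documented in real data⁶.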
Sourcing diverse data is difficult because minority groups have historically been underrepresented in clinical trials. This was laid bare during the COVID-19 pandemic: in 2021, despite disproportionately higher infection, hospitalization, and death rates in ethnic minority groups, the “direct effects of genetic and biological host factors remain unknown”.
Even if diverse data sets were used, bias can still arise through a lack of diversity among system developers, implicitly biased assumptions about the data being used, and the output of AI systems feeding back into future training, further perpetuating bias.
Using AI algorithms without extraordinary levels of due diligence risks deepening the divides of an already fractured society riddled with inequality and inequity.
3. Randomised controlled trials
Peer-reviewed evidence from randomised controlled trials (RCTs) remains the gold standard for research. Yet there have been few RCTs of AI-based systems in healthcare. Moreover, the overwhelming majority of studies that have been conducted are retrospective; the systems have not been tested in real-life scenarios.
Prospective studies allow us to test the viability of AI systems in the real world, as an algorithm’s performance is likely to worsen significantly when the data being fed in is disorganised, chaotic, and different from the data it was trained on.
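This degradation is easy to demonstrate. The sketch below (a generic synthetic example, not any specific clinical model) trains a classifier on clean, curated data and then scores it again after simulating the noise and missing values of a busy clinic:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in for a curated retrospective dataset.
X, y = make_classification(n_samples=5_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Retrospective (clean) accuracy: {model.score(X_test, y_test):.2f}")

# Crude simulation of deployment: noisier measurements plus values that
# were never recorded, zero-filled as a naive system might do.
rng = np.random.default_rng(0)
X_messy = X_test + rng.normal(0, 1.0, X_test.shape)
X_messy[rng.random(X_test.shape) < 0.2] = 0.0
print(f"Simulated prospective accuracy: {model.score(X_messy, y_test):.2f}")
```

Only a prospective trial surfaces this kind of gap before patients are exposed to it.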
The evidence bears this out. A systematic review conducted by researchers from Imperial College London compared the performance of AI deep learning algorithms against clinicians in diagnostic imaging. Only 10 records of randomised trials of AI were found, and only two of them had been published.
In addition, 71% of the papers analyzed had a high risk of bias, and reporting standards were suboptimal (<50% adherence for 12 of the 29 TRIPOD items). Only one study made its raw data and code available.
To even begin to consider the possibilities of AI in healthcare, rigorous prospective RCTs must be conducted before any conclusion can be drawn about the effectiveness of such systems. Take this year alone: more than three vaccines are being distributed worldwide in the hope of ending a pandemic. That would not have been possible without the stringent use of prospective RCTs.
And yet the FDA has approved 16 algorithms for commercial use in the USA that affect millions of people’s health outcomes, and only one of these algorithms has undergone RCT testing⁷.
Using AI algorithms without reliable evidence borders on negligence. Robust, validated, and transparent data must be demonstrated in peer-reviewed studies before AI systems play a role in healthcare. Otherwise, sensationalist media reporting could lead to the deployment of systems that are not fit for purpose, causing harm to millions of people.
4. Human factors
The adoption of new systems in healthcare is often met by a culture of resistance that is impervious to change. Problems with user-friendliness, accessibility, and speed inhibit the uptake of technology that sounds like a breakthrough only in the sterile setting of a laboratory.
A study conducted by Google Health into its system for detecting diabetic retinopathy found that socio-economic and environmental factors affected the model’s performance in a clinical setting: problems ranged from insufficient lighting in clinic rooms to internet speeds that were too slow, resulting in a large number of images being discarded.
Whilst human factors are often ignored when new systems are implemented, they are perhaps the most important consideration. Numerous studies have shown that failures to adopt promising technology boil down to a failure to address the specific needs of the very people who will be using it. This is a costly mistake, given that millions, if not billions, may be spent on implementing these systems in the future.
In conclusion, we must stop the nebulous AI hype train that has derailed our expectations of what these systems can actually do. We are setting ourselves up for bitter disappointment when AI fails to deliver the future that was over-promised. IBM’s supercomputer Watson, hailed as a transformation in cancer care, failed to deliver on every single metric related to health outcomes. And yet here we find ourselves again, battling the same misinformation and hype that has characterized the last decade of our lives.
I leave you with a saying that has stood the test of time: all that glitters is not gold.
References
1. Artificial Intelligence in Healthcare Market with Covid-19 Impact Analysis by Offering, Technology, End-Use Application, End User and Region — Global Forecast to 2026 [Internet]. Reportlinker.com. 2021 [cited 27 January 2021]. Available from: https://www.reportlinker.com/p04897122/Artificial-Intelligence-in-Healthcare-Market-by-Offering-Technology-Application-End-User-Industry-and-Geography-Global-Forecast-to.html
2. Poor NHS medical record sharing is putting patient lives in danger [Internet]. Health Europa. 2019 [cited 27 January 2021]. Available from: https://www.healtheuropa.eu/nhs-medical-record-sharing-patient-lives-danger/95545/
3. Sandhu H. AI revolution in healthcare will occur when the tyranny of the screen ends [Internet]. MedCity News. 2020 [cited 27 January 2021]. Available from: https://medcitynews.com/2020/03/ai-revolution-in-healthcare-will-occur-when-the-tyranny-of-the-screen-ends/?rf=1
4. The Federal Trade Commission strikes back against Facebook [Internet]. Ft.com. 2021 [cited 28 January 2021]. Available from: https://www.ft.com/content/cd618d7a-bde8-4aec-86c7-f59ff784b4b4
5. Sex lives of app users ‘shared with Facebook’ [Internet]. BBC News. 2019 [cited 29 January 2021]. Available from: https://www.bbc.co.uk/news/technology-49647239
6. Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366(6464):447–453.
7. Waters R. Human dimension presents new hurdles for AI in medicine [Internet]. Ft.com. 2020 [cited 29 January 2021]. Available from: https://www.ft.com/content/151dbf12-970d-49af-9d10-d417921f7066