Could data save the NHS?
If we want a better future for the NHS, we need a collective commitment to ethical, transparent and innovative data practices, says Sir Nigel Shadbolt
Fewer than one in four people were satisfied with the NHS in 2023 – the lowest level ever recorded at the time. At the same time, there’s little doubt that AI and data could help address many of the NHS’s challenges and potentially revolutionise clinical care and medical research across the public and private sectors. However, the Open Data Institute’s (ODI) latest white paper, Building a Better Future with Data and AI, identifies significant weaknesses in the governance of data infrastructure that threaten these potential gains for healthcare.
World leaders in health data reuse
It’s widely accepted that data has the power to improve the quality, safety and cost-effectiveness of care across the whole health service. With the recent development of AI technologies, those opportunities have grown. Tech companies like IBM eagerly point to AI’s potential to reduce treatment costs by 50 per cent and improve health outcomes by 40 per cent. Wes Streeting, the new UK secretary of state for health and social care, is keen to press home the advantage and has ambitions for the UK to become “a powerhouse for the life sciences and medtech”.
Yet before the hype cycle reached new heights with the widescale availability of generative AI, data was already playing an important role in clinical research and healthcare delivery in the NHS, with initiatives such as OpenSAFELY, Health Data Research UK and Our Future Health working to improve our understanding of disease patterns, diagnostics and outcomes, personalise treatment plans and enhance operational efficiency.
The UK is a world leader in health data reuse, with projects like INSIGHT and RETFound training AI models on an NHS dataset of 25m retinal images to diagnose eye diseases and predict conditions such as Parkinson’s, strokes and heart failure. The NHS AI Lab is already using machine learning at scale, with NHS datasets in use to help predict disease progression, identify high-risk patients and recommend personalised treatments. Recently, it was reported that artificial intelligence that scans GP records to find hidden patterns has helped doctors detect significantly more cancer cases: the cancer detection rate rose from 58.7 per cent to 66.0 per cent at GP practices using the “C the Signs” AI tool.
Bogeyman fears
The use of sensitive health data raises important questions about data privacy and security. Palantir’s involvement with the NHS Federated Data Platform led to journalists describing the company as the ‘bogeyman of surveillance tech’. Even if they haven’t heard of Palantir, too few people are convinced that the benefits of data sharing outweigh the risks of their information being hacked or used for marketing and insurance purposes.
Recent events underscore the importance of winning – and then maintaining – public trust. Following the controversy over the General Practice Data for Planning and Research (GPDPR) programme in 2021, patient opt-outs from sharing health data nearly doubled, rising from 2.75 per cent to over 5 per cent of the total English patient population in just one month.
Organisations that wish to access, use and share health data must be able to reassure the public that they can be trusted if sensitive data is to be used to advance medical knowledge and develop innovative therapies. However, the UK’s data protection laws pre-date AI and large-scale data sharing. Lawmakers must build on the UK GDPR to give people more individualised control over their data in the era of AI. This regulation needs to engage with the characteristics of how data is used for AI, to ensure it continues to function in the interests of people and communities as well as industry, and it should go beyond training data to cover prompts and the various forms of feedback that AI systems collect.
The feedstock of AI is data
There are other challenges, too. While the NHS holds a huge amount of data, it is often locked in siloed IT systems and is not available for use across the health service ecosystem. More than 200 NHS trusts collect patient data through different electronic health record (EHR) systems, which can make data sharing challenging. And EHRs aren’t the only data AI models might use; data is also available from medical imaging, genomics and wearable devices. To facilitate knowledge sharing, we need standards for governing and sharing this data. In our recent research, the ODI recommends adopting the FAIR principles – making data findable, accessible, interoperable and reusable – to ensure datasets and models are discoverable and usable and to promote better data management and sharing.
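In practice, FAIR starts with machine-readable metadata. As a rough illustration – the schema, field names and dataset below are hypothetical, not an NHS or ODI standard – a FAIR-style record simply describes where a dataset lives, how it can be accessed, which standards it follows and under what terms it can be reused:

```python
# Illustrative only: a minimal FAIR-style metadata record for a health
# dataset, as a plain Python dictionary. The fields are hypothetical,
# not an NHS or ODI schema.
dataset_metadata = {
    # Findable: a persistent identifier, title and keywords
    "identifier": "https://example.org/datasets/retinal-imaging-2024",
    "title": "Anonymised retinal imaging dataset (illustrative)",
    "keywords": ["ophthalmology", "retinal imaging", "machine learning"],
    # Accessible: how the data can be requested, and under what terms
    "access": {
        "landing_page": "https://example.org/apply-for-access",
        "conditions": "Approved researchers via a secure data environment",
    },
    # Interoperable: open formats and shared clinical vocabularies
    "format": "DICOM",
    "coding_system": "SNOMED CT",
    # Reusable: provenance and a clear licence
    "provenance": "Collected across participating NHS trusts, de-identified",
    "licence": "https://example.org/licences/research-only",
}
```

A catalogue that indexes records like this makes datasets discoverable and their terms of use legible – without exposing any patient-level data.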
To achieve this, it’s important that the government starts to view data as part of our national infrastructure, in the same way as roads or the electricity grid. Well-curated data infrastructure is foundational to delivering the value of AI effectively and responsibly. This includes interoperable open standards to represent the data, secure data storage, efficient data processing and strict adherence to data protection regulations – none of which is yet fully in place. In fact, the ODI’s research has shown a notable scarcity of information about the governance of the data used or produced by AI systems.
I have recently said that if the UK is to benefit from the extraordinary opportunities presented by AI, the government must look beyond the hype and attend to the fundamentals of a robust data ecosystem built on sound governance and ethical foundations. We must build a trustworthy data infrastructure for AI because the feedstock of high-quality AI is high-quality data.
Safety first
While we wait for the development of a national data infrastructure, new developments in machine learning are enabling researchers to access and analyse sensitive data without compromising privacy, commercial sensitivity, or national security.
In recent years, privacy-enhancing technologies (PETs) have emerged. These innovations have great potential to help safeguard people’s rights and privacy as AI models become more prevalent.
Federated learning (FL) is a paradigm shift that allows models to be trained on multiple local datasets without exchanging the underlying data: each site trains on its own records and shares only model updates, never the data itself. FL has shown promising results; for example, Oxford University’s CURIAL-Federated platform developed a rapid-response Covid-19 screening test by training models across data held at four NHS trusts, with participating hospitals retaining custody of their data at all times. Combining FL with other privacy technologies, such as differential privacy or secure aggregation, can keep the data even more secure, minimising the risk of patient data breaches like the ransomware attack on Synnovis in June 2024.
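To make the mechanics concrete, here is a minimal federated-averaging sketch in Python – not the CURIAL-Federated codebase, just an illustration with a toy linear model, four hypothetical sites and an arbitrary noise scale gesturing at differential privacy:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1):
    # One gradient-descent step on a site's own data (toy linear model).
    # Only the updated weights leave the site, never the records.
    grad = X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(weights, sites, noise_scale=0.01):
    # Each site trains locally; the server averages the returned updates.
    # The Gaussian noise gestures at differential privacy; a real system
    # would calibrate it against a formal privacy budget.
    updates = []
    for X, y in sites:
        w = local_update(weights, X, y)
        w += rng.normal(0.0, noise_scale, size=w.shape)
        updates.append(w)
    return np.mean(updates, axis=0)

# Four hypothetical 'trusts', each holding its own data locally.
true_w = rng.normal(size=5)
sites = []
for _ in range(4):
    X = rng.normal(size=(100, 5))
    sites.append((X, X @ true_w + rng.normal(scale=0.1, size=100)))

weights = np.zeros(5)
for _ in range(50):
    weights = federated_round(weights, sites)
print(np.round(weights - true_w, 2))  # error should be small
```

The design point is simply that the server only ever sees weight vectors, not rows of patient data; secure aggregation would go further and hide even the individual sites’ updates from the server.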
Secure research platforms such as NHS England’s Secure Data Environment (SDE) and OpenSAFELY give researchers access to anonymised NHS data within controlled analysis environments. This equitable access is essential if we want to prevent a ‘data winter’ in which only big tech is able to work with, make decisions about and derive value from data and AI technologies. When this relates to our health, it is even more important to avoid creating a monopoly by restricting access to data. We should also look at new decentralised architectures such as Solid, from the ODI’s co-founder Sir Tim Berners-Lee, which offer individuals the prospect of direct access to and control of their own wellbeing and health data.
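Part of what makes such platforms trustworthy is that results are checked before they leave the secure environment. As a toy illustration – the threshold and rule here are hypothetical, and far simpler than the statistical disclosure controls real SDEs apply – an output check might suppress small counts that could identify individuals:

```python
def suppress_small_counts(table, threshold=5):
    # Illustrative output check: redact any count below the threshold
    # before aggregate results are released from the secure environment.
    return {
        group: (count if count >= threshold else "<suppressed>")
        for group, count in table.items()
    }

# Hypothetical aggregate result computed inside the secure environment
results = {"region_a": 1240, "region_b": 873, "region_c": 3}
print(suppress_small_counts(results))
# {'region_a': 1240, 'region_b': 873, 'region_c': '<suppressed>'}
```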
The time for action is now
If we want a better future for the NHS, we need a collective commitment to ethical, transparent and innovative data practices, from ensuring broad access to high-quality, well-governed key datasets to enforcing data protection and intellectual property rights. To enable data to improve the NHS and help solve many other key societal and governmental challenges, the government must take five actions:
- Ensure broad access to high-quality, well-governed public and private sector data to foster a diverse, competitive AI market;
- Enforce data protection and labour rights in the data supply chain;
- Empower people to have more of a say in the sharing and use of data for AI;
- Update our intellectual property regime to ensure AI models are trained in ways that prioritise trust and empowerment of stakeholders;
- Increase transparency around the data used to train high-risk AI models.
Without these measures it will be difficult to realise Wes Streeting’s vision of Britain as a medtech powerhouse, or for the NHS to unlock the opportunities offered by data-centric AI.