The UK is now established as a global leader in the collection of health and genomic data and the use of those data for accelerating successful biomedical research. In the high-risk world of medicines development, how are the data resources, expertise and research infrastructure in the UK helping to increase the success of clinical trials through better patient selection?
The aim of drug development is to make new, safe, effective medicines available to treat those patients most likely to benefit. To this end, clinical trials – the engines of drug development – must satisfy at least two critical requirements. First, the trial must evaluate the right drug target. Second, the patients enrolled in the trial must be those most likely to derive the greatest benefit, while risks to their health from the new medicine are minimised. Understanding of the relationships between the genome and disease supports both these aims. Considerable success has been had in cancer drug development, but here I focus on non-cancer illness, where many valuable opportunities present themselves.
Identifying the correct drug target is the first key step to developing an effective and safe new medicine. Analyses of hundreds of drug development programmes over the past 20 years have demonstrated that drugs for which the target is supported by human genetic evidence are around twice as likely to succeed in their progression from first-in-human study to approval (Nelson MR et al. Nat Genet 2015; King EA et al. PLoS Genet 2019). Datasets such as the world-leading UK Biobank bring genomic data for hundreds of thousands of individuals together with information about those individuals’ health over time through linkage with NHS records. Datasets of this type are powerful tools for finding new relationships that highlight disease-causing genes that can be prioritised as promising drug targets (Szustakowski JD et al. Nat Genet 2021).
Beyond informing the selection of the best drug targets, patient data resources offer opportunities to understand which patients are most likely to benefit from a medicine in development. Datasets with rich information on large numbers of people allow subgroups within the dataset to be examined in detail. Those subgroups may be defined by clinical features (e.g. rate of disease progression, levels of blood markers, imaging), by genomics (e.g. whether a person carries one or more genetic variants of interest), by other ‘omics, such as blood proteomics, or by a combination of these. Particularly informative is information gathered about individuals over time, offering insights into when in their life a disease begins or is diagnosed, how rapidly it progresses, and what treatments they have. Since most medicines are developed for the treatment, as opposed to the prevention, of illness, longitudinal data of this sort can help to identify which patients might benefit most from a new treatment and at what point in the progression of their disease. And, since the data are generally collected from members of the general public or through routine clinical care, they can be representative of ‘real life’ patients and their experience. This makes for an important contrast with data from existing clinical trials, which are often incompletely representative of day-to-day clinical practice and patient experience. Some emerging therapeutic technologies such as short interfering RNAs (siRNAs), which include the recently approved inclisiran, can have treatment effects lasting several months. Being able to identify patients most likely to benefit from such a drug has clear value for the patient, the clinician and the health service.
More generally, gaining a deep understanding of how different types of patient progress through their illness, from diagnosis or even earlier, can inform drug developers about who should be eligible for a clinical trial and, hopefully, increase the chance that a new medicine will offer benefit to those who need it most.
Linking together health records data, other disease-related information and genomics is an enormous challenge, and here again the UK is a leading light. Health records data have great utility in enabling understanding of patients’ illness and their path through diagnosis and treatment. UK Biobank has succeeded in linking historical and prospective hospital and primary care data to its dataset, as has the Genomics England 100,000 Genomes Project. In Scotland, the SHARe resource enables linking of health records to a biobank storing surplus blood samples collected during routine clinical care. There are many other established data resources of a similar nature, including several focussed on particular types of disease. Following the strong track record of UK investment in health data, the Our Future Health programme will create a dataset of up to 5 million people with genomic data linked to health records. Health Data Research UK (HDRUK) is a nationwide academic organisation that enables research access to health records and develops new methods for their analysis. All these resources continue to make a substantial and growing positive impact on the ability of academic researchers and industry in the UK and worldwide to identify the most appropriate patients for their medicines in development.
While large-scale genomics and health informatics have natural applications in common diseases such as type 2 diabetes or coronary heart disease, it is perhaps among rare diseases that these datasets can have the greatest impact. Information about a handful of people with a rare disease collected by one hospital may be able to offer some insights into how that disease progresses. However, far greater power is achieved when cohorts or biobanks including large numbers of people with that rare disease are established and accrue information over several years. An illustrative example is found in pulmonary arterial hypertension (PAH), a rare disease causing high blood pressure in the arteries in the lungs. Although treatments are available for PAH, these cannot currently cure the disease, and in many patients their effectiveness declines over time or they do not work at all. The National Cohort Study of Idiopathic and Heritable PAH was established to collect information about patients cared for at the UK’s specialist PAH centres, and is supported by the MRC, British Heart Foundation, NIHR and the UK Pulmonary Hypertension Association. The cohort continues to recruit and currently holds information about nearly 800 people with PAH, 100 of their relatives and 100 people without PAH, known as controls. It collects information about the participants’ health, their treatment and, in a proportion, their genomes. Participants are also able to agree to storage of some of their blood cells for use in laboratory studies. The cohort has generated valuable insights into the progression of PAH, its genetic drivers and opportunities for developing new treatments. Prof Martin Wilkins, professor of clinical pharmacology at Imperial College London, and Prof Nick Morrell at the University of Cambridge, both international PAH experts, continue to lead this cohort and use the data for ground-breaking research, including clinical trials.
Prof Wilkins commented: “The UK PAH Cohort study has been world-leading on defining the genetics underlying PAH and highlighting potential new drug targets. Patients can consent to recall by genotype, which enables them to become aware of research opportunities at an appropriate stage of drug development and helps maximise the opportunity to understand the benefit:harm signal in a clinical trial.”
Data resources like the PAH cohort not only enable deep investigation of disease but also offer opportunities to facilitate enrolment of patients into clinical trials. This has clear utility in trials in rare diseases such as PAH, but as data resources enable identification of subtypes of common disease they may also help to recruit those patients into trials. The NIHR Bioresource is a nationwide project that enrols patients with rare and common disease and collects clinical and genomic data. Importantly, Bioresource participants can consent to being contacted about joining clinical trials based on their genetic or clinical information. Using rich data resources first to define a group of patients and then to enrol them into clinical trials is well placed to increase the reliability and reproducibility of clinical trial outcomes for new medicines. Couple this with the greater confidence lent to new targets by human genetics and it is reasonable to hope that the likelihood of trial success should increase.
These are just some of the many examples of UK excellence in health data, and their value for academia and industry – and therefore for patients – is becoming increasingly clear. This, in turn, drives investment, with a growing number of successful partnerships between the NHS, academia and industry (see here for an example). While challenges undeniably exist in generating, using and protecting data in the pursuit of more successful clinical trials and new medicines, these are being progressively overcome. Initiatives such as Data Saves Lives and patient-focussed organisations such as UseMyData are helping to ensure that health data are available for high-quality research that seeks to improve patients’ lives. The UK is proving itself to be a global leader in the vision and execution of data-driven improvements in health research and, critically, in the discovery and development of new medicines.
Daniel Swerdlow is Senior Medical Director at Silence Therapeutics UK, working in translational medicine and early clinical development.