Professor of Health Informatics
What are your research interests?
My research focuses on data science methods for personalised preventative health, in particular clinical prediction models. These models use information about an individual (such as their age and sex, medical history, diagnoses and treatments, and measured biomarkers) to predict their current risk of a specific health outcome over a defined time period. For instance, there exist models that predict an individual’s risk of incident cardiovascular disease (heart attack or stroke) over the next 10 years. We use these models in healthcare to decide which people need, and which people do not need, preventative treatment. For instance, people with a high risk of incident cardiovascular disease would receive statins.
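A minimal sketch of how such a model might turn an individual's information into a treatment decision, assuming a simple logistic model. The coefficients, the predictor names and the 10% threshold are hypothetical values chosen for illustration; they are not taken from any published cardiovascular risk model.

```python
import math

# Hypothetical coefficients, for illustration only (not from a real model).
INTERCEPT = -7.0
COEFFICIENTS = {"age": 0.07, "male": 0.5, "smoker": 0.6, "systolic_bp": 0.01}

# Assumed decision threshold: offer preventative treatment at >= 10% risk.
TREATMENT_THRESHOLD = 0.10


def predicted_risk(person):
    """Return the predicted probability of the outcome for one individual."""
    linear_predictor = INTERCEPT + sum(
        weight * person[name] for name, weight in COEFFICIENTS.items()
    )
    # The logistic function maps the linear predictor to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def recommend_treatment(person):
    """Flag the individual for preventative treatment if risk is high enough."""
    return predicted_risk(person) >= TREATMENT_THRESHOLD


older_smoker = {"age": 65, "male": 1, "smoker": 1, "systolic_bp": 150}
young_non_smoker = {"age": 30, "male": 0, "smoker": 0, "systolic_bp": 110}

for label, person in [("older smoker", older_smoker),
                      ("young non-smoker", young_non_smoker)]:
    print(label, round(predicted_risk(person), 3), recommend_treatment(person))
```

In practice the coefficients would be estimated from data and the threshold would come from clinical guidance; the sketch only shows the general shape of the calculation.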
What is the focus of your current research?
Our current research has four areas of focus:
1. Holistic prediction across multiple health outcomes, which is particularly important for people with multimorbidity.
2. Causal prediction, moving away from predicting risks towards predicting the benefits of treatments.
3. Prediction systems that can adapt to changes over time.
4. Developing and validating prediction models using electronic health records, which contain a lot of missing data and are typically populated through a highly selective, informative sampling process.
What are some projects or breakthroughs you wish to highlight?
Some recent papers from our group:
Tsvetanova A, Sperrin M, Peek N, et al. Missing data was handled inconsistently in UK prediction models: a review of methods used. J Clin Epidemiol. 2021 Sep;140:149-58.
Collins SD, Peek N, Riley RD, Martin GP. Sample sizes of prediction model studies in prostate cancer were rarely justified and often insufficient. J Clin Epidemiol. 2021 May;133:53-60.
Lin L, Sperrin M, Jenkins DA, et al. A scoping review of causal methods enabling predictions under hypothetical interventions. Diagn Progn Res. 2021 Feb;5(1):3.
Jenkins DA, Martin GP, Sperrin M, et al. Continual updating and monitoring of clinical prediction models: time for dynamic prediction systems? Diagn Progn Res. 2021 Jan;5(1):1.
Sisk R, Lin L, Sperrin M, et al. Informative presence and observation in routine health data: A review of methodology for clinical risk prediction. J Am Med Inform Assoc. 2021;28(1):155-66.
Sperrin M, Martin GP, Sisk R, et al. Missing data should be handled differently for prediction than for description or causal explanation. J Clin Epidemiol. 2020 Sep;125:183-7.
Sperrin M, Grant SW, Peek N. Prediction models for diagnosis and prognosis in Covid-19. BMJ. 2020 Apr;369:m1464.
Sperrin M, Jenkins D, Martin GP, et al. Explicit causal reasoning is needed to prevent prognostic models being victims of their own success. J Am Med Inform Assoc. 2019 Dec;26(12):1675-6.
Sperrin M, Martin GP, Pate A, et al. Using marginal structural models to adjust for treatment drop-in when developing clinical prediction models. Stat Med. 2018 Dec;37(28):4142-54.
What memberships and awards do you hold/have you held in the past?
I am a member of the European Lab for Learning and Intelligent Systems (ELLIS), a fellow of the Alan Turing Institute, a fellow of the American College of Medical Informatics (ACMI), and a fellow of the International Academy of Health Sciences Informatics (IAHSI).
What is the biggest challenge in Data Science and AI right now?
The opportunities of Data Science and AI are well recognised, and huge financial investments have been made in the last decade. It is now important that these investments pay off. Data Science and AI need to provide tangible benefits to individuals and society; otherwise there is a risk that people will become disappointed and that we will enter another “AI winter”. Certainly in my field (health and care) these tangible benefits have not yet materialised, and delivering them is proving harder than many people expected.
What real world challenges do you see Data Science and AI meeting in the next 25 years?
Causal inference is essential for personalised health: we need to be able to predict the causal effects of treatments in individuals in order to assess whether they would benefit from these treatments. People are capable of making accurate causal inferences about the world using a relatively small number of examples. We are still far away from being able to do that with computers, but we are definitely making progress. There has been a revolution in causal inference methods in the last 25 years, led by thought leaders such as Judea Pearl. Therefore I expect to see a lot of progress on this front in the next 25 years, enabling better decision making and better health outcomes for many people.