Machine learning techniques for identifying early-life biomarkers in perinatal & child health

Date
2022
Authors
O'Boyle, Daragh
Publisher
University College Cork
Abstract
Artificial intelligence (AI), and more specifically machine learning (ML), has been used in the investigation of biomarkers for many clinical conditions, reducing the need for specialist diagnosis, reducing waiting times and increasing access to reliable diagnostics. Numerous areas have yet to benefit from its application, particularly in perinatal and paediatric research. Two such brain-related conditions, which are the focus of this thesis, are Hypoxic Ischemic Encephalopathy (HIE) and Autism Spectrum Disorder (ASD). HIE is a major cause of neurological disability globally and results from a lack of oxygen to the brain during and immediately after birth. The mainstay of treatment is therapeutic hypothermia, which, to be effective, must be applied within 6 hours of birth. This thesis aims to improve the early identification of infants eligible for therapeutic hypothermia using AI with clinical and metabolomic HIE biomarkers. Autism constitutes a group of neurodevelopmental disorders characterized by behavioural and cognitive symptoms. The underlying aetiology of autism remains unclear and reliable predictive biomarkers are lacking. Intervention from an early age has been shown to reduce symptoms, but diagnosis frequently occurs outside the window for effective treatment. Despite proven benefits in patient outcomes with intervention within the first two years of life, diagnosis often doesn’t occur until an individual is 3 or 4 years old, and in many cases much older. There is a pressing need for new methods to identify the neonates most at risk so that they can receive adequate treatment that improves long-term outcomes. In the first experimental section, chapters 2-5, we applied ML to identify the optimum predictive clinical and metabolomic biomarkers for the identification of those with HIE. We initially assessed clinical variables using ML feature ranking and modelling. 
This study identified markers of a newborn’s condition at birth (Apgar scores, need for resuscitation, and first measurements of pH and base deficit) as the most predictive. These models achieved an area under the receiver operating characteristic curve (AUROC) of 0.89 when distinguishing between those with perinatal asphyxia, who do not require treatment, and those with HIE. We then assessed a panel of promising metabolite markers for their predictive capabilities with and without clinical markers. ML identified the metabolites alanine and lactate as the most predictive of HIE development; when combined with Apgar scores, a measure of a newborn’s condition, at 1 and 5 minutes after birth, they achieved a predictive AUROC of 0.96. These studies identified alanine as a candidate metabolite for cotside HIE risk assessment and showed that ML models can improve on our current ability to identify those in need of therapeutic hypothermia. To validate these findings, two studies were undertaken. The first independently compared alanine levels in cord blood of those with HIE and controls. This study showed elevated levels of alanine at birth and up to 6 hours after, validating alanine as an early-life marker for those with HIE in need of neuroprotective therapy. The second study validated our algorithm for the prediction of HIE in a large, diverse cohort comprising infants with a range of differing conditions, compared to a set of HIE cases and controls previously assessed. Here, 243 infants were assessed using our model to determine risk of HIE, and an accuracy of 85% was maintained. This study validated our model's ability to retain performance when applied to a diverse, real-world cohort. In this section, we have demonstrated the use of ML for improving HIE diagnostics and validated these findings. 
Further, larger validation studies are currently underway, with the end goal of clinical use for determining those in need of treatment for HIE. In the second experimental section, chapters 6-8, we aimed to apply ML methods to identify early-life biomarkers for autism spectrum disorder. We first conducted a systematic review of all blood-based autism biomarkers. This study catalogued all reported biomarkers and recorded the direction of change of these markers in those with autism compared to neurotypical controls. This study also applied Genome Wide Association Studies (GWAS) and pathway analysis to test for biological processes which may be implicated at the level of the genome in autism. In chapter 7, ML analysis was applied to metabolomics data from cord blood samples from the Cork BASELINE birth cohort. Discovery and targeted metabolomics were completed on this data. In chapter 8, we applied ML to assess clinical predictors for autism in the Danish National Birth Cohort, which included data from 500 autism cases and matched controls. We identified markers of maternal health and wellbeing as being important for autism prediction and achieved an AUROC of 0.68. Overall, this thesis addressed its aims to apply ML methods for the identification of biomarkers and development of prediction models for HIE and autism. We have validated previously identified biomarkers and identified novel clinical and blood-based markers, as well as created robust HIE prediction models with the ability to improve clinical decision making. Possible future steps this research can follow to further add to the field are outlined within. This research adds to the growing body of evidence displaying the ability of ML to offer improvements to healthcare and specifically to perinatal and child health. 
Keywords
AI, Machine learning, Bioinformatics, Biomarker, Autism, HIE
Citation
O'Boyle, D. 2022. Machine learning techniques for identifying early-life biomarkers in perinatal & child health. PhD Thesis, University College Cork.