Data mining and machine learning techniques for neonatal EEG event recall

Murphy, Brian M.
University College Cork
Sick neonates admitted to the neonatal intensive care unit (NICU) have their physiological signals monitored. For neonates with brain injury, the electroencephalogram (EEG), which records the electrical activity of the brain, is an important diagnostic tool. EEG is a non-invasive procedure in which electrodes are placed on the neonate's scalp. The signals are difficult to interpret, and experienced neurophysiologists are required to assess the brain health of a neonate from them. However, there is not enough expertise available in the NICU to actively monitor all patients.

A neurophysiologist must be able to identify a wide variety of neonatal EEG patterns in order to diagnose an encephalopathy and treat a neonate. Some patterns are rarer than others, and additional time is needed to identify their meaning or cause. A neurophysiologist may recognise a pattern as one they have seen before but be unable to recall where. Currently, the only option is to search through EEG atlases or prior patients' EEG records for a similar pattern, which is a time-consuming process.

The main aim of this thesis is the development of a system that assists experts in finding similar EEG events in a database of previously recorded events. The idea is that the system will reduce the time it takes an expert to find where they have previously seen a particular neonatal EEG pattern. The current state of the art in automated neonatal EEG analysis focuses on the classification of signals. These approaches excel at classifying specific signal types such as seizure or sleep states, but they cannot help the neurophysiologist find the prior patient records that contain the most similar EEG pattern. There is therefore a need for a system that assists experts in locating similar events that have previously occurred. Such a system could speed up the diagnosis of encephalopathies that have a specific morphology.

The first set of data mining techniques developed mimics an expert physically searching back through old records. These systems look through the entire database of events to find the closest matching event, using a distance metric to determine the best match. Two distance-based systems were developed: the first uses the fixed point-to-point Euclidean distance and the second uses the elastic dynamic time warping (DTW) distance.

The second set of data mining techniques moves towards systems that do not need to examine every event in the database while maintaining recall accuracy. This is of particular interest as the amount of data grows, because it becomes infeasible to compare the query event with every event in the database. The systems developed generate hashes from the data, and these hashes are then used to find a match. A hash is an alternative, compressed representation of the original data. Three different hashing techniques were developed for use with neonatal EEG.

The final section of the thesis is in the area of machine learning and focuses on the development of two multi-class classifiers for different neonatal EEG event classes. Because it is expensive and time-consuming for a neurophysiologist to evaluate neonatal EEG, a proxy was developed to evaluate the approaches in this thesis. Instead of finding the nearest matching event, the proxy treats evaluation as a multi-class classification problem.

The work in this thesis shows that neonatal EEG recall systems are possible and that they can be quicker than having a neurophysiologist physically search for the most similar signal. It highlights the importance of compression and shows why brute-force search strategies will not scale well. The strengths of hashing systems in terms of recall accuracy, query speed and memory requirements are also shown.
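
The exhaustive distance-based search described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names, the toy single-channel events and the absolute-difference local cost used inside DTW are all hypothetical choices made for the example.

```python
import math


def euclidean_distance(a, b):
    """Fixed point-to-point distance; both events must have the same length."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length events")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Elastic distance: textbook dynamic time warping, O(len(a) * len(b))."""
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also matches the previous b sample
                cost[i][j - 1],      # b[j-1] also matches the previous a sample
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[len(a)][len(b)]


def nearest_event(query, database, distance):
    """Brute-force recall: compare the query with every stored event."""
    distances = [distance(query, event) for event in database]
    best_index = min(range(len(distances)), key=distances.__getitem__)
    return best_index, distances[best_index]


# Toy single-channel "events" (made-up numbers, not EEG data).
database = [
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],  # narrow spike
    [0.0, 0.5, 0.5, 0.5, 0.5, 0.0],  # low, broad plateau
    [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],  # oscillation
]
query = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]  # the narrow spike, one sample later

print(nearest_event(query, database, euclidean_distance))  # (1, 1.0)
print(nearest_event(query, database, dtw_distance))        # (0, 0.0)
```

In this toy case the point-to-point Euclidean distance penalises the one-sample shift and returns the plateau, whereas DTW warps the time axis and returns the shifted spike. Both searches visit every stored event, so the cost of a query grows with the size of the database, which is the scaling problem the hashing techniques are intended to address.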
Data mining, Machine learning, EEG, Neonatal, Hashing
Murphy, B. 2019. Data mining and machine learning techniques for neonatal EEG event recall. PhD Thesis, University College Cork.