A distributed architecture for the monitoring and analysis of time series data


dc.contributor.advisor Morrison, John en
dc.contributor.author O'Reilly, Ruairi Donagh
dc.date.accessioned 2015-12-14T14:57:19Z
dc.date.available 2015-12-14T14:57:19Z
dc.date.issued 2015
dc.date.submitted 2015
dc.identifier.citation O'Reilly, R. D. 2015. A distributed architecture for the monitoring and analysis of time series data. PhD Thesis, University College Cork. en
dc.identifier.endpage 149
dc.identifier.uri http://hdl.handle.net/10468/2139
dc.description.abstract It is estimated that the quantity of digital data being transferred, processed or stored at any one time currently stands at 4.4 zettabytes (4.4 × 2^70 bytes) and this figure is expected to have grown by a factor of 10 to 44 zettabytes by 2020. Exploiting this data is, and will remain, a significant challenge. At present there is the capacity to store 33% of digital data in existence at any one time; by 2020 this capacity is expected to fall to 15%. These statistics suggest that, in the era of Big Data, the identification of important, exploitable data will need to be done in a timely manner. Systems for the monitoring and analysis of data, e.g. stock markets, smart grids and sensor networks, can be made up of massive numbers of individual components. These components can be geographically distributed yet may interact with one another via continuous data streams, which in turn may affect the state of the sender or receiver. This introduces a dynamic causality, which further complicates the overall system by introducing a temporal constraint that is difficult to accommodate. Practical approaches to realising the system described above have led to a multiplicity of analysis techniques, each of which concentrates on specific characteristics of the system being analysed and treats these characteristics as the dominant component affecting the results being sought. The multiplicity of analysis techniques introduces another layer of heterogeneity, that is, heterogeneity of approach, partitioning the field to the extent that results from one domain are difficult to exploit in another. The question asked is: can a generic solution for the monitoring and analysis of data be identified that accommodates temporal constraints; bridges the gap between expert knowledge and raw data; and enables data to be effectively interpreted and exploited in a transparent manner?
The approach proposed in this dissertation acquires, analyses and processes data in a manner that is free of the constraints of any particular analysis technique, while at the same time facilitating these techniques where appropriate. Constraints are applied by defining a workflow based on the production, interpretation and consumption of data. This supports the application of different analysis techniques on the same raw data without the danger of incorporating hidden bias that may exist. To illustrate and to realise this approach a software platform has been created that allows for the transparent analysis of data, combining analysis techniques with a maintainable record of provenance so that independent third-party analysis can be applied to verify any derived conclusions. In order to demonstrate these concepts, a complex real-world example involving the near real-time capturing and analysis of neurophysiological data from a neonatal intensive care unit (NICU) was chosen. A system was engineered to gather raw data, analyse that data using different analysis techniques, uncover information, incorporate that information into the system and curate the evolution of the discovered knowledge. The application domain was chosen for three reasons: firstly, because it is complex and no comprehensive solution exists; secondly, it requires tight interaction with domain experts, thus requiring the handling of subjective knowledge and inference; and thirdly, given the dearth of neurophysiologists, there is a real-world need to provide a solution for this domain. en
dc.format.mimetype application/pdf en
dc.language.iso en en
dc.publisher University College Cork en
dc.rights © 2015, Ruairi D. O'Reilly. en
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/ en
dc.subject Time series en
dc.subject Agent theory en
dc.subject Workflow en
dc.subject Stream computing en
dc.subject Complex event processing en
dc.subject Scientific method en
dc.title A distributed architecture for the monitoring and analysis of time series data en
dc.type Doctoral thesis en
dc.type.qualificationlevel Doctoral en
dc.type.qualificationname PhD (Science) en
dc.internal.availability Full text available en
dc.check.info No embargo required en
dc.description.version Accepted Version
dc.description.status Not peer reviewed en
dc.internal.school Computer Science en
dc.check.type No Embargo Required
dc.check.reason No embargo required en
dc.check.opt-out Not applicable en
dc.thesis.opt-out false
dc.check.embargoformat Not applicable en
ucc.workflow.supervisor j.morrison@cs.ucc.ie
dc.internal.conferring Autumn Conferring 2015
