Cloud-based machine learning architecture for big data analysis

dc.check.embargoformat: Embargo not applicable (If you have not submitted an e-thesis or do not want to request an embargo) [en]
dc.check.info: Not applicable [en]
dc.check.opt-out: No [en]
dc.check.reason: Not applicable [en]
dc.check.type: No Embargo Required
dc.contributor.advisor: Herbert, John [en]
dc.contributor.author: Pakdel, Rezvan
dc.contributor.funder: Irish Research Council for Science, Engineering and Technology [en]
dc.date.accessioned: 2020-01-10T11:56:24Z
dc.date.available: 2020-01-10T11:56:24Z
dc.date.issued: 2019
dc.date.submitted: 2019
dc.description.abstract: The use of machine learning that leverages large amounts of data (big data) is increasingly important in many areas of business and research. To help cope with the demanding resource requirements of these applications, solutions including hardware platforms (e.g. graphics cards), more efficient algorithms (e.g. deep learning algorithms), and special software environments (e.g. TensorFlow) have been developed. In addition, special optimisations are often developed for specific applications, based on the requirements of the particular application. This thesis also addresses the challenge of efficient machine learning over big data, but does so in a way that is complementary to specialised hardware and algorithms, and that is independent of application and data type. The thesis develops several types of general optimisations and implements them on top of an underlying generic machine learning architecture. The generic machine learning architecture includes stages for segmentation, feature extraction, model building and classification. The optimisation components enhance this architecture in a general way that works with any data type and any dataset; each optimisation responds to the needs of the particular application and adjusts itself to the particular dataset being processed. The optimisations developed are model optimisation, feature optimisation, resources optimisation, and cloud platform cost-benefit optimisation. Model optimisation involves evaluating multiple models in parallel, and using feedback on model performance to choose the best ones based on the dataset being processed. Feature optimisation involves evaluating various features and combinations of features, and then choosing those features that are most effective for classification. Resources optimisation involves dynamically adjusting compute instances to respond to the demands of an application. Cloud platform cost-benefit optimisation involves evaluating the cost of available public cloud compute instances, and determining appropriate cost-efficient instances depending on the needs of an application. General techniques of sampling, evaluation and feedback are used in several optimisation components. The underlying framework and optimisations have been implemented and deployed in a private cloud environment. Evaluation on various datasets (image and text datasets) has shown these optimisation components to be effective, and to provide useful generic components that can work in conjunction with other optimisations to address the challenging demands of machine learning over big data. [en]
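The model optimisation step summarised in the abstract (score several candidate models in parallel on a sample, then use the scores as feedback to keep the best) might be sketched roughly as follows. This is an illustrative sketch only, not the thesis implementation: the threshold classifiers, the function names `threshold_model`, `accuracy` and `select_best_models`, and the toy sample are all hypothetical, and accuracy is assumed as the performance measure.

```python
from concurrent.futures import ThreadPoolExecutor


def threshold_model(threshold):
    """Hypothetical stand-in for a trained model: class 1 if x >= threshold."""
    return lambda x: 1 if x >= threshold else 0


def accuracy(model, sample):
    """Fraction of (x, label) pairs in the sample that the model gets right."""
    return sum(model(x) == label for x, label in sample) / len(sample)


def select_best_models(models, sample, keep=1):
    """Score every candidate on the sample in parallel and keep the top `keep`."""
    names = list(models)
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(lambda name: accuracy(models[name], sample), names))
    # Feedback step: rank candidates by score, best first.
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:keep]


# Toy labelled sample (assumed for illustration): values of 5 or more are class 1.
sample = [(x, 1 if x >= 5 else 0) for x in range(10)]
candidates = {f"threshold_{t}": threshold_model(t) for t in (2, 5, 8)}

best = select_best_models(candidates, sample)
print(best)
```

In a real deployment the sample would be drawn from the dataset being processed, and the retained models would then be applied to the remaining data; those details are not given in this record.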
dc.description.status: Not peer reviewed [en]
dc.description.version: Accepted Version
dc.format.mimetype: application/pdf [en]
dc.identifier.citation: Pakdel, R. 2019. Cloud-based machine learning architecture for big data analysis. PhD Thesis, University College Cork. [en]
dc.identifier.endpage: 126 [en]
dc.identifier.uri: https://hdl.handle.net/10468/9481
dc.language.iso: en [en]
dc.publisher: University College Cork [en]
dc.rights: © 2019, Rezvan Pakdel. [en]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/ [en]
dc.subject: Big-data [en]
dc.subject: Cloud-computing [en]
dc.subject: Machine learning [en]
dc.thesis.opt-out: false
dc.title: Cloud-based machine learning architecture for big data analysis [en]
dc.type: Doctoral thesis [en]
dc.type.qualificationlevel: Doctoral [en]
dc.type.qualificationname: PhD [en]
ucc.workflow.supervisor: j.herbert@ucc.ie
Files
Original bundle
PhD_Thesis.pdf (9.87 MB, Adobe Portable Document Format): Full Text E-thesis
PhD_Thesis_abstract.pdf (61.71 KB, Adobe Portable Document Format): Abstract
License bundle
license.txt (5.62 KB, Item-specific license agreed upon to submission)