Interrogating annotated protein coding regions for hitherto undetected translation

dc.check.date: 2026-12-31
dc.contributor.advisor: Baranov, Pavel V.
dc.contributor.author: Fedorova, Alla
dc.contributor.funder: Science Foundation Ireland
dc.date.accessioned: 2023-09-27T09:01:05Z
dc.date.available: 2023-09-27T09:01:05Z
dc.date.issued: 2023
dc.date.submitted: 2023
dc.description.abstract: Ribosome profiling (Ribo-seq) is a technique that captures ribosome-protected fragments and sequences them. This powerful method enables the discovery of not yet annotated proteoforms and translated open reading frames (ORFs), even ones hidden within annotated protein coding regions. Here we combined Ribo-seq data with comparative genomics analysis to discover non-AUG-initiated proteoforms that arise from alternative translation start sites in-frame with annotated starts. Production of such non-AUG proteoforms falls into two scenarios. In the first, non-AUG proteoforms are generated as alternative proteoforms in addition to the annotated AUG-initiated ones; such products are called PANTs (Proteoforms with Alternative N-termini). In the second, a non-AUG codon is used exclusively as the translation start for the main protein product of the mRNA. In addition to discovering non-AUG proteoforms, we rebuilt and upgraded RiboGalaxy, an instance of the Galaxy platform for processing Ribo-seq data. This update enables prediction of novel translated ORFs from raw Ribo-seq reads using only an internet browser, with no need for locally installed software, making work with Ribo-seq data more accessible to the scientific community. Chapter 1 is an introductory chapter that describes PANTs, covering their different sources, their functions and methods for their discovery. Chapter 2 covers the development of a pipeline for detecting non-AUG N-terminally extended proteoforms in the human genome, which combines a phylogenetic approach with a Ribo-seq-based approach. It also describes the discovery of novel non-AUG N-terminal extensions using this pipeline and an attempt to characterise the functionality of those non-AUG N-termini.
Chapter 3 describes the phenomenon of exclusive non-AUG initiation, in which only the non-AUG-initiated proteoform is generated from an mRNA, unlike PANTs, where both non-AUG and AUG proteoforms are generated from the same mRNA. Reported proteoforms were analysed and novel candidates were predicted using Ribo-seq data. Chapter 4 reports the development of an update to RiboGalaxy, an interactive, user-friendly online platform for processing Ribo-seq data. It covers all steps from preprocessing of raw reads and quality control to transcriptomic and genomic alignments, which can then be visualised and analysed in Trips-viz and GWIPS-viz, the transcriptomic and genomic browsers for ribosome profiling data; together these tools comprise the Riboseq.org resource. The platform prepares ribosome profiling data for subsequent detection of translated ORFs in Trips-viz. The update includes moving the backend to a configuration manager (Ansible), updating tools, their dependencies and reference indices, and adding new tools that prepare files for easy upload to GWIPS-viz and Trips-viz.
dc.description.status: Not peer reviewed
dc.description.version: Accepted Version
dc.format.mimetype: application/pdf
dc.identifier.citation: Fedorova, A. 2023. Interrogating annotated protein coding regions for hitherto undetected translation. PhD Thesis, University College Cork.
dc.identifier.endpage: 222
dc.identifier.uri: https://hdl.handle.net/10468/15031
dc.language.iso: en
dc.publisher: University College Cork
dc.relation.project: info:eu-repo/grantAgreement/SFI/SFI Centres for Research Training Programme::Data and ICT Skills for the Future/18/CRT/6214/IE/SFI Centre for Research Training in Genomics Data Science/
dc.rights: © 2023, Alla Fedorova.
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Ribosome profiling
dc.subject: Ribo-seq
dc.subject: NonAUG initiation
dc.subject: Non-canonical translation
dc.subject: Evolution
dc.title: Interrogating annotated protein coding regions for hitherto undetected translation
dc.type: Doctoral thesis
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: PhD - Doctor of Philosophy
Files
Original bundle
Name:
FedorovaAD_PhD2023.pdf
Size:
63.91 MB
Format:
Adobe Portable Document Format
Description:
Full Text E-thesis
Name:
Submission for Examination Form
Size:
722.88 KB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
5.2 KB
Format:
Item-specific license agreed upon to submission