A bioinformatics approach to identify novel long, non-coding RNAs in breast cancer cell lines from an existing RNA-sequencing dataset

Date
2020-03-17
Authors
Zaheed, Oza
Samson, Julia
Dean, Kellie
Publisher
Elsevier
Abstract
Breast cancer research has traditionally centred on genomic alterations, hormone receptor status and changes in cancer-related proteins to provide new avenues for targeted therapies. With advances in next-generation sequencing technologies, long, non-coding RNAs (lncRNAs) have emerged as regulators of normal cellular events, with links to various disease states, including breast cancer. Here we describe our bioinformatic analyses of a previously published RNA sequencing (RNA-seq) dataset to identify lncRNAs with altered expression levels in a subset of breast cancer cell lines. From this dataset of 675 cancer cell lines, we selected a subset of 18 cell lines for our analyses: 16 breast cancer lines, one ductal carcinoma in situ line and one normal-like breast epithelial cell line. Principal component analysis demonstrated correlation with well-established categorisation methods of breast cancer (i.e. luminal A/B, HER2 enriched and basal-like A/B). Through detailed comparison of differentially expressed lncRNAs in each breast cancer sub-type with normal-like breast epithelial cells, we identified 15 lncRNAs with consistently altered expression, including three uncharacterised lncRNAs. Utilising data from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) project via Gene Expression Profiling Interactive Analysis (GEPIA2), we assessed the clinical relevance of several of the identified lncRNAs to invasive breast cancer. Lastly, we determined the relative expression level of six lncRNAs across a spectrum of breast cancer cell lines to experimentally confirm the findings of our bioinformatic analyses. Overall, we show that existing RNA-seq datasets, if re-analysed with modern bioinformatic tools, can provide a valuable resource to identify lncRNAs that could have important biological roles in oncogenesis and tumour progression.
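The sub-type comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `consistently_altered`, the log2 fold-change threshold of 1.0, the gene labels and all values below are hypothetical, and the fold changes are assumed to have been computed already against the normal-like line. The sketch simply keeps lncRNAs that change in the same direction in every sub-type.

```python
from typing import Dict, Set


def consistently_altered(
    log2fc_by_subtype: Dict[str, Dict[str, float]],
    threshold: float = 1.0,  # hypothetical cut-off, not taken from the paper
) -> Dict[str, Set[str]]:
    """Return lncRNAs up- or down-regulated in every sub-type.

    log2fc_by_subtype maps sub-type name -> {gene: log2 fold change
    relative to the normal-like cell line}.
    """
    up_sets = []
    down_sets = []
    for fold_changes in log2fc_by_subtype.values():
        # Genes passing the threshold in this sub-type, split by direction.
        up_sets.append({g for g, v in fold_changes.items() if v >= threshold})
        down_sets.append({g for g, v in fold_changes.items() if v <= -threshold})
    # A gene is "consistent" only if it appears in the same direction everywhere.
    up = set.intersection(*up_sets) if up_sets else set()
    down = set.intersection(*down_sets) if down_sets else set()
    return {"up": up, "down": down}


# Toy input with made-up gene labels and values, for illustration only.
toy = {
    "luminal": {"LNC_A": 2.1, "LNC_B": -1.8, "LNC_C": 0.3},
    "HER2_enriched": {"LNC_A": 1.5, "LNC_B": -2.2, "LNC_C": 1.9},
    "basal_like": {"LNC_A": 3.0, "LNC_B": -1.1, "LNC_C": -1.4},
}
result = consistently_altered(toy)
print(result)
```

In this toy input, `LNC_C` is excluded because its direction differs between sub-types, which is the sense of "consistently altered" assumed here.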
Keywords
Bioinformatics, Breast cancer, Ductal carcinoma in situ, Long non-coding RNAs (lncRNAs), RNA sequencing (RNA-seq), Quantitative reverse transcriptase polymerase chain reaction (qRT-PCR)
Citation
Zaheed, O., Samson, J. and Dean, K. (2020) ‘A bioinformatics approach to identify novel long, non-coding RNAs in breast cancer cell lines from an existing RNA-sequencing dataset’, Non-coding RNA Research, 5(2), pp. 48–59. https://doi.org/10.1016/j.ncrna.2020.02.004.