Statistical and machine learning techniques in human microbiome studies: contemporary challenges and solutions
Pau, Enrique Carrillo De Santa
Frontiers Media S.A.
The human microbiome has emerged as a central research topic in human biology and biomedicine. Current microbiome studies generate high-throughput omics data across different body sites, populations, and life stages. Many of the challenges in microbiome research are similar to those in other high-throughput studies: quantitative analyses need to address the heterogeneity of data, specific statistical properties, and the remarkable variation in microbiome composition across individuals and body sites. This has led to a broad spectrum of statistical and machine learning challenges that range from study design, data processing, and standardization to analysis, modeling, cross-study comparison, prediction, data science ecosystems, and reproducible reporting. Although many statistical and machine learning approaches and tools have been developed, new techniques are needed to deal with emerging applications and the vast heterogeneity of microbiome data. We review and discuss emerging applications of statistical and machine learning techniques in human microbiome studies and introduce the COST Action CA18131 "ML4Microbiome", which brings together microbiome researchers and machine learning experts to address current challenges such as standardizing analysis pipelines so that data analysis results are reproducible, benchmarking, and improving existing tools and ontologies or developing new ones.
Machine learning, Microbiome, ML4Microbiome, Personalized medicine, Biomarker identification
Moreno-Indias, I., Lahti, L., Nedyalkova, M., Elbere, I., Roshchupkin, G., Adilovic, M., Aydemir, O., Bakir-Gungor, B., Pau, E. C. D. S., D'Elia, D., Desai, M., Falquet, L., Gundogdu, A., Hron, K., Klammsteiner, T., Lopes, M. B., Marcos-Zambrano, L. J., Marques, C., Mason, M., May, P., Pasic, L., Pio, G., Pongor, S., Promponas, V. J., Przymus, P., Saez-Rodriguez, J., Sampri, A., Shigdel, R., Stres, B., Suharoschi, R., Truu, J., Truica, C-O., Vilne, B., Vlachakis, D. P., Yilmaz, E., Zeller, G., Zomer, A., Gomez-Cabrero, D. and Claesson, M. J. (2021) 'Statistical and machine learning techniques in human microbiome studies: contemporary challenges and solutions', Frontiers In Microbiology, 12, 635781, (9pp). doi: 10.3389/fmicb.2021.635781
© 2021 Moreno-Indias, Lahti, Nedyalkova, Elbere, Roshchupkin, Adilovic, Aydemir, Bakir-Gungor, Santa Pau, D’Elia, Desai, Falquet, Gundogdu, Hron, Klammsteiner, Lopes, Marcos-Zambrano, Marques, Mason, May, Pašić, Pio, Pongor, Promponas, Przymus, Saez-Rodriguez, Sampri, Shigdel, Stres, Suharoschi, Truu, Truică, Vilne, Vlachakis, Yilmaz, Zeller, Zomer, Gómez-Cabrero and Claesson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.