Comparative analysis of chemical descriptors by machine learning reveals atomistic insights into solute–lipid interactions

dc.contributor.author: Lange, Justus Johann [en]
dc.contributor.author: Anelli, Andrea [en]
dc.contributor.author: Alsenz, Jochem [en]
dc.contributor.author: Kuentz, Martin [en]
dc.contributor.author: O'Dwyer, Patrick J. [en]
dc.contributor.author: Saal, Wiebke [en]
dc.contributor.author: Wyttenbach, Nicole [en]
dc.contributor.author: Griffin, Brendan T. [en]
dc.contributor.funder: H2020 Marie Skłodowska-Curie Actions [en]
dc.contributor.funder: Horizon 2020 [en]
dc.date.accessioned: 2024-06-11T15:33:24Z
dc.date.available: 2024-06-11T15:33:24Z
dc.date.issued: 2024-05-23 [en]
dc.description.abstract: This study explores the research area of drug solubility in lipid excipients, an area persistently complex despite recent advancements in understanding and predicting solubility based on molecular structure. To this end, this research investigated novel descriptor sets, employing machine learning techniques to understand the determinants governing interactions between solutes and medium-chain triglycerides (MCTs). Quantitative structure-property relationships (QSPR) were constructed on an extended solubility data set comprising 182 experimental values of structurally diverse drug molecules, including both development and marketed drugs to extract meaningful property relationships. Four classes of molecular descriptors, ranging from traditional representations to complex geometrical descriptions, were assessed and compared in terms of their predictive accuracy and interpretability. These include two-dimensional (2D) and three-dimensional (3D) descriptors, Abraham solvation parameters, extended connectivity fingerprints (ECFPs), and the smooth overlap of atomic position (SOAP) descriptor. Through testing three distinct regularized regression algorithms alongside various preprocessing schemes, the SOAP descriptor enabled the construction of a superior performing model in terms of interpretability and accuracy. Its atom-centered characteristics allowed contributions to be estimated at the atomic level, thereby enabling the ranking of prevalent molecular motifs and their influence on drug solubility in MCTs. The performance on a separate test set demonstrated high predictive accuracy (RMSE = 0.50) for 2D and 3D, SOAP, and Abraham Solvation descriptors. The model trained on ECFP4 descriptors resulted in inferior predictive accuracy. Lastly, uncertainty estimations for each model were introduced to assess their applicability domains and provide information on where the models may extrapolate in chemical space and, thus, where more data may be necessary to refine a data-driven approach to predict solubility in MCTs. Overall, the presented approaches further enable computationally informed formulation development by introducing a novel in silico approach for rational drug development and prediction of dose loading in lipids. [en]
dc.description.status: Peer reviewed [en]
dc.description.version: Published Version [en]
dc.format.mimetype: application/pdf [en]
dc.identifier.articleid: acs.molpharmaceut.4c00080 [en]
dc.identifier.citation: Lange, J.J., Anelli, A., Alsenz, J., Kuentz, M., O’Dwyer, P.J., Saal, W., Wyttenbach, N. and Griffin, B.T. (2024) ‘Comparative analysis of chemical descriptors by machine learning reveals atomistic insights into solute–lipid interactions’, Molecular Pharmaceutics, acs.molpharmaceut.4c00080 (13 pp). Available at: https://doi.org/10.1021/acs.molpharmaceut.4c00080. [en]
dc.identifier.doi: https://doi.org/10.1021/acs.molpharmaceut.4c00080 [en]
dc.identifier.endpage: 13 [en]
dc.identifier.issn: 1543-8384 [en]
dc.identifier.issn: 1543-8392 [en]
dc.identifier.journaltitle: Molecular Pharmaceutics [en]
dc.identifier.startpage: 1 [en]
dc.identifier.uri: https://hdl.handle.net/10468/15997
dc.language.iso: en [en]
dc.publisher: ACS American Chemical Society [en]
dc.relation.ispartof: Molecular Pharmaceutics [en]
dc.relation.project: info:eu-repo/grantAgreement/EC/H2020::MSCA-ITN-EID/955756/EU/A fully integrated, animal-free, end-to-end modelling approach to oral drug product development/InPharma [en]
dc.rights: © 2024 The Authors. Published by American Chemical Society. This publication is licensed under CC-BY 4.0. [en]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en]
dc.subject: Smooth overlap of atomic positions (SOAP) [en]
dc.subject: Machine learning [en]
dc.subject: Solubility prediction [en]
dc.subject: Lipids [en]
dc.subject: Lipid based formulations [en]
dc.subject: Quantitative-structure−property-relationships (QSPR) [en]
dc.title: Comparative analysis of chemical descriptors by machine learning reveals atomistic insights into solute–lipid interactions [en]
dc.type: Article (peer-reviewed) [en]
Files
Original bundle
Name: lange-et-al-2024-comparative.pdf
Size: 4.09 MB
Format: Adobe Portable Document Format
Description: Published version
License bundle
Name: license.txt
Size: 2.71 KB
Format: Item-specific license agreed upon to submission