OP-10 New machine learning approaches to estimate the functional consequence of mutations in diverse human populations
Presenting Author: Yuval Itan, Icahn School of Medicine at Mount Sinai
Co-Author(s): Cigdem Sevim Bayrak, Icahn School of Medicine at Mount Sinai; Yiming Wu, Icahn School of Medicine at Mount Sinai; David Stein, Icahn School of Medicine at Mount Sinai; David Cooper, Cardiff University; Peter Stenson, Cardiff University; Avner Schlessinger, Icahn School of Medicine at Mount Sinai; Judy Cho, Icahn School of Medicine at Mount Sinai
Abstract: The genome of a patient with a genetic disease contains about 20,000 non-synonymous variants, of which only one (or a few) is disease-causing. Current computational methods cannot predict the functional consequence of a mutation, that is, whether it results in gain-of-function (GOF) or loss-of-function (LOF). Moreover, computational predictions of mutation pathogenicity still lack specificity when applied to diverse human genetic data. Here we present two novel approaches to address these shortcomings: (1) a machine learning study to computationally differentiate GOF from LOF mutations, using natural language processing (NLP) and feature selection to generate the first large-scale database of human inherited GOF and LOF mutations; and (2) a deep learning neural network approach to classify mutations by Human Phenotype Ontology (HPO) disease group. We demonstrate the utility of combining our state-of-the-art approaches with gold-standard methods in case-control studies across different diseases, including severe COVID-19 and inflammatory bowel disease (IBD), where we discovered novel genetic etiologies.