Artificial Intelligence in Medicine, Journal Year: 2017, Volume and Issue: 85, P. 43 - 49
Published: Sept. 22, 2017
Language: English
Nature Communications, Journal Year: 2019, Volume and Issue: 10(1)
Published: July 25, 2019
Abstract: A historical tendency to use European ancestry samples hinders medical genetics research, including the use of polygenic scores, which are individual-level metrics of genetic risk. We analyze the first decade of polygenic scoring studies (2008–2017, inclusive), and find that 67% of studies included exclusively European ancestry participants and another 19% included only East Asian ancestry participants. Only 3.8% of studies were among cohorts of African, Hispanic, or Indigenous peoples. We find that the predictive performance of European ancestry-derived polygenic scores is lower in non-European ancestry samples (e.g. African ancestry samples: t = −5.97, df = 24, p = 3.7 × 10⁻⁶), and we demonstrate the effects of methodological choices in polygenic score distributions for worldwide populations. These findings highlight the need for improved treatment of linkage disequilibrium and variant frequencies when applying polygenic scoring to cohorts of non-European ancestry, and bolster the rationale for large-scale GWAS in diverse human populations.
Language: English
Citations: 894
Frontiers in Bioinformatics, Journal Year: 2022, Volume and Issue: 2
Published: June 27, 2022
Machine learning has shown utility in detecting patterns within large, unstructured, and complex datasets. One of the promising applications of machine learning is in precision medicine, where disease risk is predicted using patient genetic data. However, creating an accurate prediction model based on genotype data remains challenging due to the so-called “curse of dimensionality” (i.e., a far larger number of features than samples). Therefore, the generalizability of machine learning models benefits from feature selection, which aims to extract only the most “informative” features and remove noisy, “non-informative,” irrelevant, and redundant features. In this article, we provide a general overview of the different feature selection methods, their advantages, disadvantages, and use cases, focusing on the detection of relevant features (i.e., SNPs) for disease risk prediction.
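As a rough illustration of the filter-style feature selection mentioned in this abstract, the sketch below ranks SNPs by the absolute Pearson correlation between their allele dosages and a quantitative phenotype, then keeps the top k. The function names, the 0/1/2 dosage coding, and the toy data are assumptions made for this example; the article itself surveys many methods and does not prescribe this one.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no signal
    return sxy / sqrt(sxx * syy)

def select_top_k(genotypes, phenotype, k):
    """Return indices of the k SNPs most correlated with the phenotype.

    genotypes: one row per sample, one column per SNP (assumed 0/1/2 dosages).
    """
    n_snps = len(genotypes[0])
    scores = []
    for j in range(n_snps):
        column = [row[j] for row in genotypes]
        scores.append((abs(pearson(column, phenotype)), j))
    # Highest score first; ties broken by SNP index for reproducibility.
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scores[:k]]

# Toy data (hypothetical): 4 samples, 4 SNPs.
genotypes = [
    [0, 1, 2, 0],
    [1, 1, 0, 0],
    [2, 1, 1, 0],
    [2, 1, 2, 1],
]
phenotype = [0.0, 1.0, 2.0, 2.0]
top_snps = select_top_k(genotypes, phenotype, 2)
```

A univariate filter like this is cheap but scores each SNP in isolation, so it can keep redundant, correlated markers and cannot detect interaction effects.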
Language: English
Citations: 406
Molecular Psychiatry, Journal Year: 2018, Volume and Issue: 24(3), P. 409 - 420
Published: Jan. 9, 2018
Language: English
Citations: 360
Frontiers in Genetics, Journal Year: 2019, Volume and Issue: 10
Published: March 27, 2019
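In the past decade, precision genomics based medicine has emerged to provide tailored and effective healthcare for patients depending upon their genetic features. Genome Wide Association Studies have also identified population-based risk genetic variants for common and complex diseases. In order to meet the full promise of precision medicine, research is attempting to leverage our increasing genomic understanding and further develop personalized medical healthcare through ever more accurate disease risk prediction models. Polygenic risk scoring and machine learning are two primary approaches for disease risk prediction. Despite recent improvements, the results of polygenic risk scoring remain limited due to the approaches that are currently used. By contrast, machine learning algorithms have increased predictive abilities for complex disease risk. This increase in predictive abilities results from the ability of machine learning algorithms to handle multi-dimensional data. Here, we provide an overview of polygenic risk scoring and machine learning in complex disease risk prediction. We highlight recent machine learning application developments and describe how machine learning approaches can lead to improved complex disease prediction, which will help to incorporate genetic features into future personalized healthcare. Finally, we discuss how the future application of machine learning prediction models might help manage complex disease by providing tissue-specific targets for customized, preventive interventions.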
Language: English
Citations: 181
Frontiers in Genetics, Journal Year: 2018, Volume and Issue: 9
Published: July 4, 2018
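The analysis of large genomic data is hampered by issues such as a small number of observations and a large number of predictive variables (commonly known as "large P small N"), high dimensionality, or highly correlated data structures. Machine learning methods are renowned for dealing with these problems. To date, machine learning methods have been applied in Genome-Wide Association Studies for identification of candidate genes, epistasis detection, gene network pathway analyses, and genomic prediction of phenotypic values. However, the utility of two machine learning methods, Gradient Boosting Machine (GBM) and Extreme Gradient Boosting Method (XgBoost), in identifying a subset of SNP markers for genomic prediction of breeding values has never been explored before. In this study, using 38,082 SNP markers and body weight phenotypes from 2,093 Brahman cattle (1,097 bulls as a discovery population and 996 cows as a validation population), we examined the efficiency of three machine learning methods, namely Random Forests (RF), GBM, and XgBoost, in (a) the identification of the top 400, 1,000, and 3,000 ranked SNPs; (b) using the subsets of SNPs to construct genomic relationship matrices (GRMs) for the estimation of genomic breeding values (GEBVs). For comparison purposes, we also calculated the GEBVs from (1) 400, 1,000, and 3,000 SNPs that were randomly selected and evenly spaced across the genome, and (2) all the SNPs. We found that RF and especially GBM are efficient methods in identifying a subset of SNPs with direct links to candidate genes affecting the growth trait. In comparison to the estimate of prediction accuracy of GEBVs from using all SNPs (0.43), the 3,000 top SNPs identified by RF (0.42) and GBM (0.46) had values similar to those of the whole SNP panel. The performance of the subsets of SNPs from the RF and GBM methods was substantially better than that of evenly spaced subsets across the genome (0.18-0.29). Of the three methods, RF and GBM consistently outperformed XgBoost in genomic prediction accuracy.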
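To make the step "use a subset of SNPs to construct a genomic relationship matrix" more concrete, here is a minimal sketch. The abstract does not say which GRM formula the authors used; this example assumes the widely used VanRaden (2008) first method, G = ZZ' / (2 Σ p(1 − p)), with 0/1/2 allele dosages and allele frequencies estimated from the sample itself. The function name, the snp_indices parameter, and the toy data are hypothetical.

```python
def genomic_relationship_matrix(genotypes, snp_indices=None):
    """Build a GRM from selected SNP columns (assumed VanRaden method 1).

    genotypes: one row per animal, one column per SNP, coded 0/1/2.
    snp_indices: columns to use (e.g. top-ranked SNPs); all columns if None.
    """
    if snp_indices is None:
        snp_indices = range(len(genotypes[0]))
    snp_indices = list(snp_indices)
    n = len(genotypes)

    # Allele frequency of the counted allele at each selected SNP.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in snp_indices]

    # Scaling factor: 2 * sum of p * (1 - p) over the selected SNPs.
    denom = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if denom == 0:
        raise ValueError("all selected SNPs are monomorphic")

    # Centre each dosage by its expected value 2p.
    centered = [
        [row[j] - 2.0 * p for j, p in zip(snp_indices, freqs)]
        for row in genotypes
    ]

    # G = Z Z' / denom, computed entry by entry.
    return [
        [sum(a * b for a, b in zip(ri, rk)) / denom for rk in centered]
        for ri in centered
    ]

# Toy data (hypothetical): 3 animals, 3 SNPs; use only SNPs 0 and 2.
genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 1, 1],
]
grm = genomic_relationship_matrix(genotypes, snp_indices=[0, 2])
```

In practice the resulting matrix would feed a mixed-model (GBLUP-style) solver to estimate breeding values; that step is outside the scope of this sketch.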
Language: English
Citations: 168
Advances in genetics, Journal Year: 2019, Volume and Issue: unknown, P. 75 - 154
Published: Jan. 1, 2019
Language: English
Citations: 149
Nature, Journal Year: 2023, Volume and Issue: 616(7955), P. 123 - 131
Published: March 29, 2023
Language: English
Citations: 79
Nature Methods, Journal Year: 2023, Volume and Issue: 20(6), P. 803 - 814
Published: May 29, 2023
Language: English
Citations: 46
Scientific Reports, Journal Year: 2024, Volume and Issue: 14(1)
Published: March 11, 2024
Abstract: To explore a robust tool for advancing digital breeding practices through an artificial intelligence-driven phenotype prediction expert system, we undertook a thorough analysis of 11 non-linear regression models. Our investigation specifically emphasized the significance of Support Vector Regression (SVR) and SHapley Additive exPlanations (SHAP) in predicting soybean branching. By using branching data (phenotype) of 1918 soybean accessions and 42 k SNP (Single Nucleotide Polymorphism) polymorphic data (genotype), this study systematically compared 11 non-linear regression AI models, including four deep learning models (DBN (deep belief network) regression, ANN (artificial neural network) regression, Autoencoders regression, and MLP (multilayer perceptron) regression) and seven machine learning models (e.g., SVR (support vector regression), XGBoost (eXtreme Gradient Boosting) regression, Random Forest regression, LightGBM regression, GPs (Gaussian processes) regression, Decision Tree regression, and Polynomial regression). After being evaluated by four valuation metrics, R² (R-squared), MAE (Mean Absolute Error), MSE (Mean Squared Error), and MAPE (Mean Absolute Percentage Error), it was found that the SVR, Polynomial Regression, DBN, and Autoencoder outperformed the other models and could obtain better prediction accuracy when they were used for phenotype prediction. In the assessment of the approaches, we exemplified the SVR model, conducting analyses on feature importance and gene ontology (GO) enrichment to provide comprehensive support. After comprehensively comparing four feature importance algorithms, no notable distinction was observed in the feature importance ranking scores across the four algorithms, namely Variable Ranking, Permutation, SHAP, and Correlation Matrix, but the SHAP value could provide rich information on genes with negative contributions, and SHAP importance was chosen for feature selection. The results of this study offer valuable insights into AI-mediated plant breeding, addressing challenges faced by traditional breeding programs. The method developed has broad applicability in phenotype prediction, minor QTL (quantitative trait loci) mining, and plant smart-breeding systems, contributing significantly to the advancement of AI-based breeding practices and transitioning from experience-based to data-based breeding.
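For reference, the four evaluation metrics named in this abstract can be computed as in the sketch below. It uses the standard textbook definitions, which are assumed here rather than taken from the paper; the function name and the toy values are hypothetical. Note that MAPE is undefined when any observed value is zero, which matters for count-like traits such as branch number.

```python
def regression_metrics(y_true, y_pred):
    """Return R2, MAE, MSE and MAPE (in percent) for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n           # Mean Absolute Error
    mse = sum(e * e for e in errors) / n            # Mean Squared Error
    # Mean Absolute Percentage Error; raises ZeroDivisionError if any y_true is 0.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # R-squared: 1 - residual sum of squares / total sum of squares.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"R2": r2, "MAE": mae, "MSE": mse, "MAPE": mape}

# Toy observed vs. predicted phenotype values (hypothetical).
metrics = regression_metrics([2.0, 4.0, 5.0, 8.0], [2.5, 3.5, 5.0, 7.0])
```

Reporting several metrics together is useful because they penalize errors differently: MSE weights large errors heavily, MAE treats all errors linearly, and MAPE is relative to the size of the observed value.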
Language: English
Citations: 33
The Lancet Oncology, Journal Year: 2016, Volume and Issue: 18(1), P. 132 - 142
Published: Nov. 16, 2016
Language: English
Citations: 139