Advances in intelligent systems and computing, Journal Year: 2014, Volume and Issue: unknown, P. 577 - 588
Published: Sept. 29, 2014
Language: English
Nature Reviews Genetics, Journal Year: 2015, Volume and Issue: 16(6), P. 321 - 332
Published: May 7, 2015
Language: English
Citations
1635
American Journal of Epidemiology, Journal Year: 2019, Volume and Issue: unknown
Published: Aug. 15, 2019
Abstract. Machine learning is a branch of computer science that has the potential to transform epidemiologic sciences. Amid a growing focus on “Big Data,” it offers epidemiologists new tools to tackle problems for which classical methods are not well-suited. In order to critically evaluate the value of integrating machine learning algorithms and existing methods, however, it is essential to address the language and technical barriers between the two fields that can make it difficult to read and assess machine learning studies. Here, we provide an overview of the concepts and terminology used in the machine learning literature, which encompasses a diverse set of tools with goals ranging from prediction to classification to clustering. We provide a brief introduction to 5 common machine learning algorithms and 4 ensemble-based approaches. We then summarize applications of machine learning techniques in the published epidemiologic literature. We recommend approaches to incorporate machine learning in epidemiologic research and discuss opportunities and challenges for integrating machine learning with existing epidemiologic research methods.
Language: English
Citations
446
International Journal of Methods in Psychiatric Research, Journal Year: 2017, Volume and Issue: 26(3)
Published: July 4, 2017
The US Veterans Health Administration (VHA) has begun using predictive modeling to identify Veterans at high suicide risk to target care. Initial analyses are reported here. A penalized logistic regression model was compared with an earlier proof-of-concept logistic model. Exploratory analyses then considered commonly-used machine learning algorithms. Analyses were based on electronic medical records for all 6,360 individuals classified in the National Death Index as having died by suicide in fiscal years 2009-2011 who used VHA services in the year of their death or the prior year, and a 1% probability sample of time-matched VHA service users alive at the index date (n = 2,112,008). A penalized logistic model with 61 predictors had sensitivity comparable to the proof-of-concept model (which had 381 predictors) at target thresholds. The machine learning algorithms had relatively similar sensitivities, the highest being for Bayesian additive regression trees, with 10.7% of suicides occurring among the 1.0% of Veterans with highest predicted risk and 28.1% among the 5.0% with highest predicted risk. Based on these results, the penalized logistic model is being used for initial intervention implementation. The paper concludes with a discussion of other practical issues that might be explored to increase model performance.
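The sensitivity figures quoted in this abstract (the share of all deaths that fall among the 1.0% or 5.0% of individuals with the highest predicted risk) can be illustrated with a short sketch. The function name `sensitivity_at_top_fraction` and the toy data are hypothetical and are not taken from the paper; the sketch only assumes that each individual has a predicted risk score and a binary outcome.

```python
import math


def sensitivity_at_top_fraction(risk_scores, outcomes, fraction):
    """Share of all positive outcomes found in the top `fraction` of predicted risk.

    risk_scores: predicted risks, one per individual (higher means riskier).
    outcomes:    0/1 observed outcomes, aligned with risk_scores.
    fraction:    size of the high-risk group as a proportion, e.g. 0.01 for 1%.
    """
    if len(risk_scores) != len(outcomes):
        raise ValueError("risk_scores and outcomes must have the same length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total_positives = sum(outcomes)
    if total_positives == 0:
        raise ValueError("at least one positive outcome is required")

    # Size of the high-risk group, rounded up so it is never empty.
    n_top = max(1, math.ceil(fraction * len(risk_scores)))

    # Rank individuals from highest to lowest predicted risk.
    ranked = sorted(zip(risk_scores, outcomes), key=lambda pair: pair[0], reverse=True)
    captured = sum(outcome for _, outcome in ranked[:n_top])
    return captured / total_positives


# Toy data (hypothetical): 10 individuals, 3 positive outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
observed = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
top_half_sensitivity = sensitivity_at_top_fraction(scores, observed, 0.5)
```

With these toy values, two of the three positive outcomes fall in the top half of predicted risk, so the sensitivity at the 50% threshold is 2/3.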
Language: English
Citations
173
Epidemiology and Psychiatric Sciences, Journal Year: 2016, Volume and Issue: 26(1), P. 22 - 36
Published: Jan. 26, 2016
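Background. Clinicians need guidance to address the heterogeneity of treatment responses of patients with major depressive disorder (MDD). While prediction schemes based on symptom clustering and biomarkers have so far not yielded results of sufficient strength to inform clinical decision-making, prediction schemes based on big data predictive analytic models might be more practically useful. Method. We review evidence suggesting that prediction equations based on symptoms and other easily-assessed clinical features found in previous research to predict MDD treatment outcomes might provide a foundation for developing predictive analytic clinical decision support models that could help clinicians select optimal (personalised) MDD treatments. These methods could also be useful in targeting patient subsamples for more expensive biomarker assessments. Results. Approximately two dozen baseline variables obtained from medical records or patient reports have been found repeatedly in MDD treatment trials to predict overall treatment outcomes (i.e., intervention v. control) or differential treatment outcomes (i.e., intervention A v. intervention B). Similar evidence has been found in observational studies of MDD persistence-severity. However, no treatment studies have yet attempted to develop treatment outcome equations using the full set of these predictors. Promising preliminary empirical results coupled with recent developments in statistical methodology suggest that models could be developed to provide useful clinical decision support in personalised treatment selection. These tools could also provide a strong foundation to increase statistical power in focused studies of biomarkers and MDD heterogeneity of treatment response in subsequent controlled trials. Conclusions. Coordinated efforts are needed to develop a protocol for systematically collecting information about established predictors of heterogeneity of MDD treatment response in large observational treatment studies, applying and refining these models in subsequent pragmatic trials, carrying out pooled secondary analyses to extract the maximum amount of information from these coordinated studies, and using this information to focus future discovery efforts in the segment of the patient population in which continued uncertainty about treatment response exists.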
Language: English
Citations
170
PLoS Genetics, Journal Year: 2014, Volume and Issue: 10(11), P. e1004754 - e1004754
Published: Nov. 13, 2014
Compared to univariate analysis of genome-wide association (GWA) studies, machine learning-based models have been shown to provide improved means of learning such multilocus panels of genetic variants and their interactions that are most predictive of complex phenotypic traits. Many applications of predictive modeling rely on effective variable selection, often implemented through model regularization, which penalizes the model complexity and enables predictions in individuals outside of the training dataset. However, the different regularization approaches may also lead to considerable differences, especially in the number of genetic variants needed for maximal predictive accuracy, as illustrated here in examples from both disease classification and quantitative trait prediction. We also highlight the potential pitfalls of the regularized machine learning models, related to issues such as model overfitting to the training data, which may lead to overoptimistic prediction results, as well as identifiability of the predictive variants, which is important in many medical applications. While genetic risk prediction for human diseases is used as a motivating use case, we argue that these models are also widely applicable in nonhuman applications, such as animal and plant breeding, where accurate genotype-to-phenotype modeling is needed. Finally, we discuss some key future advances, open questions and challenges in this developing field, when moving toward low-frequency variants and cross-phenotype interactions.
Language: English
Citations
147
Journal of Biomedical Informatics, Journal Year: 2018, Volume and Issue: 85, P. 30 - 39
Published: July 29, 2018
Language: English
Citations
105
Computers in Biology and Medicine, Journal Year: 2021, Volume and Issue: 131, P. 104249 - 104249
Published: Feb. 2, 2021
Language: English
Citations
93
Natural Hazards and Earth System Sciences, Journal Year: 2021, Volume and Issue: 21(2), P. 807 - 822
Published: March 1, 2021
Abstract. Pre-disaster planning and mitigation necessitate detailed spatial information about flood hazards and their associated risks. In the US, the Federal Emergency Management Agency (FEMA) Special Flood Hazard Area (SFHA) provides important information about areas subject to flooding during the 1 % riverine or coastal event. The binary nature of flood hazard maps obscures the distribution of property risk inside the SFHA and the residual risk outside the SFHA, which can undermine mitigation efforts. Machine learning techniques provide an alternative approach to estimating flood hazards across large spatial scales at low computational expense. This study presents a pilot study for the Texas Gulf Coast region using random forest classification to predict flood probability across a 30 523 km² area. Using a record of National Flood Insurance Program (NFIP) claims dating back to 1976 and high-resolution geospatial data, we generate a continuous flood hazard map for 12 US Geological Survey (USGS) eight-digit hydrologic unit code (HUC) watersheds. Results indicate that the random forest model predicts flooding with a high sensitivity (area under the curve, AUC: 0.895), especially compared to the existing FEMA regulatory floodplain. Our model identifies 649 000 structures with at least a 1 % annual chance of flooding, roughly 3 times more than are currently identified by FEMA as flood-prone.
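For readers unfamiliar with the AUC figure reported in this abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from predicted probabilities and binary labels. It uses the rank interpretation of AUC (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case). The function name `roc_auc` and the toy data are hypothetical; the paper's own evaluation pipeline is not described here.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: predicted probabilities or scores (higher means more likely positive).
    labels: 0/1 ground-truth labels aligned with scores.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): two flooded and two non-flooded locations.
predicted = [0.9, 0.8, 0.4, 0.3]
flooded = [1, 0, 1, 0]
example_auc = roc_auc(predicted, flooded)
```

In the toy data, three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75. The pairwise loop is quadratic in the number of samples; it is meant for illustration rather than for data sets of the size used in the study.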
Language: English
Citations
57
Journal of the American Medical Informatics Association, Journal Year: 2013, Volume and Issue: 20(4), P. 630 - 636
Published: Feb. 9, 2013
Epistasis has been historically used to describe the phenomenon that the effect of a given gene on a phenotype can be dependent on one or more other genes, and is an essential element for understanding the association between genetic and phenotypic variations. Quantifying epistasis of orders higher than two is very challenging due to both the computational complexity of enumerating all possible combinations in genome-wide data and the lack of efficient and effective methodologies. In this study, we propose a fast, non-parametric, and model-free measure for three-way epistasis. Such a measure is based on information gain, and is able to separate all lower order effects from pure three-way epistasis. Our method was verified on synthetic data and applied to real data from a candidate-gene study of tuberculosis in a West African population. In the tuberculosis data, we found a statistically significant pure three-way epistatic interaction effect that was stronger than any lower-order associations. Our study provides a methodological basis for detecting and characterizing high-order gene-gene interactions in genetic association studies.
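The idea of an information-gain measure that strips lower-order effects from a three-way interaction can be sketched as follows. This is an illustrative formulation, assuming the three-way gain is the mutual information between the joint genotype (A, B, C) and the phenotype D, minus all pairwise gains and all single-locus mutual informations; the paper's exact definition and estimator may differ. All function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Shannon entropy (in bits) of the joint distribution of the given columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((count / n) * log2(count / n) for count in Counter(rows).values())


def mutual_info(features, target):
    """I(features; target) = H(features) + H(target) - H(features, target)."""
    return entropy(*features) + entropy(target) - entropy(*features, target)


def pairwise_gain(a, b, d):
    """Information about d carried by (a, b) jointly beyond their main effects."""
    return mutual_info([a, b], d) - mutual_info([a], d) - mutual_info([b], d)


def three_way_gain(a, b, c, d):
    """Information about d carried by (a, b, c) beyond all lower-order effects."""
    main_effects = mutual_info([a], d) + mutual_info([b], d) + mutual_info([c], d)
    pair_effects = pairwise_gain(a, b, d) + pairwise_gain(a, c, d) + pairwise_gain(b, c, d)
    return mutual_info([a, b, c], d) - pair_effects - main_effects


# Synthetic example (hypothetical): the phenotype is the parity of three binary loci,
# so no single locus or pair of loci is informative on its own.
locus_a = [0, 0, 0, 0, 1, 1, 1, 1]
locus_b = [0, 0, 1, 1, 0, 0, 1, 1]
locus_c = [0, 1, 0, 1, 0, 1, 0, 1]
phenotype = [x ^ y ^ z for x, y, z in zip(locus_a, locus_b, locus_c)]
pure_three_way = three_way_gain(locus_a, locus_b, locus_c, phenotype)
```

In this parity example every main effect and pairwise gain is zero while the joint genotype determines the phenotype completely, so the three-way gain equals one bit. On real genotype data the entropies are plug-in estimates from observed frequencies, and significance would need to be assessed separately, for example by permutation testing.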
Language: English
Citations
78
Journal of Medical Systems, Journal Year: 2019, Volume and Issue: 43(2)
Published: Jan. 5, 2019
Language: English
Citations
71