Impact of genotype × environment interaction and selection history on genomic prediction in maize (Zea mays L.)
Martin Costa, James B. Holland, Natalia de León, et al.

Crop Science, Journal Year: 2024, Volume and Issue: 64(6), P. 3293 - 3310

Published: Oct. 15, 2024

Abstract Breeders made remarkable progress in improving productivity and stability of cultivars. Breeding relies on selecting favorable alleles for performance to produce productive varieties across diverse environments. In this study, we analyzed the Genomes to Fields Initiative 2018–2019 genotype by environment interaction (G × E) dataset, focusing on three populations of double haploid (DH) lines derived from crossing the expired Plant Variety Protection (ex‐PVP) inbred line PHW65 with PHN11, Mo44, and MoG. PHW65 is an Iodent/Lancaster‐type inbred; PHN11 is an Iodent type ex‐PVP line; Mo44 is a tropical‐derived inbred; and MoG is an agronomically poor inbred derived from the variety Mastadon. Hybrids were produced by crossing the resulting DHs with the Stiff Stalk testers PHT69 and LH195. The study's objective was to determine the donor inbreds' relative value and to understand the impact of selection history on genomic prediction. We conducted a two‐stage analysis to compare hybrid G × E variance across populations. yield significantly lower population population. reduced led increased indirect prediction accuracy (when training and testing data are drawn from the same population but different environments). cross‐validation, had greatest 45% time, followed (30%) (25%). Results demonstrate that greater longest (PHN11), contributing stability.

Language: English

Leveraging data from the Genomes-to-Fields Initiative to investigate genotype-by-environment interactions in maize in North America
Marco Lopez‐Cruz, Fernando Aguate, Jacob D. Washburn, et al.

Nature Communications, Journal Year: 2023, Volume and Issue: 14(1)

Published: Oct. 30, 2023

Abstract Genotype-by-environment (G×E) interactions can significantly affect crop performance and stability. Investigating G×E requires extensive data sets with diverse cultivars tested over multiple locations and years. The Genomes-to-Fields (G2F) Initiative has tested maize hybrids in more than 130 year-locations in North America since 2014. Here, we curate and expand this data set by generating environmental covariates (using a crop model) for each of the trials. The resulting data set includes DNA genotypes linked to 70,000 phenotypic records of grain yield and flowering traits for 4000 hybrids. We show how this valuable data set can serve as a benchmark in agricultural modeling and prediction, paving the way for countless investigations in maize. We use multivariate analyses to characterize the data set’s genetic structure, study the association of key factors with traits, and provide benchmarks using genomic prediction models.

Language: English

Citations: 14

Using machine learning to combine genetic and environmental data for maize grain yield predictions across multi-environment trials
Igor Kuivjogi Fernandes, Caio Canella Vieira, Kaio Olímpio das Graças Dias, et al.

Theoretical and Applied Genetics, Journal Year: 2024, Volume and Issue: 137(8)

Published: July 23, 2024

Incorporating feature-engineered environmental data into machine learning-based genomic prediction models is an efficient approach to indirectly model genotype-by-environment interactions. Complementing phenotypic traits and molecular markers with high-dimensional data such as climate and soil information is becoming a common practice in breeding programs. This study explored new ways to combine non-genetic information in genomic prediction models using machine learning. Using the multi-environment trial data from the Genomes To Fields initiative, different models to predict maize grain yield were adjusted using various inputs: genetic, environmental, or a combination of both, either in an additive (genetic-and-environmental; G+E) or a multiplicative (genotype-by-environment interaction; GEI) manner. When including environmental data, the mean prediction accuracy of the machine learning models increased up to 7% over the well-established Factor Analytic Multiplicative Mixed Model among the three cross-validation scenarios evaluated. Moreover, the G+E model was more advantageous than the GEI model given the superior, or at least comparable, prediction accuracy, the lower usage of computational memory and time, and the flexibility of accounting for interactions by construction. Our results illustrate the flexibility provided by the ML framework, particularly feature engineering. We show that the feature engineering stage offers a viable option for envirotyping and generates valuable information for machine learning-based genomic prediction models. Furthermore, we verified that genotype-by-environment interactions may be considered using tree-based approaches without explicitly including interactions in the model. These findings support the growing interest in merging genotypic and environmental data into predictive modeling.

Language: English

Citations: 5

Common pitfalls in evaluating model performance and strategies for avoidance in agricultural studies
Chunpeng James Chen, R.R. White, Ryan Wright, et al.

Computers and Electronics in Agriculture, Journal Year: 2025, Volume and Issue: 234, P. 110126 - 110126

Published: March 6, 2025

Language: English

Citations: 0

A fast algorithm to factorize high-dimensional tensor product matrices used in genetic models
Marco Lopez‐Cruz, Paulino Pérez‐Rodríguez, Gustavo de los Campos, et al.

G3 Genes Genomes Genetics, Journal Year: 2024, Volume and Issue: 14(3)

Published: Jan. 5, 2024

Many genetic models (including models for epistatic effects as well as models for genetic-by-environment) involve covariance structures that are Hadamard products of lower rank matrices. Implementing these models requires factorizing large Hadamard product matrices. The available algorithms for factorization do not scale well for big data, making the use of some of these models not feasible with large sample sizes. Here, based on properties of Hadamard products and (related) Kronecker products, we propose an algorithm that produces an approximate decomposition that is orders of magnitude faster than the standard eigenvalue decomposition. In this article, we describe the algorithm, show how it can be used to factorize large Hadamard product matrices, present benchmarks, and illustrate the use of the method by presenting an analysis of the data from the northern testing locations of the G × E project from the Genomes to Fields Initiative (n ∼ 60,000). We implemented the proposed algorithm in the open-source "tensorEVD" R package.

Language: English

Citations: 3
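
The abstract above rests on a link between Hadamard and Kronecker products. The following is a minimal sketch of that link only, not the tensorEVD algorithm itself: the toy matrices `G` and `E`, the record layout, and the helper names `kron` and `hadamard_cov` are all assumptions made for illustration. It checks that a genotype-by-environment covariance built as an element-wise (Hadamard) product over records is a row/column subset of the Kronecker product of `G` and `E`, and that eigenpairs of the Kronecker product are products of the eigenpairs of its factors.

```python
# Illustrative sketch only: NOT the tensorEVD algorithm, just the identity it builds on.
# G, E, and the record layout are made-up toy values.

def kron(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    nb = len(b)
    n = len(a) * nb
    return [[a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(n)]
            for i in range(n)]

def hadamard_cov(G, E, geno, env):
    """K[i][j] = G[geno[i]][geno[j]] * E[env[i]][env[j]] for every pair of records."""
    n = len(geno)
    return [[G[geno[i]][geno[j]] * E[env[i]][env[j]] for j in range(n)]
            for i in range(n)]

# Toy genetic covariance (2 genotypes) and environmental covariance (3 environments).
G = [[2.0, 1.0], [1.0, 2.0]]
E = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]

# Unbalanced trial: 4 records, each a (genotype, environment) pair.
geno = [0, 0, 1, 1]
env = [0, 2, 1, 2]

K = hadamard_cov(G, E, geno, env)
GE = kron(G, E)

# Record (g, e) corresponds to row g * n_env + e of the Kronecker product.
rows = [g * len(E) + e for g, e in zip(geno, env)]
K_from_kron = [[GE[r][c] for c in rows] for r in rows]
hadamard_is_submatrix = (K == K_from_kron)

# If G u = a u and E v = b v, then (G kron E)(u kron v) = (a * b)(u kron v).
u, a = [1.0, 1.0], 3.0        # eigenpair of G
v, b = [1.0, 1.0, 1.0], 2.0   # eigenpair of E
uv = [ui * vj for ui in u for vj in v]
GE_uv = [sum(GE[i][j] * uv[j] for j in range(len(uv))) for i in range(len(uv))]
kron_eigenpair_holds = all(abs(x - a * b * y) < 1e-12 for x, y in zip(GE_uv, uv))
```

Because the factors here are only 2 × 2 and 3 × 3, their eigenpairs are cheap to obtain even when the number of records is large, which is the intuition for why working through the Kronecker structure can be much faster than decomposing the full record-level matrix directly.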

A Method to Estimate Climate Drivers of Maize Yield Predictability Leveraging Genetic-by-Environment Interactions in the US and Canada
Parisa Sarzaeim, Francisco Muñoz‐Arriola

Agronomy, Journal Year: 2024, Volume and Issue: 14(4), P. 733 - 733

Published: April 2, 2024

Throughout history, the pursuit of diagnosing and predicting crop yields has evidenced genetics, environment, and management practices intertwined in achieving food security. However, the sensitivity of phenotypes and genetic responses to climate still hampers the identification of the underlying abilities of plants to adapt to climate change. We hypothesize that the PiAnosi and WagNer (PAWN) global sensitivity analysis (GSA) coupled with a genetic by environment (GxE) model built with environmental covariance and markers structures, can evidence the contributions of climate on the predictability of maize yields in the U.S. and Ontario, Canada. The GSA-GxE framework estimates the relative contribution of climate variables to improving yield predictions. Using an enhanced version of the Genomes to Fields initiative database, the GSA-GxE framework shows that the spatially aggregated sensitivity is attributed to solar radiation, followed by temperature, rainfall, and humidity. In one-third of the individually assessed locations, rainfall was primarily responsible for predictability. Also, a consistent pattern of top sensitivities (Relative Humidity, Solar Radiation, and Temperature) as the main or second most relevant drivers shed some light improvement response

Language: English

Citations: 3

Global Genotype by Environment Prediction Competition Reveals That Diverse Modeling Strategies Can Deliver Satisfactory Maize Yield Estimates
Jacob D. Washburn, José Ignacio Varela, Alencar Xavier, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Sept. 20, 2024

Abstract Predicting phenotypes from a combination of genetic and environmental factors is a grand challenge of modern biology. Slight improvements in this area have the potential to save lives, improve food and fuel security, permit better care of the planet, and create other positive outcomes. In 2022 and 2023 the first open-to-the-public Genomes to Fields (G2F) initiative Genotype by Environment (GxE) prediction competition was held using a large dataset including genomic variation, phenotype and weather measurements and field management notes, gathered by the project over nine years. The competition attracted registrants from around the world with representation from academic, government, industry, and non-profit institutions as well as unaffiliated. These participants came from diverse disciplines including plant science, animal breeding, statistics, computational biology and others. Some had no formal genetics or plant-related training, and some were just beginning their graduate education. The teams applied varied methods and strategies, providing a wealth of modeling knowledge based on a common dataset. The winner’s strategy involved two models combining machine learning and traditional breeding tools: one model emphasized environment using features extracted by Random Forest, Ridge Regression and Least-squares, and one focused on genetics. Other high-performing teams’ methods included quantitative genetics, classical machine learning/deep learning, mechanistic models, and ensembles. The factors used, such as genetics; weather; and management data, were also diverse, demonstrating that no single strategy is far superior to all others within the context of this competition.

Language: English

Citations: 1

Global Genotype by Environment Prediction Competition Reveals That Diverse Modeling Strategies Can Deliver Satisfactory Maize Yield Estimates
Jacob D. Washburn, José Ignacio Varela, Alencar Xavier, et al.

Genetics, Journal Year: 2024, Volume and Issue: unknown

Published: Nov. 22, 2024

Abstract Predicting phenotypes from a combination of genetic and environmental factors is a grand challenge of modern biology. Slight improvements in this area have the potential to save lives, improve food and fuel security, permit better care of the planet, and create other positive outcomes. In 2022 and 2023, the first open-to-the-public Genomes to Fields initiative Genotype by Environment prediction competition was held using a large dataset including genomic variation, phenotype and weather measurements, and field management notes gathered by the project over 9 years. The competition attracted registrants from around the world with representation from academic, government, industry, and nonprofit institutions as well as unaffiliated. These participants came from diverse disciplines, including plant science, animal breeding, statistics, computational biology, and others. Some had no formal genetics or plant-related training, and some were just beginning their graduate education. The teams applied varied methods and strategies, providing a wealth of modeling knowledge based on a common dataset. The winner's strategy involved 2 models combining machine learning and traditional breeding tools: 1 model emphasized environment using features extracted by random forest, ridge regression, and least squares, and 1 focused on genetics. Other high-performing teams’ methods included quantitative genetics, machine learning/deep learning, mechanistic models, and ensembles. The factors used, such as genetics, weather, and management data, were also diverse, demonstrating that no single strategy is far superior to all others within the context of this competition.

Language: English

Citations: 1

Using machine learning to integrate genetic and environmental data to model genotype-by-environment interactions
Igor Kuivjogi Fernandes, Caio Canella Vieira, Kaio Olímpio das Graças Dias, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Feb. 12, 2024

Abstract Complementing phenotypic traits and molecular markers with high-dimensional data such as climate and soil information is becoming a common practice in breeding programs. This study explored new ways to integrate non-genetic information in genomic prediction models using machine learning (ML). Using the multi-environment trial data from the Genomes To Fields initiative, different models to predict maize grain yield were adjusted using various inputs: genetic, environmental, or a combination of both, either in an additive (genetic-and-environmental; G+E) or a multiplicative (genotype-by-environment interaction; GEI) manner. When including environmental data, the mean predictive ability increased 7-9% over the well-established Factor Analytic Multiplicative Mixed Model (FA) among the three cross-validation scenarios evaluated. Moreover, the G+E model was more advantageous than the GEI model given the superior, or at least comparable, predictive ability, the lower usage of computational memory and time, and the flexibility of accounting for interactions by construction. Our results illustrate the flexibility provided by the ML framework, particularly feature engineering. We show that the feature engineering stage offers a viable option for envirotyping and generates valuable information for machine learning-based genomic prediction models. Furthermore, we verified that genotype-by-environment interactions may be considered using tree-based approaches without explicitly including interactions in the model. These findings support the growing interest in merging genotypic and environmental data into predictive modeling. Key message: Incorporating feature-engineered environmental data into machine learning-based genomic prediction models is an efficient approach to indirectly model genotype-by-environment interactions.

Language: English

Citations: 0

Overcoming the “feast or famine” effect: improved interaction testing in genome-wide association studies
Huanlin Zhou, Mary Sara McPeek

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Feb. 15, 2024

In genetic association analysis of complex traits, detection of interaction (either GxG or GxE) can help to elucidate the genetic architecture and biological mechanisms underlying the trait. Detection of interaction in a genome-wide interaction study (GWIS) can be methodologically challenging for various reasons, including the high burden of multiple comparisons when testing for epistasis between all possible pairs of a set of genomewide variants, as well as heteroscedasticity effects occurring in the presence of GxE interaction. In this paper, we address the problem of an even more striking phenomenon that we call the "feast or famine" effect that occurs in this context. We show that in any given GWIS, the type 1 error of standard tests performed can vary widely from the nominal level, where the actual type 1 error for a given GWIS varies as a predictable function of the observed trait and environmental values. Using standard methods, some GWISs will have systematically underinflated p-values ("feast"), and others will have systematically overinflated p-values ("famine"), which can lead to false detection of interaction, reduced power, inconsistent results across studies, and failure to replicate true signal. This startling phenomenon is specific to detection of interaction, and it may partly explain why such detection has often proved difficult to replicate. feast famine wide range but not limited (1) linear mixed model (LMM) using approaches t-tests/Wald tests, likelihood ratio score tests; (2) doing combined interaction-association test LMM F-tests (3) with environments SNPs, these are modeled random approaches; (4) performing significance assessed permutation residuals. theoretically key cause variables conditioned on analysis, suggests approach correct by changing way conditioning done. Using this insight, we developed the TINGA method to adjust the test statistics to make their p-values closer to uniform under the null hypothesis. In simulations, we show that TINGA both controls type 1 error and improves power. allows covariates population structure through use accounts heteroscedasticity. We apply TINGA to flowering time in Arabidopsis thaliana.

Language: English

Citations: 0

MegaLMM improves genomic predictions in new environments using environmental covariates
Haixiao Hu, Renaud Rincent, Daniel E. Runcie, et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: March 11, 2024

ABSTRACT Multi-environment trials (METs) are crucial for identifying varieties that perform well across a target population of environments (TPE). However, METs are typically too small to sufficiently represent all relevant environment-types, and face challenges from changing environment-types due to climate change. Statistical methods that enable prediction of variety performance for new environments beyond the METs are needed. We recently developed MegaLMM, a statistical model that can leverage hundreds of trials to significantly improve genetic value prediction accuracy within METs. Here, we extend MegaLMM to enable genomic prediction in new environments by learning regressions of latent factor loadings on Environmental Covariates (ECs) across trials. We evaluated the extended MegaLMM using the maize Genome-To-Fields dataset, consisting of 4402 varieties cultivated in 195 trials with 87.1% of phenotypic values missing, and demonstrated its high prediction accuracy under various breeding scenarios. Furthermore, we showcased MegaLMM’s superiority over univariate GBLUP in predicting trait performance of experimental genotypes in new environments. Finally, we explored the use of higher-dimensional quantitative ECs and discussed when and how detailed environmental data can be leveraged. We propose that MegaLMM can be applied to plant breeding of diverse crops and to different fields of genetics where large-scale linear mixed models are utilized.

Language: English

Citations: 0