
Nature Communications, Journal Year: 2025, Volume and Issue: 16(1)
Published: March 20, 2025
Abstract Increasingly efficient methods for inferring the ancestral origin of genome regions are needed to gain insights into genetic function and history as biobanks grow in scale. Here we describe two near-linear time algorithms to learn ancestry, harnessing the strengths of a Positional Burrows-Wheeler Transform. SparsePainter is a faster, sparse replacement for previous model-based ‘chromosome painting’ algorithms to identify recently shared haplotypes, whilst PBWTpaint uses further approximations to obtain lightning-fast estimation optimized for genome-wide relatedness estimation. The computational efficiency gains of these tools for fine-scale local inference offer the possibility to analyse large-scale genomic datasets using different approaches. Application to the UK Biobank shows that haplotypes better represent ancestries than principal components, and that linkage disequilibrium identifies signals of recent changes in population-specific selection, many associated with immune responses, suggesting avenues for understanding the pathogen-immune system interplay on a historical timescale.
Language: English