u/Holodoxa

Modeling rare coding variation on chromosome X provides insight into the genetics and differential sex prevalence of autism spectrum disorder

Abstract

Autism spectrum disorder (ASD) is estimated to be up to four times as common in males as in females, yet the causes of this prevalence difference are not well established. One possible driver is genetic variation on the X chromosome, as it contains genes capable of contributing to ASD (e.g., PTCHD1, MECP2) and is known to play a role in genetic disorders with differential sex prevalence (e.g., color blindness). However, a lack of power compared to the autosomes combined with the complexities of modeling its biology have led to the X being largely overlooked in sequencing studies. Here, we develop quantitative X-linked TADA, a new model designed specifically for application to this chromosome, and use it to analyze rare variation from 50,663 individuals with ASD (and 136,670 individuals total). We find 9 genes on the X associated with ASD at a false discovery rate (FDR) < 0.05 and an additional 9 genes at FDR < 0.2, with many of these previously identified as involved in specific neurodevelopmental disorders. Point estimates of the liability conferred by de novo variants on the X are similar in females and males, with both sexes’ estimates elevated >20% above the corresponding autosomal values. We also develop a general theory of how X-linked variation of any additive or non-additive effect influences liability and describe its implications for prevalence. Using this theory and our empirical results, we show how genetic variation on the X could contribute to the sex-differential prevalence of ASD.

medrxiv.org
u/Holodoxa — 1 day ago

Rapid adaptive increase of amylase gene copy number in Indigenous Andeans

Abstract

The salivary amylase gene AMY1 exhibits remarkable copy number variation linked to dietary shifts in human evolution. While global studies highlight its structural complexity and association with starch-rich diets, localized selection patterns remain underexplored. Here, we analyze AMY1 copy number in 3,723 individuals from 85 populations, revealing that Indigenous Peruvian Andean populations possess the highest AMY1 copy number globally. A genome-wide analysis shows significantly higher amylase copy numbers in Peruvian Andean genomes compared to closely related populations. Further, we identify positive selection (selection coefficient of 0.0124, log likelihood ratio of 11.1543) at the nucleotide level on a haplotype harboring at least five haploid AMY1 copies, with a Peruvian Andean-specific expansion dated to around 10,000 years ago, coinciding with potato domestication in the region. Using ultra-long-read sequencing, we demonstrate that previously described recombination-based mutational mechanisms drive the formation of high-copy AMY1 haplotypes observed in Andean populations. Our study provides a framework for investigating structurally complex loci and their role in human dietary adaptation.
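To give a feel for what a selection coefficient of 0.0124 can do over roughly 10,000 years, here is a minimal sketch. It is not the paper's method: it assumes a deterministic haploid (genic) model in which carriers have relative fitness 1 + s, so the odds of the allele are multiplied by (1 + s) each generation and drift is ignored. The starting frequency and the 29-year generation time are hypothetical values chosen only for illustration.

```python
def allele_frequency_after(p0: float, s: float, generations: int) -> float:
    """Deterministic allele frequency under a haploid (genic) selection model.

    Assumption: carriers have relative fitness 1 + s, so the odds p / (1 - p)
    grow by a factor of (1 + s) per generation. Drift is ignored.
    """
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must be strictly between 0 and 1")
    odds = p0 / (1.0 - p0) * (1.0 + s) ** generations
    return odds / (1.0 + odds)


# Selection coefficient reported in the abstract.
s = 0.0124
# Hypothetical values, not from the paper.
years = 10_000
generation_time_years = 29
start_frequency = 0.05

generations = round(years / generation_time_years)
end_frequency = allele_frequency_after(start_frequency, s, generations)
print(f"{generations} generations: {start_frequency:.2f} -> {end_frequency:.2f}")
```

Under these assumptions, a small per-generation advantage compounds into a large frequency shift across a few hundred generations, which is why coefficients near 0.01 count as strong selection on this timescale.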

nature.com
u/Holodoxa — 1 day ago

Signatures of pathogen-driven selection and Austronesian gene flow of Papua New Guinea HLA alleles

Summary

Human leukocyte antigen (HLA) class I and II are cell surface proteins that display peptide antigens to immune cells, thereby mediating detection of infected cells and production of antibodies. Pathogen exposure and demographic events, including local adaptation and admixture, have driven and maintained exceptional polymorphism of HLA genes across human populations. Papua New Guinea has a complex demography, with geographically distinct populations in the highlands and lowlands and exceptional linguistic heterogeneity throughout the island. The lowland populations retain signatures of Austronesian expansion ∼3,000 years ago. Papua New Guinea populations are also differentially exposed to endemic malarial pathogens, with a greater burden in the lowlands. We analyzed genome-wide autosomal SNP data together with HLA allele sequences, linguistic, and geographical data from 337 Papuans. We find the substructure of HLA alleles to be highly correlated with altitude in Papua New Guinea, a signal that is distinct from the rest of the genome. In addition, specific HLA-B and HLA-DP alleles in lowland groups have a greater number of homozygous genotypes than expected under neutrality. Some of these HLA alleles are of Austronesian genetic ancestry. We find that the HLA-binding repertoires at candidate loci are significantly enriched for antigenic P. falciparum-derived peptides. Together, these results indicate that pathogen-driven selective pressures correlate with the observed HLA genetic substructure in Papua New Guinea, highlighting the critical importance of characterizing highly complex HLA variation in understanding differences in disease susceptibility across diverse human groups.

cell.com
u/Holodoxa — 1 day ago

David Reich – Why the Bronze Age was an inflection point in human evolution

Great listen. Coverage of recent Eurasian aDNA selection paper (Akbari et al. Nature) by David Reich. Includes Reich's recent hypothesis paper to account for AMH-Neanderthal-Denisovan relationships.

dwarkesh.com
u/Holodoxa — 1 day ago

Summary

Understanding the genetic regulation of circulating protein levels can provide new insights into disease mechanisms. Here, we present the largest proteogenomic study to date (n = 78,664 participants across 38 studies), identifying >24,000 protein quantitative trait loci (QTLs) associated with 1,116 proteins, acting near to (n = 5,040) or distant (n = 19,698) from the cognate gene. Using machine learning-guided effector gene assignment, we provide genetic evidence for pathways, cell types, and tissues that modulate circulating protein levels, highlighting N-linked glycosylation as an important regulatory pathway. We demonstrate that genetic instruments of protein production/function (“cis”) versus modulation (“trans”) reveal distinct phenotypic insights. We identify proteins as candidates for drug targets and engagement (e.g., plasma furin and cardiovascular diseases) by comparing cis-based genetic evidence with protein-disease associations. Systematic triangulation of trans-protein QTLs (pQTLs) with genetic and protein associations across many diseases highlights potential drug repurposing opportunities, e.g., tyrosine kinase 2 (TYK2) inhibitors for rheumatoid arthritis. Our multi-cohort meta-analyses generate proteogenomic insights into disease mechanisms and new treatment opportunities.

cell.com
u/Holodoxa — 6 days ago

Abstract

The interpretation of polygenic scores (PGS) for general cognitive ability (GCA) remains contested, with concerns about indirect genetic effects, environmental confounding, cross-ancestry portability, and the gap between PGS prediction and twin heritability estimates. Relying on a newly constructed PGS using within-family designs in two independent sibling cohorts (UK Biobank, N=4,642 pairs; ABCD, N=736 pairs), we demonstrate that direct genetic effects account for the large majority of PGS prediction (within-family attenuation). Correcting for measurement error in brief cognitive assessments, the within-family association with latent general ability is approximately 0.45, substantially higher than observed-scale estimates. Cross-ancestry portability follows theoretical expectations (66% effect retention in African Americans). Within families, higher PGS predicts greater educational attainment, occupational status, and reduced cardiometabolic disease risk, with no evidence for gene-environment interactions or substantial adverse pleiotropy. These findings replicate using a benchmark predictor based on publicly available data, confirming they reflect properties of cognitive genetic architecture rather than idiosyncrasies of a particular score.
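The measurement-error correction mentioned above can be illustrated with the classical Spearman correction for attenuation. This is a sketch under assumptions: the abstract does not say which correction the authors used, and the observed correlation and reliability below are hypothetical numbers, not values from the study. It also assumes only the cognitive test (not the PGS) is measured with error.

```python
import math


def disattenuate(r_observed: float, reliability: float) -> float:
    """Spearman correction for attenuation with error in one variable only.

    Assumption: `reliability` is the share of test-score variance that reflects
    the latent trait, so the latent correlation is r_observed / sqrt(reliability).
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return r_observed / math.sqrt(reliability)


# Hypothetical inputs for illustration only.
r_obs = 0.30          # correlation with a brief, noisy cognitive test
test_reliability = 0.50

r_latent = disattenuate(r_obs, test_reliability)
print(f"observed r = {r_obs:.2f}, latent-scale r = {r_latent:.2f}")
```

The less reliable the test, the more the observed correlation understates the latent one, which is the general reason a corrected estimate can sit well above the observed-scale value.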

u/Holodoxa — 11 days ago

Abstract

During the European Neolithic transition, migrating Anatolian farmers admixed with local hunter-gatherers, coinciding with major shifts in diet, environment, and lifestyle that imposed strong selective pressures. Local ancestry inference is widely used to detect selection following admixture, but most methods were developed and validated on present-day populations. Their performance in ancient DNA, where reference panels are smaller, data sparser, and admixture more ancient, remains unresolved. We benchmark six local ancestry inference methods on 176 imputed Neolithic genomes, comparing ancestry proportions, tract length distributions, and selection signatures. While individual-level ancestry estimates are highly correlated across methods, inferred tract lengths and admixture time estimates vary by over an order of magnitude. Integrating results across methods and replicating them in two independent datasets (n=378 and 1,121) identifies robust ancestry deviations at SLC24A5 and FADS1/2, consistent with adaptation in pigmentation and metabolism, respectively. We also identify PER3 (circadian rhythm) and IRAK4 (innate immunity) as candidate loci, but with less consistent signals across methods. Finally, we replicate previous reports of excess hunter-gatherer ancestry at the HLA, but these results are inconsistent across methods, suggesting that they may be affected by bias in local ancestry inference. Our findings demonstrate that while local ancestry inference recovers biologically meaningful signals in ancient genomes, results can be sensitive to the methods used for inference, particularly in complex regions like the HLA. Method choice critically influences inferred ancestry patterns and selection signals, underscoring the importance of multi-method validation.

biorxiv.org
u/Holodoxa — 13 days ago

Highlights

•Recurrent NOTCH2NL duplications occurred in great apes, including humans (∼3 mya)

•All tested human haplotypes have a NOTCH2NLA gene

•NOTCH2NLB and NOTCH2NLR/C are variably present due to gene conversion and deletion

•Paralog-specific accessible elements are candidate drivers of NOTCH2NL expression

Summary

NOTCH2NL (NOTCH2-N-terminus-like) genes arose from ape-specific chromosome 1 segmental duplications implicated in human brain cortical expansion, including an incomplete NOTCH2 gene. Genetic characterization of these loci and their regulation is complicated because they are embedded in large, nearly identical duplications that predispose to recurrent microdeletion syndromes. Using near-complete long-read assemblies generated from 70 human and 12 ape haploid genomes, we show independent recurrent duplication among apes with protein-coding copies emerging in humans 2.2–3.7 million years ago. We distinguish NOTCH2NL paralogs present in every human haplotype (NOTCH2NLA) from copy-number-variable ones. We also characterize large-scale structural variation, including gene conversion, for 28% of haplotypes, leading to a previously undescribed paralog, NOTCH2tv. Finally, we apply Fiber-seq and long-read transcript sequencing to human dorsal forebrain organoids to characterize the regulatory landscape and find that the most fixed paralogs, NOTCH2 and NOTCH2NLA, harbor the greatest number of paralog-specific elements potentially driving their regulation.

cell.com
u/Holodoxa — 15 days ago

Abstract

Ancient DNA (aDNA) has revolutionized our ability to study human evolution by enabling the direct observation of genetic changes through time. This has reshaped our understanding of human adaptation and its relevance for modern health and disease. In recent years, high-quality ancient genomes and large datasets have made it possible to track allele frequency dynamics and identify episodes of natural selection with unprecedented resolution. Here, we synthesize insights from recent studies that have systematically investigated how humans adapted to shifts in diet, mobility, pathogen exposure and environment. We summarize the approaches used to detect selection in aDNA, examine the role of major migration and admixture events and connect results across time periods and archaeological contexts. Finally, we outline future challenges and opportunities that need to be addressed for aDNA studies to provide new insights into human adaptation that could not be inferred from present-day genomes alone.

u/Holodoxa — 15 days ago

Abstract

Indigenous peoples of America represent the last principal expansion of humans across the globe^(1), yet their genetic history remains one of the least explored^(2). Although these populations have inhabited the continent for thousands of years^(3), their evolutionary history remains largely unresolved^(4,5), owing to the limited availability of genomic data. Here we present data on 128 high-coverage Indigenous American genomes and show they harbour extensive and previously uncharacterized genetic diversity, reflecting at least three dispersals into South America, followed by regional differentiation and long-term continuity. We identified widespread natural selection signals in genes associated with immunity, metabolism, reproduction and development, which were shaped by adaptation to diverse environmental conditions. Notably, several genomic regions exhibit a remarkable allele sharing with Australasian populations, probably originating from an ancient admixture event and partly maintained by selection for more than 10,000 years. We also detected distinct contributions from archaic humans with adaptive introgression affecting key biological functions. The limited overlap between the regions of Australasian affinity and archaic ancestry indicates independent evolutionary origins of these signals. These findings challenge simplified models of continental settlements and show a more dynamic and complex evolutionary history for the Indigenous peoples in America.

u/Holodoxa — 20 days ago

Abstract

Background: Serum creatine kinase (CK) is a routinely measured biomarker of muscle damage, yet the genetic factors underlying inter-individual variation in CK levels remain poorly defined.

Methods: Here we present a large multi-ancestry genome-wide association meta-analysis of serum CK, comprising 237,255 participants spanning Admixed American, African American, East Asian, European and Middle Eastern populations.

Findings: We identify 107 independent loci at genome-wide significance (P < 5×10^(-8)), 98 of which are previously unreported, with pronounced enrichment for genes expressed in skeletal and cardiac muscle and overlap with pathways related to muscle structure and function. Notably, eight loci map to genes implicated in Mendelian myopathies, underscoring a continuum from common regulatory variation to rare pathogenic mutations. Integrative quantitative trait locus (QTL)-based Mendelian randomization and colocalization implicate several genes in CK regulation, most prominently SMAD3, KLF5 and STAT3 within the transforming growth factor beta signalling pathway. CK levels show positive genetic correlations with traits reflecting tissue damage as well as muscle mass and strength, and negative correlations with C-reactive protein, indicating pleiotropic effects from muscle biology and enzyme clearance.

Interpretation: These findings delineate the genetic architecture of serum CK across diverse populations and highlight muscle-related pathways contributing to CK variation.

u/Holodoxa — 23 days ago

Highlights

  • Pathogenic variants may yield markedly different phenotypes, even when they occur within the same neurodevelopmental disorder risk gene.
  • The phenotypic heterogeneity can be partially explained by mutation type, affected protein domain, and disrupted transcript isoform.
  • Disease penetrance and expressivity in neurodevelopmental disorders are modified by background genetic variation, including oligogenic and polygenic effects.
  • Prenatal and early-life exposures, such as maternal immune activation, valproic acid exposure, and early-life stress, can interact with pathogenic variants to shape phenotypic outcomes.
  • Stochastic variability during neurodevelopment contributes to phenotypic heterogeneity and may alter the impact of pathogenic variants.
cell.com
u/Holodoxa — 26 days ago

Abstract

Recent work by Bocher et al. used Mendelian randomization to identify hundreds of gene expressions and multiple proteins that likely play a causal role in type 2 diabetes. The study emphasized the importance of ancestry and tissue-specific analyses in elucidating intervention targets.

cell.com
u/Holodoxa — 26 days ago

Abstract

Ancient DNA-based studies of natural selection have focused on West Eurasia due to the availability of large sample sizes, but rich insights are expected to come from comparative studies that can reveal which patterns are shared and which region-specific. We test around seven million variants for selection in 1,862 ancient East Eurasians (867 with new data) distributed over the last ten millennia. Using a generalized linear mixed model to control for population structure, we identify 40 genome-wide significant signals of selection, which have a particularly strong impact on immune and cardiometabolic traits just as in West Eurasia. East and West Eurasia show highly correlated signals of adaptation both for individual alleles and for complex traits, showing how these geographically separate groups experienced convergent evolution in response to parallel transitions to food producing economies and the accompanying lifestyle changes. An exception is the genetic determinants of light skin color: West Eurasians depigmented in the last 10,000 years, but most skin lightening in East Asians arose prior to the Holocene.

biorxiv.org
u/Holodoxa — 27 days ago

Abstract

The genetic architecture of complex traits spans a continuum of polygenicity, yet it remains unclear how differences in polygenicity relate to the functional localization of SNP heritability across the genome. We use a MiXeR-based framework to partition heritability across exonic, intronic, and intergenic regions for 34 traits and introduce a likelihood-based annotation contribution score that quantifies annotation-specific impact on heritability. Exons explain a minority of heritability, and their contribution decreases with increasing polygenicity, from an average of 22% in less polygenic somatic diseases and biomarkers to 13% in highly polygenic psychiatric and cognitive phenotypes. Intergenic fractions show the opposite trend, whereas intronic fractions remain relatively stable. Analysis of a broader set of functional annotations reveals systematic differences along the polygenicity axis: highly polygenic traits show stronger contributions from comparative genomics and variant-effect scores, whereas less polygenic traits show stronger contributions in promoter, transcription, and chromatin annotations. Together, these results indicate that the functional partitioning of heritability systematically varies with polygenicity, pointing to a shift from gene-proximal regulatory architectures to architectures shaped by numerous dispersed regulatory effects as a key determinant of differences in polygenicity across traits.

biorxiv.org
u/Holodoxa — 27 days ago

Abstract

Ancient DNA has transformed our understanding of population history^(1), but its potential to reveal as much about human evolutionary biology has not been realized because of limited sample sizes and the difficulty of distinguishing sustained rises in allele frequency increasing fitness (directional selection) from shifts due to migrations, population structure, or non-adaptive purifying or stabilizing selection^(2,3,4,5,6,7). Here we present a method for detecting directional selection in ancient DNA time-series data that tests for consistent trends in allele frequency change over time, and apply it to 15,836 West Eurasians (10,016 with new data). Previous work has shown that classic hard sweeps driving advantageous mutations to fixation have been rare over the broad span of human evolution^(8,9). By contrast, in the past ten millennia, we find that many hundreds of alleles have been affected by strong directional selection. We also document one-standard-deviation changes on the scale of modern variation in combinations of alleles that today predict complex traits. This includes decreases in predicted body fat and schizophrenia, and increases in measures of cognitive performance. These effects were measured in industrialized societies, and it remains unclear how these relate to phenotypes that were adaptive in the past. We estimate selection coefficients at 9.7 million variants, enabling study of how Darwinian forces couple to allelic effects and shape the genetic architecture of complex traits.

u/Holodoxa — 28 days ago