The Need for Diversity in Alzheimer's Disease Research
Background and motivation — defining the scope of this analysis
Defining Alzheimer's Disease
Key message: AD is both extremely common and deeply complex — and that complexity is the reason why studying it in a single, narrow population is not enough.
Alzheimer's disease (AD) is the most common cause of dementia, accounting for 60–70% of cases worldwide. It is a progressive neurodegenerative disorder defined by two core hallmarks visible post-mortem and, increasingly, in vivo: extracellular deposits of amyloid-β (Aβ) protein forming plaques, and intracellular tangles of hyperphosphorylated tau protein. These drive widespread synaptic loss, neuroinflammation, and cortical atrophy, producing the gradual, irreversible decline in memory, language, and reasoning that defines the clinical syndrome.
The dominant theoretical framework, the amyloid cascade hypothesis, holds that the accumulation of Aβ42 oligomers is the initiating event, triggering downstream tau pathology and neurodegeneration. Yet this model is incomplete: neuroinflammation, vascular damage, metabolic dysfunction, and genetic risk also contribute, both independently and in concert. This is not merely an academic point. It helps explain why the first approved anti-amyloid therapies (lecanemab, donanemab) show only modest clinical benefit: they reduce amyloid but cannot reverse the neurodegeneration that has already accumulated, often years before symptoms appear.
At the population level, an estimated 55 million people worldwide live with dementia, a figure projected to exceed 150 million by 2050. More than 95% of cases are sporadic late-onset disease, where age is the primary risk factor. The strongest common genetic risk factor is the APOE ε4 allele, which increases AD risk roughly 3-fold in heterozygous carriers and 8–12-fold in homozygous carriers. Crucially, its effect size varies substantially across ancestral populations, a finding that already signals the importance of diversity in genetic research.
A Highly Heterogeneous Disease
Key message: Nearly half of all dementia cases worldwide could be prevented, but only if we study the right risk factors in the right populations. Current evidence is systematically biased toward high-income, European-descent groups, leaving most of the world's at-risk population poorly understood.
Alzheimer's disease is not one disease. Its clinical presentation, neuropathological severity, genetic architecture, and rate of progression vary enormously across individuals and populations. This heterogeneity matters practically: a prevention or treatment strategy calibrated to one population may be entirely ineffective in another if the underlying biological drivers differ.
The Lancet Commissions on Dementia Prevention, Intervention and Care have provided the most rigorous estimates of this preventable burden. The 2020 report identified 12 modifiable risk factors, cumulatively accounting for approximately 40% of global dementia cases (Livingston et al., The Lancet, 2020; doi:10.1016/S0140-6736(20)30367-6). The 2024 update added untreated vision loss and elevated LDL cholesterol, revising the preventable fraction upward to approximately 45% (Livingston et al., The Lancet, 2024; doi:10.1016/S0140-6736(24)01296-0).
These figures are encouraging, but also sobering, because the evidence base behind them derives predominantly from high-income, European-descent populations. A broader analysis incorporating risk factors more prevalent in low- and middle-income countries, and accounting for sex-related disparities, could push the preventable fraction to approximately 65%. Such an analysis also reveals a striking asymmetry: although dementia disproportionately affects women, 57% (8 of 14) of the risk factors identified by the 2024 Commission are more prevalent in men.
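The preventable fractions quoted above are population attributable fractions (PAFs). As a rough illustration of the arithmetic, the sketch below applies Levin's formula to each risk factor and then combines the results multiplicatively, so that overlapping factors are not simply summed. The Commission reports describe a communality-weighted version of this combination; here the weights default to 1, and the factor names, prevalences, and relative risks are hypothetical placeholders, not Commission estimates.

```python
from math import prod


def levin_paf(prevalence, relative_risk):
    """Levin's formula: share of cases attributable to a single risk factor."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def combined_paf(pafs, weights=None):
    """Combine PAFs as 1 - prod(1 - w_i * PAF_i); weights default to 1."""
    if weights is None:
        weights = [1.0] * len(pafs)
    return 1.0 - prod(1.0 - w * p for w, p in zip(weights, pafs))


# Hypothetical inputs for illustration only (NOT Commission estimates):
# name -> (prevalence of exposure, relative risk of dementia)
risk_factors = {
    "factor_a": (0.30, 1.5),
    "factor_b": (0.10, 2.0),
    "factor_c": (0.20, 1.2),
}

pafs = {name: levin_paf(p, rr) for name, (p, rr) in risk_factors.items()}
total = combined_paf(list(pafs.values()))
naive_sum = sum(pafs.values())

for name, value in pafs.items():
    print(f"{name}: PAF = {value:.3f}")
print(f"combined PAF = {total:.3f} (a simple sum would give {naive_sum:.3f})")
```

Because both the prevalence and the relative risk enter the formula, the same risk factor can yield a very different PAF in a population where it is more common or more harmful, which is why estimates built on high-income cohorts may not transfer elsewhere.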
New in 2024: two additional risk factors (untreated vision loss and elevated LDL cholesterol)

Adapted from: Livingston et al. (2024). Dementia prevention, intervention, and care: 2024 report of the Lancet standing Commission. The Lancet, 404(10452), 572–628. Bubble size represents the percentage reduction in dementia cases if that risk factor is eliminated.
Defining the Diversity that Impacts AD
Key message: "Diversity" in AD research is defined here as three interconnected things — who is studied (ethnic and geographic diversity), how biology differs by sex, and whether the research infrastructure enables the global collaboration needed to study these differences at scale. All three are currently insufficient. The issue is not that prior research is wrong — it is that findings from homogeneous cohorts have restricted applicability to most of the world.
What do we mean by "diversity"?
In this context, diversity encompasses three distinct but interrelated dimensions. Ethnic and geographic diversity means studying people from different ancestral backgrounds and world regions, where genetic risk variants, environmental exposures, and healthcare access differ substantially. Sex and gender means treating biological sex as a primary variable shaping AD risk and progression, not as a demographic covariate to adjust away. Data and methodological diversity means ensuring that research infrastructure, data standards, and open-science practices enable equitable global collaboration. Across all three, diversity should be valued as a driver of discovery, not as a confounder to control for.

(A) Bars: projected share of global dementia burden by 2050 (%) per region. Line: current research representation (%). Sub-Saharan Africa and South Asia face the steepest burden growth yet account for only 1–2% of research. (B) Country-level representation scores — the United States (30) and UK (25) dominate; Nigeria scores 1. Adapted from: Vilor-Tejedor et al. (2026). Alzheimer's & Dementia, 22(1), e71069. doi:10.1002/alz.71069. CC BY-NC-ND 4.0.
Geographic & Ethnic Diversity — Why it matters
The same disease, studied in the same kind of people, will produce answers that only work for those people.
Despite dementia's growing global burden (57 million cases in 2021, more than 60% in low- and middle-income countries, with projections reaching 139–153 million by 2050), research has historically focused on Western, educated, industrialized, rich, and democratic (WEIRD) cohorts. The scientific cost of this homogeneity is concrete: a 2024 meta-analysis found that most dementia studies do not even report participants' ethnicity or race.
The major genome-wide association studies (GWAS) identifying key AD risk loci in genes such as BIN1, CLU, CR1, and PICALM were performed almost exclusively through European consortia (IGAP, EADB, ADGC). Allele frequencies, linkage disequilibrium patterns, and gene-environment interactions differ substantially across ancestral groups: risk variants discovered in European populations may not exist at meaningful frequency in African, Asian, or Latin American populations, and vice versa. Work on the local ancestry of the APOE locus reveals ancestry-specific nuances in ε4 risk that are entirely missed in homogeneous datasets.
Broadening scope unlocks new science. Work led by the Multi-Partner Consortium to Expand Dementia Research in Latin America (ReDLat) identified unique variants associated with AD and frontotemporal dementia in admixed Latin American populations, findings that would be undetectable in European cohorts. The first open multimodal neuroimaging dataset of neurodegeneration from Latin America (BrainLat: 780 participants, 5 countries) underscores how thin the evidence base remains for these populations (Prado et al., Scientific Data, 2023; doi:10.1038/s41597-023-02806-8).
Sex & Gender — Why it matters
Two-thirds of people with AD are women, yet sex-stratified analyses remain the exception, not the norm.
Women bear a disproportionate burden of Alzheimer's disease, accounting for approximately two-thirds of all people living with AD worldwide. Longevity alone does not explain this. There is mounting evidence for sex-specific biological mechanisms: oestrogen withdrawal at menopause appears to accelerate amyloid accumulation; women carrying one APOE ε4 allele face greater AD risk than men with the same genotype; immune and neuroinflammatory responses differ between sexes; and tau propagation shows sex-specific patterns.
Despite this, most AD studies treat sex as a demographic covariate to adjust away, not a biological variable to study. The UK Biobank Whole-Genome Sequencing study (490,640 participants) explicitly noted that no sex-stratified analyses were performed (UK Biobank WGS Consortium, Nature, 2025; doi:10.1038/s41586-025-09272-9). This is symptomatic of a field-wide gap. As a result, sex-specific mechanisms remain poorly characterised, sex-tailored treatments do not exist, and prevention strategies are not optimised for the majority of those most at risk.
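The difference between adjusting for sex and stratifying by sex can be made concrete with a small calculation. The sketch below uses made-up case-control counts for APOE ε4 carriage (hypothetical numbers chosen for illustration, not data from any study) and compares a single sex-adjusted odds ratio, computed with the Mantel-Haenszel estimator, against the odds ratios within each sex.

```python
# Hypothetical case-control counts, for illustration only.
# Each stratum: (cases with e4, cases without e4,
#                controls with e4, controls without e4)
counts = {
    "women": (120, 180, 60, 240),
    "men": (80, 220, 60, 240),
}


def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table (a, d on the diagonal)."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Single odds ratio adjusted for the stratifying variable."""
    strata = list(strata)
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Sex treated as a biological variable: one estimate per sex.
stratified = {sex: odds_ratio(*table) for sex, table in counts.items()}
# Sex treated as a covariate: one pooled, sex-adjusted estimate.
adjusted = mantel_haenszel_or(counts.values())

for sex, value in stratified.items():
    print(f"{sex}: OR = {value:.2f}")
print(f"sex-adjusted (Mantel-Haenszel) OR = {adjusted:.2f}")
```

With these invented counts the adjusted estimate is 2.00, while the stratum-specific estimates are about 2.67 for women and 1.45 for men. The adjusted figure is a valid summary only if the effect is the same in both sexes; when it is not, adjustment averages away exactly the difference a sex-stratified analysis would reveal.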

Left: Scale of the global dementia burden (57M cases; 61% in LMICs) and historical focus on non-diverse populations. Top right: Venn diagram — diversity across participants (ethnic/racial/sex/socioeconomic), researchers (institutional diversity, career support, geographic inclusion), and methods (open science, AI/ML, multimodal integration, community-based research) converges to enable better science. Bottom: Geographic reach of the William H. Gates Sr. Fellowship — a model for equitable, international capacity building. Adapted from: Vilor-Tejedor et al. (2026). Alzheimer's & Dementia, 22(1), e71069. doi:10.1002/alz.71069. CC BY-NC-ND 4.0.
Data Infrastructure, Open Science & Researcher Diversity — Why it matters
Even when diverse data exist, they are often impossible to combine, and the researchers who could interpret them most meaningfully are often excluded from the system.
Studying diverse populations at the scale required demands large, harmonised, multi-site datasets built on open-science infrastructure. The FAIR principles (Findable, Accessible, Interoperable, Reusable; Wilkinson et al., Scientific Data, 2016; doi:10.1038/sdata.2016.18) provide the conceptual foundation. Governance frameworks such as the Five Safes (Boylan et al., Lancet Digital Health, 2024; doi:10.1016/S2589-7500(24)00028-1) and trusted research environments enable cross-national collaboration without compromising participant privacy. Open platforms such as ADDI's ADWorkbench provide free computational infrastructure to researchers in LMICs who would otherwise be structurally excluded from global science.
Crucially, the diversity argument extends beyond study participants to the researchers themselves. Scientists from underrepresented communities are better positioned to build trust with local populations, design culturally relevant studies, and identify research questions that external teams simply do not ask. Without intentional investment in researcher diversity, through fellowships, mentorship, and equitable authorship, the cycle of underrepresentation in both leadership and scientific output will continue. Patient and Public Involvement and Engagement (PPIE) provides an additional mechanism for ensuring research priorities reflect the communities most affected (Blackburn et al., Research Involvement and Engagement, 2018; doi:10.1186/s40900-018-0100-8).
Reproducibility is the third pillar. A finding is only as useful as its ability to be verified and built upon. Yet the proportion of papers in the AD literature that share code or data appears to be low and has not, until now, been measured systematically. If methods are not transparent, the field cannot efficiently build on them, particularly for underrepresented populations where data are already scarce.
Five priorities — Vilor-Tejedor et al. (Alzheimer's & Dementia, 2026; doi:10.1002/alz.71069)