Ancient Ancestry

Neolithic and Paleolithic Timeframes

The Genome Analysis by HeredityLab examines regions of your genome to help determine where your ancestors originated. Ancestry-informative markers in the genotyped regions of your genome are compared against the population data contained in our system. You will receive a .pdf file listing the ethnic groups and regions of origin identified, with a percentage breakdown for each. Valuable insight into your genome awaits.

Details

Refined Parameters

We have spent a great deal of time testing and refining the parameters of our models. Additional reference information on the samples used can be found within the links below.

Ancient Populations

Information pertaining to reference studies on the ancient populations used.

Read More

Archaic Hominin DNA

Scientific information on the archaic DNA referenced in our analysis.

Read More

Reference Tables

Reference tables are included towards the end of the "How It’s Done" page.

Read More

Information

Ancient DNA Breakdown: Ancient Human Populations

The history of human populations is a story of migrations, adaptations, and interactions over thousands of years. Our Ancient DNA Reference Panel explores this history by analyzing genetic lineages from two key periods:

- The Paleolithic Breakdown (Early human migrations and hunter-gatherer groups)
- The Neolithic Breakdown (The rise of farming, pastoralism, and early civilizations)


While these categories help define broad historical trends, there is significant overlap in both timeframes and populations. Many groups persisted beyond their respective eras, while others mixed, giving rise to new lineages.

Since ancient DNA tends to be low coverage, the algorithmic process we use for the ancient DNA model differs slightly from the way proportions are calculated for the modern DNA section. The approach is multilayered, but its core is a least-squares framework that estimates admixture proportions and reports residuals along with the standard error, estimation error, and p-values.

The residuals are the differences between the observed allele frequencies of the target population and the expected allele frequencies computed from the admixture model. The standard error quantifies the uncertainty in the estimated mixing coefficients: a lower standard error indicates more precise estimates, while a higher standard error indicates more variability in the inferred mixing proportions.

The estimation error refers to the overall discrepancy between the true ancestry proportions and the estimated ones. Unlike the standard error, which measures the uncertainty in the estimates, the estimation error evaluates how accurate they are. Because the true proportions cannot be observed directly, it is assessed through the residuals, either as their L2 norm (the Euclidean distance between observed and predicted allele frequencies) or as the sum of squared residuals (SSR), which is the square of that norm. This quantity shows how well the inferred admixture proportions fit the observed genetic data, and it reflects bias in the model or deviations due to factors not captured by the model. It also serves as the optimization criterion: the fit minimizes the SSR, and a lower SSR suggests a better model fit.
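As an illustration of the least-squares idea described above, the following is a minimal sketch, not our production pipeline. It assumes allele frequencies are available as NumPy arrays, forces the mixing weights to sum to one, and uses the classical ordinary-least-squares formula for the standard errors. The function name `fit_admixture` and the toy data are hypothetical, and the sketch does not constrain the weights to be non-negative.

```python
import numpy as np

def fit_admixture(ref_freqs, target_freqs):
    """Least-squares mixing weights that sum to one (illustrative only).

    ref_freqs:    (n_snps, n_sources) allele frequencies of the source populations
    target_freqs: (n_snps,) allele frequencies of the target
    """
    n_snps, k = ref_freqs.shape
    # Impose sum(w) = 1 by writing the last weight as 1 - (sum of the others).
    last = ref_freqs[:, -1]
    A = ref_freqs[:, :-1] - last[:, None]
    b = target_freqs - last
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    weights = np.append(coef, 1.0 - coef.sum())

    residuals = target_freqs - ref_freqs @ weights   # observed minus expected
    ssr = float(residuals @ residuals)               # sum of squared residuals
    l2_norm = float(np.sqrt(ssr))                    # Euclidean distance

    # Classical OLS standard errors (assumes independent, equal-variance noise).
    dof = n_snps - (k - 1)
    cov = (ssr / dof) * np.linalg.inv(A.T @ A)
    ones = np.ones(k - 1)
    se = np.append(np.sqrt(np.diag(cov)), np.sqrt(ones @ cov @ ones))
    return weights, se, residuals, ssr, l2_norm

# Toy data: three hypothetical sources, known mixing weights, small noise.
rng = np.random.default_rng(0)
ref = rng.uniform(0.05, 0.95, size=(500, 3))
true_w = np.array([0.5, 0.3, 0.2])
target = ref @ true_w + rng.normal(0.0, 0.01, size=500)

weights, se, residuals, ssr, l2_norm = fit_admixture(ref, target)
print(np.round(weights, 3), np.round(se, 4), round(ssr, 4))
```

In this sketch the SSR is the square of the L2 norm of the residuals, so both lead to the same best-fitting weights.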

The p-values test how well the model fits the observed genetic data. Each p-value assesses whether the observed data deviate significantly from what the inferred ancestry proportions predict under the null model. If the p-value is high (e.g., > 0.05), the model provides a reasonable fit to the data; if it is low, the model’s assumptions may be inaccurate.
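The exact test statistic is not specified here, so the sketch below shows only one possible way to obtain such a p-value: a Monte Carlo goodness-of-fit check. It assumes the only deviation under the null model is binomial sampling of a known number of chromosomes per SNP (`n_chrom`, a hypothetical parameter) and treats the mixing weights as fixed. A fuller version would re-estimate the weights for every simulated dataset.

```python
import numpy as np

def monte_carlo_fit_pvalue(ref_freqs, target_freqs, weights, n_chrom,
                           n_sims=1000, seed=0):
    """Share of null simulations whose SSR is at least as large as observed."""
    rng = np.random.default_rng(seed)
    expected = np.clip(ref_freqs @ weights, 0.0, 1.0)
    observed_ssr = np.sum((target_freqs - expected) ** 2)
    # Simulate targets under the null: the model is correct and the only
    # noise is binomial sampling of n_chrom chromosomes at each SNP.
    sims = rng.binomial(n_chrom, expected, size=(n_sims, expected.size)) / n_chrom
    sim_ssr = np.sum((sims - expected) ** 2, axis=1)
    # The +1 terms keep the p-value strictly above zero.
    return (1 + np.sum(sim_ssr >= observed_ssr)) / (n_sims + 1)

# Toy data: two hypothetical sources and 40 sampled chromosomes per SNP.
rng = np.random.default_rng(1)
ref = rng.uniform(0.05, 0.95, size=(500, 2))
w = np.array([0.7, 0.3])
n_chrom = 40

# A target that really is a 70/30 mixture of the two sources.
good_target = rng.binomial(n_chrom, ref @ w) / n_chrom
# A target drawn from an unrelated, unmodelled population.
unmodelled = rng.uniform(0.05, 0.95, size=500)
bad_target = rng.binomial(n_chrom, unmodelled) / n_chrom

p_good = monte_carlo_fit_pvalue(ref, good_target, w, n_chrom)
p_bad = monte_carlo_fit_pvalue(ref, bad_target, w, n_chrom)
print(p_good, p_bad)
```

A target that does not descend from the modelled sources produces a much larger SSR than the simulations do, which drives its p-value towards zero.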

Bootstrapping is also performed: subsets of SNP blocks within the target genome are repeatedly sampled, so that admixture proportions are estimated from a slightly different subset of the genome each time. The spread of these estimates across the resampled datasets yields a 95% confidence interval (CI).
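The resampling step can be sketched as a block bootstrap. This is a simplified illustration under stated assumptions: SNPs are grouped into contiguous blocks of a fixed size (`block_size` is a hypothetical parameter), blocks are drawn with replacement, and the interval is taken from the 2.5th and 97.5th percentiles of the resampled estimates. The helper `fit_weights` is a plain sum-to-one least-squares fit used only for this example.

```python
import numpy as np

def fit_weights(ref_freqs, target_freqs):
    """Least-squares mixing weights constrained to sum to one."""
    last = ref_freqs[:, -1]
    coef, *_ = np.linalg.lstsq(ref_freqs[:, :-1] - last[:, None],
                               target_freqs - last, rcond=None)
    return np.append(coef, 1.0 - coef.sum())

def block_bootstrap_ci(ref_freqs, target_freqs, block_size=50,
                       n_boot=500, seed=0):
    """Percentile 95% CI for each mixing weight from resampled SNP blocks."""
    rng = np.random.default_rng(seed)
    n_snps = ref_freqs.shape[0]
    n_blocks = max(1, n_snps // block_size)
    blocks = np.array_split(np.arange(n_snps), n_blocks)
    estimates = np.empty((n_boot, ref_freqs.shape[1]))
    for i in range(n_boot):
        # Draw whole blocks with replacement, then refit on the resampled SNPs.
        picks = rng.integers(0, n_blocks, size=n_blocks)
        idx = np.concatenate([blocks[j] for j in picks])
        estimates[i] = fit_weights(ref_freqs[idx], target_freqs[idx])
    lower, upper = np.percentile(estimates, [2.5, 97.5], axis=0)
    return lower, upper

# Toy data: three hypothetical sources with known mixing weights.
rng = np.random.default_rng(2)
ref = rng.uniform(0.05, 0.95, size=(600, 3))
true_w = np.array([0.5, 0.3, 0.2])
target = ref @ true_w + rng.normal(0.0, 0.02, size=600)

point = fit_weights(ref, target)
lower, upper = block_bootstrap_ci(ref, target)
print(np.round(point, 3), np.round(lower, 3), np.round(upper, 3))
```

Resampling blocks rather than single SNPs is a common choice because neighbouring SNPs are correlated, and treating them as independent would make the interval too narrow.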

Ensembling is used to combine predictions from several models; models that are more accurate are weighted more heavily in the final determination. The approach provides a computationally efficient and statistically rigorous analysis of modern individuals’ ancestry using ancient DNA as reference populations. It offers a robust way to model genetic contributions from prehistoric groups while accounting for sources of variation in the reference data that are not an issue for datasets representing modern populations.
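The weighting scheme used for ensembling is not detailed here. As one plausible illustration, the sketch below weights each model's proportion estimates by the inverse of its SSR, so that better-fitting models count for more. The function name `ensemble_proportions` and the example numbers are hypothetical.

```python
import numpy as np

def ensemble_proportions(estimates, ssrs):
    """Combine per-model ancestry proportions using inverse-SSR weights.

    estimates: (n_models, n_sources) proportions from each model
    ssrs:      (n_models,) sum of squared residuals for each model
    """
    estimates = np.asarray(estimates, dtype=float)
    inverse = 1.0 / np.asarray(ssrs, dtype=float)
    model_weights = inverse / inverse.sum()      # lower SSR gives higher weight
    combined = model_weights @ estimates
    return combined / combined.sum(), model_weights

# Two hypothetical models; the first fits three times better than the second.
combined, model_weights = ensemble_proportions(
    estimates=[[0.6, 0.4], [0.5, 0.5]],
    ssrs=[1.0, 3.0],
)
print(model_weights)  # [0.75 0.25]
print(combined)       # approximately [0.575 0.425]
```

Other weightings, such as ones based on p-values or cross-validated error, would follow the same pattern with a different definition of `model_weights`.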

The Paleolithic Breakdown (Pre-10,000 BCE)

The Paleolithic era, or Old Stone Age, is the period before farming, characterized by hunter-gatherer lifestyles and the early movements of modern humans. Fossils and genetic data suggest that modern humans first evolved in Africa over 200,000 years ago, later expanding across Eurasia, where they encountered archaic human groups like Neanderthals and Denisovans.

Key Paleolithic Populations

Dzudzuana – A genetically distinct population from West Asia, possibly ancestral to later Neolithic groups.

Ancient North Eurasian (ANE) – A group that emerged in Siberia and contributed to both European and Native American ancestry.

Ancient Northern East Asian (ANEA) – The ancestral population of many East Asians, Siberians, and some Native Americans.

Ancient Southern East Asian (ASEA) – An early lineage that contributed to modern Southeast Asian populations.

Ancient Rainforest Hunter-Gatherers (Ancient RHG) – A group adapted to rainforest environments, ancestral to the Mbuti and other Central African hunter-gatherers.

South African Hunter-Gatherers – The earliest-diverging known lineage of modern humans, ancestral to present-day San groups.

Jomon – An ancient population of Japan, distinct from later East Asian farming cultures.

Basal East Eurasian (Onge) – Represented by the indigenous Onge people of the Andaman Islands, one of the earliest diverging East Asian-related groups.



Reconstructed Paleolithic Populations

For populations where ancient DNA samples are unavailable, we have created reconstructed genetic profiles:

Ancient East African – Derived from Dinka, East African Pastoral Neolithic, and Mota samples, representing early East African hunter-gatherers.

Ancient North African – Modeled by removing Dzudzuana-related ancestry from Iberomaurusian samples, capturing an earlier layer of North African ancestry.

Ancient West African – Constructed using Yoruba, Mende, and Ancient Shum Laka samples, reflecting early West African diversity.
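To give a sense of what "removing" one ancestry from a set of samples can mean at the level of allele frequencies, here is a simplified sketch. It assumes a two-way linear mixture in which the mixed population's frequency equals alpha times the component's frequency plus (1 - alpha) times an unobserved remainder, with alpha already estimated. This is an illustrative assumption rather than a description of the actual reconstruction method, and the function name `remove_component` is hypothetical.

```python
import numpy as np

def remove_component(mixed_freqs, component_freqs, alpha):
    """Solve mixed = alpha * component + (1 - alpha) * remainder for remainder."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    remainder = (mixed_freqs - alpha * component_freqs) / (1.0 - alpha)
    # Noise can push values outside [0, 1]; clip to valid allele frequencies.
    return np.clip(remainder, 0.0, 1.0)

# Toy example with three SNPs and a 30% contribution from the known component.
component = np.array([0.1, 0.5, 0.9])
hidden = np.array([0.4, 0.2, 0.6])        # the layer we want to recover
mixed = 0.3 * component + 0.7 * hidden

recovered = remove_component(mixed, component, alpha=0.3)
print(recovered)
```

With real data, the estimate of alpha carries its own uncertainty, which is one reason reconstructed profiles are approximations.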



The Neolithic Breakdown (10,000 BCE – 2,000 BCE)

The Neolithic era marks the transition to agriculture, animal domestication, and the development of more complex societies. Farming first emerged in Southwest Asia (~11,000 years ago) and spread into Europe, Africa, and Asia, leading to significant population movements and genetic mixing.

Key Neolithic Populations

Amur River Hunter-Gatherers – A group from the Russian Far East, ancestral to some East Asian and Siberian populations.

Anatolia Neolithic Farmers – The early agriculturalists of Anatolia, who later spread farming into Europe.

Ancient Australians – Representing the first inhabitants of Australia, with deep genetic ties to Papua New Guinea and Melanesia.

Baikal Hunter-Gatherers – A group from Siberia’s Lake Baikal region, contributing to Northeast Asian and Native American ancestry.

Caucasus Hunter-Gatherers (CHG) – A population from the Caucasus Mountains that contributed to both early European farmers and steppe pastoralists.

East African Hunter-Gatherers (Mota) – Represented by Mota, the first ancient African genome to be sequenced, providing insight into early East African populations.

East African Pastoral Neolithic – A group associated with the transition to herding in East Africa.

Eastern European Hunter-Gatherers (EHG) – A population that contributed to both European and Central Asian genetic makeup. 

Hoabinhian – Southeast Asian hunter-gatherers who persisted into the early Neolithic.

Iberomaurusian – An early North African population with genetic links to both Europe and Sub-Saharan Africa.

Jomon – Neolithic-era inhabitants of Japan with deep East Asian and Siberian connections.

Ancient Melanesians – The ancestors of modern Melanesian populations, with connections to Denisovans.

Natufian Hunter-Gatherers – One of the earliest farming-associated populations in the Levant, bridging the gap between hunter-gatherers and agriculturalists.

North American Hunter-Gatherers – The first peoples of North America, including Clovis-associated individuals.

South American Hunter-Gatherers – The first peoples of South America, with complex ancestry tracing back to Ancient North Eurasians and East Asians.

Southeast Asian Neolithic Farmers – Early rice farmers who contributed to the genetic makeup of modern Southeast Asians.

Western Hunter-Gatherers (WHG) – A key European group that mixed with early Neolithic farmers.

Yakutia Hunter-Gatherers – Early Siberian populations that contributed to modern Indigenous Siberians and Native Americans.

Yellow River Neolithic Farmers – A major East Asian farming population that played a central role in Chinese ancestry.

Zagros Neolithic Farmers – Early farmers from Iran, who contributed to South Asian and Middle Eastern genetic lineages.

Reconstructed Neolithic Populations

For groups where direct ancient DNA samples are unavailable, we have created genetic reconstructions:

Ancient Ancestral South Indian (AASI) – Modeled using Paniya and Irula samples, representing pre-Indo-Aryan South Asian populations.

Ancient Rainforest Hunter-Gatherers (Ancient RHG) – Derived from Mbuti and the RHG-associated ancestry within the ancient Shum Laka samples, representing early Central African foragers.

Proto West African – Constructed by removing the Iberomaurusian-associated ancestry from Yoruba, Mende, and other West African samples in order to better reflect early West African diversity.

Ancient Nilotic – Reconstructed using Dinka samples, representing early Nilotic-speaking populations.



Final Thoughts

The Paleolithic era laid the genetic foundation of modern humans, while the Neolithic era introduced large-scale migrations, agriculture, and cultural shifts that shaped today’s populations. While we have direct genetic evidence for many ancient groups, some lineages remain unsequenced and require careful reconstruction using modern and ancient reference populations. These reconstructions are approximations rather than exact replicas. They were generated using methods similar to those outlined in the articles below. A variety of techniques were employed to create synthetic genomes and statistical genome representations, with some sourced from the linked studies.

https://www.biorxiv.org/content/10.1101/2024.10.28.620648v1

https://doi.org/10.1093/molbev/msz037

https://pubmed.ncbi.nlm.nih.gov/39826116/

https://www.biorxiv.org/content/10.1101/2024.05.06.592724v1

https://pubmed.ncbi.nlm.nih.gov/33585906/

Below is a table detailing the ancient samples used for the analysis, along with their references. Towards the end is a list of modern samples used to reconstruct ancient populations for which no actual samples are available. These modern samples have been determined to be partially descended from the ancient populations they are used to represent.

Ancient Populations Table: Extended Details