DNA testing has now become an integral part of family history research.
For genealogy purposes, the list of cousin matches is the most important part of the test, but all of the companies also provide a biogeographical ancestry report in which they try to assign our DNA to different countries or regions of the world. These reports are becoming increasingly granular. They can sometimes help to inform your research, and will occasionally lead to surprise discoveries.
Read the full version of this article and much more expert family history advice in Who Do You Think You Are? Magazine May 2020
How do companies calculate your ethnicity?
Each company tries to put together a panel of reference populations. These are modern individuals who are deemed to be representative of a particular country, region or population. The samples are collected either from publicly available research projects or from the companies’ own databases.
The research samples are taken from projects such as the 1000 Genomes Project, the Human Genome Diversity Project, the Simons Genome Diversity Project, the Asian Diversity Project and the People of the British Isles Project. Samples are generally collected from people with deep roots in a particular location. For some projects the requirement is that the individual should have four grandparents from the same country, region or county.
The next stage is to analyse the reference samples and to put them into a predefined number of genetic clusters. Outliers are removed. The clusters are then given names based on the present-day countries or regions they represent. The clusters will inevitably have some overlap because our DNA is not confined to modern political boundaries, so a French cluster might also include people from other countries in North-West Europe.
Each individual customer’s DNA is then compared with the reference populations, and you are given percentages based on your closest matching populations. You can of course only be matched to populations represented in the company database. For example, if you are from Denmark and the company has no Danish samples, you will be matched to the next closest population.
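As a rough illustration of the Danish example, the toy Python sketch below matches a customer to the nearest reference population by comparing allele frequencies at a handful of markers. The frequencies, the population list and the `closest_population` helper are all invented for illustration. Real companies use far larger panels and statistical models that report percentages rather than a single label.

```python
import math

# Hypothetical allele frequencies at five markers for each reference population.
# The numbers are made up purely for illustration.
REFERENCE_PANEL = {
    "Denmark": [0.82, 0.31, 0.64, 0.12, 0.55],
    "Sweden": [0.80, 0.35, 0.60, 0.15, 0.50],
    "Italy": [0.40, 0.70, 0.30, 0.45, 0.20],
}


def closest_population(profile, panel):
    """Return the name of the panel population nearest to the profile."""
    return min(panel, key=lambda name: math.dist(profile, panel[name]))


# A made-up customer whose profile sits close to the Danish reference.
customer = [0.83, 0.30, 0.65, 0.11, 0.56]

full_match = closest_population(customer, REFERENCE_PANEL)

# Drop Denmark to mimic a company with no Danish samples.
panel_without_denmark = {
    name: freqs for name, freqs in REFERENCE_PANEL.items() if name != "Denmark"
}
fallback_match = closest_population(customer, panel_without_denmark)

print(full_match)      # Denmark
print(fallback_match)  # Sweden
```

With Denmark in the panel the customer is matched to Denmark. Without it, the same DNA is assigned to the next closest population, which in this invented panel is Sweden.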
The companies sell most of their tests in Europe, North America and Australasia, and it is these populations that are best represented in the databases. It is now possible to get quite granular results for people of European ancestry, but the results for people of African or Asian heritage are much less detailed and can often be disappointing. Academic research has also focused on Europe, although this is slowly starting to change.
The testing companies started out by comparing individual markers that are mainly found in specific populations. These are known as ‘ancestry informative markers’. Living DNA and AncestryDNA have adopted a different approach and use ‘haplotypes’ – markers that are linked together. This approach is more representative of recent ancestry within the past 500 years or so. Each company uses different algorithms and compares your results to different reference populations, so results vary from company to company.
The companies are also constantly striving to improve their results, so your results with any one company will change over time. For example, at AncestryDNA I came out as 21 per cent Great Britain when there were just 3,000 people in the reference panel. But when the panel increased to 16,000 my results changed and I was 94 per cent England and Wales. With its most recent update Ancestry now has 40,000 reference samples and I became 80 per cent England, Wales and North-West Europe.
Both MyHeritage and FamilyTreeDNA are working on updating their ancestry estimates, and we can expect to see these rolled out later this year. FamilyTreeDNA provided a preview of myOrigins 3.0 at the family history show RootsTech in Salt Lake City, Utah, in February. The number of its reference populations will increase from 24 to 90. The number of populations in Africa will increase from four to about 21, and there will be four distinct Jewish populations.
What do the results mean?
In general, the percentages are most accurate at the continental level. Within Europe the companies can broadly separate North-Western European, Southern European and Eastern European ancestry, even if the countries assigned within these regions are not correct.
The ancestry proportions with the largest percentages tend to be the most reliable. Percentages under 1 per cent are often nothing more than noise, and percentages under about 20 per cent will not necessarily provide a true reflection of your recent ancestry.
If you find that a company gives you 8 per cent Iberian or 9 per cent Italian and you have no documented ancestry from these locations, you shouldn’t start looking for Spanish or Italian ancestors in your family tree. You’ll probably find that these percentages disappear the next time the company updates its product. Results are more likely to be reliable if the admixture is found consistently in the results from a number of different companies. The most reliable indicator of your ancestry is not your ethnicity estimate but the names and family trees of your matches. If you have Spanish or Italian ancestry you would expect to have matches with people with Spanish or Italian surnames.
Siblings inherit different ancestral proportions from their grandparents, their great-grandparents and their more distant ancestors, so we would expect their results to differ. Some siblings will therefore have ancestral components that are simply not seen in their brothers or sisters. Some of the companies provide confidence levels or ranges, so do check these when examining your results. The range for your unexpected Swedish component at AncestryDNA might be anywhere between 0 per cent and 8 per cent. Your mystery French and German component at 23andMe that appears at the 50 per cent confidence level will probably disappear at the 90 per cent confidence level.
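One simple way to read these ranges is to treat any component whose range goes down to 0 per cent as unconfirmed. The short Python sketch below does this for a made-up set of results. The figures and the `uncertain_components` helper are hypothetical and do not reflect any company's export format.

```python
# Hypothetical estimate: region -> (reported %, low end of range, high end of range).
estimate = {
    "England and Wales": (80, 65, 95),
    "Ireland": (12, 3, 20),
    "Sweden": (4, 0, 8),
}


def uncertain_components(results):
    """Return the regions whose range reaches zero, so they may not be real."""
    return [region for region, (_, low, _) in results.items() if low <= 0]


print(uncertain_components(estimate))  # ['Sweden']
```

Here only the Swedish component is flagged, because its range of 0 to 8 per cent allows for it not being present at all.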
What are regional breakdowns?
A regional breakdown of British and European ancestry from Living DNA
In addition to providing a broad country or regional assignment, some companies are now able to assign your DNA to subregions within a particular country. AncestryDNA pioneered this approach with its Genetic Communities feature, which launched in March 2017. The communities or regions are made up of genetic networks of people sharing large chunks of DNA representing shared ancestry within the past 200 years. The networks are generated from the genetic data. The regional names are assigned to the networks based on the dominant locations of the family trees in each network.
Ancestry has more than 1,000 different regions, although most of them are in Europe and the New World. There are about 100 regions for Ireland alone, along with over 70 in the UK. In North America and Australia there are hundreds of regions representing the settler communities in the New World, with names such as ‘Mountain West Mormon Pioneers’ and ‘Australia, Queensland British Settlers’.
The regional assignments are generally fairly accurate, so if you are assigned to a region in North Tipperary, Ireland, or Fife and Angus in Scotland, it is likely that you will have some recent ancestry from this area. The regions can therefore provide important clues, particularly for people with unknown parentage.
23andMe provides a report on your recent ancestor locations that lists 10 regions where your ancestors might have lived in the past 200 years. You are matched with networks of individuals who share chunks of DNA from the same place. A map is shown highlighting the places where you have the strongest evidence of ancestry. 23andMe uses aggregated ancestral origins data from 400,000 customers to determine these locations.
If you have UK ancestry, 23andMe will assign you to one of the 165 present-day administrative regions or counties, such as Greater London or City of Bristol. The current regional assignments from 23andMe seem to reflect movement during the Industrial Revolution more than origins in rural counties in the early 1800s. However, the results can be expected to improve as more people join the database.
MyHeritage is working on a similar feature known as genetic groups, which was previewed at its user conference in Amsterdam in September. MyHeritage has a large customer base in continental Europe, Finland and Scandinavia, so we can look forward to good sub-regional resolution in these locations.
Living DNA has specialised in providing fine-scale regional breakdowns in the UK. Rather than assign you to communities or subregions, it incorporates the UK subregions in its main ancestry report. A combination of clustering and chromosome painting assigns people to subregions. If more than 20 per cent of your ancestry is assigned to Great Britain, you are compared to the 21 subregions with the results displayed on a map. Living DNA takes its UK reference data from the People of the British Isles Project, which collected samples from individuals who have four grandparents born in the same rural county.
What does ethnicity mean?
Debbie’s AncestryDNA ‘Ethnicity Estimate’
Ethnicity is a reflection of shared ancestry based on social and cultural practices. Ethnic groups may be linked by a religious affiliation, a shared linguistic heritage or a common geographical origin.
Ethnicity cannot be detected by DNA, but there is sometimes an overlap with a person’s genetic ancestry. For example, people who share the same heritage will often live in the same places and marry people from similar backgrounds.
Ethnicity was historically used as a synonym for race, but the meanings have diverged over time. Ethnicity differs from race in that individuals can choose how they wish to self-identify, and decide whether or not to express the cultural practices associated with their ethnicity.
In contrast, artificial categories of race are imposed upon individuals, and are often based on perceived physical characteristics. Racial categories such as black and white can represent a multitude of ethnicities.
‘Biogeographical ancestry’ is the scientific term used to describe the assignment of genetic ancestry to specific continents, countries or regions. However, this phrase does not easily roll off the tongue, so the companies have tried to use simpler names.
23andMe provides its customers with an Ancestry Composition report while Living DNA provides “recent ancestry results”. FamilyTreeDNA’s report is known as myOrigins.
AncestryDNA and MyHeritage describe their reports as “Ethnicity Estimates”. The term is nicely alliterative, but scientifically incorrect.
Remember that whatever your DNA results tell you about your biogeographical ancestry, it makes no difference to how you self-identify and define your own ethnicity.
How do you trace Jewish ancestry?
The Jewish diaspora is scattered around the world, but for people with Jewish ancestry a DNA test can often provide a reliable indicator of their heritage. This is because Jewish people have traditionally married within their own communities for hundreds of years. So when scientists try to assign reference populations to clusters, Jewish people stand out as a distinct genetic cluster. This means that the ancestry proportions generally correspond quite closely with your known heritage.
If you have one Jewish grandparent, you would expect about 25 per cent of your DNA to be assigned to a Jewish population. There is wide variation because of the random way in which autosomal DNA is inherited, so the Jewish percentage might be as low as 15 per cent or as high as 35 per cent. An assignment of 12.5 per cent Jewish might indicate that you have a Jewish great-grandparent. Remember that our ancestors sometimes hid their Jewish heritage because of discrimination and prejudice.
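The 25 per cent and 12.5 per cent figures come from halving the expected share with each generation. The Python sketch below shows the arithmetic. The `expected_share` helper is introduced here for illustration, and it gives the average only: beyond your parents, the amount you actually inherit from any one ancestor varies.

```python
def expected_share(generations_back):
    """Average percentage of autosomal DNA inherited from one ancestor.

    generations_back: 1 for a parent, 2 for a grandparent,
    3 for a great-grandparent, and so on.
    """
    return 100 / (2 ** generations_back)


for label, generations in [("grandparent", 2), ("great-grandparent", 3)]:
    print(f"{label}: {expected_share(generations)} per cent")
```

This prints 25.0 per cent for a grandparent and 12.5 per cent for a great-grandparent.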
Most of the Jewish population in both the UK and the USA are of Ashkenazi Jewish heritage, and it is this group that is best represented in reference populations. Ancestry can even identify six subregions in Eastern Europe for people of Ashkenazi Jewish origin. 23andMe will report Ashkenazi heritage, but no subregions.
MyHeritage has reference populations from Sephardic Jews from North Africa, Mizrahi Jews from Iran and Iraq, Yemenite Jews, Ashkenazi Jews and Ethiopian Jews. FamilyTreeDNA distinguishes between Ashkenazi and Sephardic Jewish heritage, and its updated myOrigins 3.0 report will include four Jewish populations. Living DNA does not currently have any Jewish reference populations.
What is chromosome painting?
Debbie’s 23andMe chromosome painting
If you have distinctive ancestry from a particular location then it can be interesting to map your chromosome information so that you can assign specific segments to particular ancestors. This approach works best at the continental level. For example, if most of your ancestry is British but you have an ancestor from Africa or India then the African or Indian segments will be easy to identify.
23andMe is currently the only company offering an ethnicity chromosome painting feature that allows you to download the segment data for the ethnicity assignments. FamilyTreeDNA’s myOrigins 3.0 update will also provide access to segment data and an ethnicity chromosome painting feature.
How can DNA testing help with family history research?
The most effective way of using DNA results is to combine your DNA matches with the genealogical records. However, the biogeographical ancestry reports are improving all the time, and can provide useful genealogical information.
Everyone will have their own favourite company whose results correspond most closely with their expectations, and people with European ancestry receive the most granular results. As more people test we can expect to see more reference populations added to the databases, and can look forward to updated results.