AFAOA DNA Project FAQs
• What is the purpose of the Project?
• What are DNA, chromosomes and genes?
What is the purpose of the Project?
The Project's purpose is to create a Profiles Table that can be used by Austin family researchers to "look up" their Austin line. Each line in the Table characterizes a documented Austin line through its Y-chromosome DNA profile, a set of 37 or 67 numbers. Researchers may compare their own Austin line Y-DNA numbers to those in the Table to determine whether or not they have any connection to a documented Austin line.
Who pays for DNA tests for Austin lines?
Approximately 80% of the Austin line DNA testing done to date has been paid for by individual researchers who were interested in discovering or confirming their Austin lines. Such tests are usually uncoordinated, so in order to complete the Profiles Table in a more systematic manner, the Austin-Austen DNA Project Fund was created. This Fund has selected the DNA donors and paid for about 20% of the tests done to date.
Can the Project help me identify my Austin line?
Perhaps. By comparing your DNA results to those in the Profiles Table, you will certainly learn which Austin lines you do NOT belong to. This information can also be useful, and could save you from wasting time pursuing a wrong Austin line. If you are fortunate, you might find your DNA matches a documented Austin line. The chances of such a match should increase over time as more DNA test results are added to the Profiles Table.
Can I contribute my DNA to the Project?
Yes, the Project welcomes participation by all Austin lines. Since Y-DNA passes only from father to son (and not to daughters), the DNA donor must be an Austin male. If you are not an Austin male, you must find an Austin male relative (brother, father, uncle, cousin, etc.) who is willing to contribute DNA samples for your Austin line.
How can I participate in the Project?
To participate in the Project, just visit the Family Tree DNA web site and search for the surname "Austin" to locate our Austin-Austen DNA Project page, then fill out and submit the Join Request form found there. Once this is received by the Project Administrator, you will receive instructions for submitting information concerning your known Austin line, from your earliest documented ancestor down to the DNA donor himself. After submitting your Austin line, you will be able to order a DNA Test Kit from Family Tree DNA at our low Project price.
How much will my DNA testing cost?
Family Tree DNA offers processing to characterize your DNA at 12, 25, 37, 67 or 111 different locations along the Y-chromosome. The Project encourages 37-marker processing (Test Kit Y-DNA37) for $149 plus $4 postage for participants in our Project (the normal price is $169 plus $4 postage). The less expensive 12-marker processing for $99 is of limited use, but is helpful for disproving connections. Family Tree DNA's pricing is structured so that the cost per marker falls as more markers are tested. For example, a 12-marker test costs $8.15 per marker, whereas a 111-marker test costs $3.05 per marker.
How does DNA testing work?
After receiving his DNA Test Kit from Family Tree DNA, the DNA donor simply scrapes the insides of his cheeks to obtain two samples. The kit consists of a cheek scraper and a collection tube. Reading the instructions and performing the painless cheek scraping takes about five minutes. The effect of using the scraper is about the same as brushing your cheek with a soft bristle toothbrush. A backup scraper and tube are included to ensure that a good sample is obtained. The samples are mailed to Family Tree DNA for processing, which takes about six weeks. The DNA donor will receive a copy of the results, and his numbers will appear anonymously in the Profiles Table.
My DNA matches the Profiles Table… am I done?
No. Your Y-DNA match shows only that you are somehow related to a specific documented Austin line. DNA does not show HOW you connect into that line. You still need to pursue traditional genealogical "paper trails" to establish your exact connection to the documented Austin line.
What are DNA, chromosomes and genes?
The fundamental pattern for the physical traits of a human being (or any life form) is found in a complex chemical called deoxyribonucleic acid, or DNA, which is formed of four molecules called bases, bound together by a framework of other molecules (sugars and phosphates) in a structure which looks something like a twisted ladder --- the so-called double helix. The four different bases are called adenine, cytosine, guanine and thymine, abbreviated A, C, G and T respectively. They are strung out along the "legs" of the ladder, with chemical bonds between opposite pairs forming the "rungs." It is the order of these bases along the ladder which determines all of the features and attributes of the human being.
Most human DNA is organized into strands called chromosomes, and along each chromosome are many shorter sections called genes. There are forty-six human chromosomes, and they are joined together in twenty-three chromosome pairs, with one member of each pair inherited from the mother and one from the father. (Sometimes these chromosome pairs are themselves referred to as chromosomes.) In the first twenty-two pairs, the paired chromosomes contain essentially the same genes, although the mother's version of a gene may be slightly different from the father's. In a woman, the twenty-third pair consists of two similar chromosomes, both called X-chromosomes. In a man, the twenty-third pair consists of two distinctly different chromosomes: an X-chromosome and a smaller, simpler chromosome called the Y-chromosome. These are the sex chromosomes. A child will always inherit an X-chromosome from its mother, but it may inherit an X or a Y from its father. With two X-chromosomes, the child will be a girl; with an X and a Y, it will be a boy.
The set of twenty-three different chromosomes is called the human genome. There are estimated to be from 30,000 to 80,000 genes in the genome, distributed along the twenty-three chromosomes. Except for egg cells, sperm cells and red blood cells, each of the trillions of human cells contains two copies of the genome within its nucleus, one copy from each parent, joined in pairs as described above.
The abbreviations A, C, G and T are often thought of as a 4-letter code, and the genome is sometimes thought of as a book written in this code, with twenty-three chapters called chromosomes, each chapter containing thousands of stories called genes (and other material, too; not all of a chromosome consists of genes). This analogy was used by Matt Ridley in his popular book Genome. We will refer later to short sections of DNA by writing them as strings of these letters (for example, GATGC).
How is DNA used in genealogy?
The sperm and egg cells each contain one of each kind of chromosome, which may be the one inherited from the mother or that from the father. Since there are two possibilities for each of the 23 chromosomes, there are 2^23 (2 raised to the 23rd power) different possibilities for an egg or sperm cell; this is more than 8 million. In fertilization, one of the 8 million possible sperm cells combines with one of 8 million possible kinds of egg cell, resulting in an enormous number of possible chromosome combinations in the fertilized egg. Furthermore, in the process of cell division leading to the formation of sperm and egg cells, the chromosome pairs within the man and the woman may exchange bits of themselves with their pair-partners, resulting in even more variety in the chromosomes of the resulting child. This enormous variety accounts for the essential uniqueness of each person, and of that person's DNA. This is why most DNA is useful for matching a forensic sample with a suspect, or for determining the paternity of a child. However, this same variability makes most DNA relatively useless in determining relationships more distant than siblings or first cousins, and thus relatively useless for genealogical investigations.
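The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are made up for this example, and the calculation ignores the exchange of material between pair-partners, which adds further variety.

```python
# Each of the 23 chromosomes in an egg or sperm cell is one of two
# possibilities: the copy from the mother or the copy from the father.
CHROMOSOME_PAIRS = 23

# Number of distinct egg (or sperm) cells, ignoring exchange between pairs.
gamete_kinds = 2 ** CHROMOSOME_PAIRS

# One kind of sperm cell combines with one kind of egg cell.
fertilized_combinations = gamete_kinds * gamete_kinds

print(f"Kinds of egg or sperm cell: {gamete_kinds:,}")              # 8,388,608
print(f"Combinations in a fertilized egg: {fertilized_combinations:,}")  # 70,368,744,177,664
```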
There are two exceptions to this situation, however: mitochondrial DNA, which is inherited only from the mother, and Y-chromosome DNA, which is inherited only from the father.
Mitochondria are tiny structures located outside the nucleus of the cell, which have their own DNA. At conception, only the nucleus of the sperm cell enters the egg cell; the remainder, including the sperm cell's mitochondria, is discarded. Thus, the only mitochondria inherited by the child are those of the mother. In principle, a child's mitochondrial DNA is identical to that of its mother, which in turn is identical to that of the mother's mother, and so on back through the ages. This is not quite true in practice, because of rarely occurring changes, or mutations, which may occur from one generation to the next.
A similar situation occurs with Y-chromosome DNA. Since a woman has no Y-chromosome, a boy's Y-chromosome can only come from his father. Furthermore, unlike the other chromosome pairs, the Y-chromosome exchanges no genetic information with its pair-partner the X-chromosome. Thus, but for occasional mutations, a boy's Y-chromosome is identical to his father's, and to his father's father's, and so on. This ancestral line through which the Y-chromosome is inherited will be referred to as the male line. (The term Y-line has also been used for this. In both cases, it should be understood that "father" means the biological father.)
It is the mutations which are key to genealogical investigations. If the mutation rate is known (or can be closely estimated), this information can be used to estimate the number of generations back to a common male-line ancestor for two men whose Y-chromosome DNA is identical or nearly identical (and the same can be done for two women whose mitochondrial DNA is similarly close). Since mutations occur randomly (and the "rate" is only an average rate), the answers to these questions must be stated in terms of probability. Tests of Y-chromosome DNA can be used to answer questions such as:
• What are the chances (what is the probability) that two men have a common male-line ancestor within a given number of generations?
• How many generations back must we go to have at least a 50% (or any other given percent) chance that these men have a common male-line ancestor within that number of generations?
Answers to questions such as these cannot prove with certainty that two men actually have a "recent" common male-line ancestor (within, say, at most 10-20 generations), much less identify that ancestor, but a fairly high probability may do the following things:
(1) It can add significantly to our confidence in the correctness of documentary genealogical research which has identified such an ancestor.
(2) It can point to a possibly fruitful direction for future research. For example, if two men, not known to be related, are shown to have a probable common ancestor, then searching for the ancestors of the second man may lead to finding ancestors of the first.
On the other hand, a low probability can suggest that two men are, at best, only very distantly related, that there is little hope of finding a recent common male-line ancestor, or that research identifying such an ancestor is incorrect because at some point the biological ancestor differs from the ancestor of record. In such cases, further DNA testing may help to locate where the break occurred, and thus facilitate discovery of the correct ancestor.
These probabilities were investigated by Dr. Bruce Walsh of the Department of Ecology and Evolutionary Biology, University of Arizona. He presented the mathematical derivation in the article "Estimating the time to the MRCA for the Y-chromosome or mtDNA for a pair of individuals", Genetics 158: 897-912 (2001), where MRCA stands for most recent common ancestor. Dr. Walsh's results are presented graphically on-line in his Time to Most Recent Common Ancestry Calculator.
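To give a feel for how such probabilities behave, here is a rough Python sketch. It is not Dr. Walsh's calculator. It covers only the simplest case, two men who match exactly at every marker tested, and it assumes a single average mutation rate, no repeated or back mutations at a marker, and no prior knowledge about how far back the common ancestor lived. Under those assumptions the chance of a common male-line ancestor within t generations is 1 - exp(-2 x rate x markers x t). The function names and the default rate of 0.002 are choices made for this illustration.

```python
import math


def prob_common_ancestor_within(generations, markers_matched, mutation_rate=0.002):
    """Approximate chance that two men who match exactly on `markers_matched`
    markers share a male-line ancestor within `generations` generations.

    Simplifying assumptions (illustration only): one average mutation rate
    for every marker, no back mutations, and a flat prior on the time back
    to the common ancestor.
    """
    # The factor 2 counts both lines of descent from the common ancestor.
    return 1.0 - math.exp(-2.0 * mutation_rate * markers_matched * generations)


def generations_for_probability(target, markers_matched, mutation_rate=0.002):
    """Generations needed to reach a `target` probability (between 0 and 1)."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    return -math.log(1.0 - target) / (2.0 * mutation_rate * markers_matched)


# Example: an exact match on 25 markers, looking 12 generations back.
p = prob_common_ancestor_within(12, 25)
print(f"Chance of a common ancestor within 12 generations: {p:.2f}")

# Example: how far back for a 50% chance with the same 25-marker match?
g = generations_for_probability(0.5, 25)
print(f"Generations for a 50% chance: {g:.1f}")
```

Matching on more markers, or assuming a higher mutation rate, pushes the probable common ancestor closer to the present. Near matches (one or two differences) need the fuller treatment in Dr. Walsh's article.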
What are mutations?
Certain short sections of a DNA molecule may on rare occasions undergo accidental changes in the sequence of bases (A, C, G and T) as the chromosome pairs separate for the formation of sperm and egg cells and then recombine in the fertilized egg. In one kind of mutation, a single base may be replaced by a different base (an A replaced by a G, for example). Such changes occur so rarely that they are very unlikely to be observed within the time frame of interest to genealogists (about 10-20 generations), and we are not concerned with them.
A second kind of mutation occurs at what is called a short tandem repeat (STR): a sequence of a few bases which is repeated a relatively small number of times. An example would be ACTATACTATACTATACTATACTAT, in which the string ACTAT is repeated 5 times. (The notation [ACTAT]5 has been used for this repetition). In a mutation at an STR, an extra repetition may be inserted, or one may be deleted. The Y-chromosome contains many of these STRs, each of which is called a marker. Because of mutations, the number of repetitions at a particular marker may vary from one person to the next.
Each STR marker used in DNA testing has been assigned an identifying number, called its DYS number, where DYS stands for DNA Y-chromosome Segment. The number of repetitions at a marker is called the allele of that marker.
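The idea of counting repetitions at an STR can be sketched in Python. This is only an illustration of the concept using the ACTAT example above; laboratories do not read alleles this way, and the function name and the flanking letters in the second example are made up.

```python
def count_tandem_repeats(sequence, motif):
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    step = len(motif)
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Keep stepping forward while another copy of the motif follows.
        while sequence.startswith(motif, pos):
            run += 1
            pos += step
        best = max(best, run)
    return best


# The example from the text: ACTAT repeated 5 times.
original = "ACTATACTATACTATACTATACTAT"
print(count_tandem_repeats(original, "ACTAT"))  # 5

# A mutation that inserts one extra repetition raises the allele to 6.
# The flanking letters GG and CC are hypothetical.
mutated = "GG" + "ACTAT" * 6 + "CC"
print(count_tandem_repeats(mutated, "ACTAT"))  # 6
```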
Unlike the single-letter replacement type of mutation (whose technical name is single nucleotide polymorphism, or SNP), which occurs so rarely as to be an almost unique event, the mutation rate for STRs has been estimated to be between 1 in 500 and 1 in 300 generations, or about 0.002 to 0.003 mutations per generation at a particular marker. This is "rapid" enough to allow the observation of mutations within a genealogical time span of 10-20 generations. (Although markers probably have slightly different mutation rates, a single average rate was used for all markers as a simplifying assumption in Bruce Walsh's computations.)
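A short Python sketch shows why this rate is "rapid" enough to matter. It assumes the same single average rate for every marker and treats each marker in each generation as an independent chance of mutation; the function names are made up for this illustration.

```python
def expected_mutations(markers, generations, rate_per_marker=0.002):
    """Average number of mutations expected along one line of descent."""
    return markers * generations * rate_per_marker


def prob_at_least_one_mutation(markers, generations, rate_per_marker=0.002):
    """Chance that at least one tested marker has mutated along one line
    of descent, treating every marker and generation as independent."""
    chances = markers * generations
    return 1.0 - (1.0 - rate_per_marker) ** chances


# Example: a 37-marker profile passed down through 10 generations.
print(f"Expected mutations: {expected_mutations(37, 10):.2f}")
print(f"Chance of at least one: {prob_at_least_one_mutation(37, 10):.2f}")
```

With the lower rate of 0.002, a 37-marker profile followed through 10 generations is expected to pick up somewhat less than one mutation on average, so a single difference between two documented cousins is unremarkable.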
What do the numbers mean? (Interpreting the Profiles Table)
The persons whose DNA was tested are identified in the table by their kit number (assigned by Family Tree DNA), the first three generations of the male line from which they were descended (starting with the first known ancestor), and the number of additional generations to the person tested. Beyond these identification columns come from 12 to 37 columns of numbers, one for each marker tested. (The Austin-Austen DNA Project has 37 or 67 markers in most of the tests so far.) Each column of numbers is headed by the DYS number of the marker, and the column shows the allele (number of repetitions) of that marker, for each person tested.
A row of numbers in the table shows the Y-chromosome DNA profile of a single individual, a listing (in a particular order) of the alleles of that person's DNA at each of the markers tested. The technical name for this profile is haplotype.
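Comparing two rows of the table amounts to comparing two haplotypes marker by marker. The Python sketch below shows one simple way to do this. The allele values are invented for illustration and are not taken from the Profiles Table, and the function name is hypothetical. It counts the markers that differ and the total number of repetition "steps" between the two profiles.

```python
def compare_haplotypes(first, second):
    """Compare two haplotypes given as {marker name: allele} dictionaries.

    Returns the list of markers tested in both profiles whose alleles
    differ, and the total difference in repetitions across those markers.
    """
    shared = sorted(set(first) & set(second))
    differing = [m for m in shared if first[m] != second[m]]
    total_steps = sum(abs(first[m] - second[m]) for m in differing)
    return differing, total_steps


# Hypothetical profiles (allele values invented for this example).
donor_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS449": 29}
donor_b = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11, "DYS449": 30}

differing, steps = compare_haplotypes(donor_a, donor_b)
print(differing)  # ['DYS449']
print(steps)      # 1
```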
What does the Profiles Table tell us?
The current table shows that some groups of people, listed in adjacent rows, have nearly identical haplotypes. Two of these groupings illustrate some of the things we can learn from DNA testing.
Three of the listed descendants of Samuel Austin of Boston have exactly the same haplotype, and a fourth (descended through Zebediah) differs by just one number at DYS 449. This probably represents a single mutation, the insertion of an extra repetition, at some point in the line of descent. (Since mutations can delete or add a repetition, canceling each other out, or change the number of repetitions by more than one in a rare event, it is possible but highly unlikely that more than one mutation has occurred.) A "non-paternity event" would probably have produced several other differences in the haplotype, since the biological father would be unlikely to have nearly the same DNA as the father of record. The fact that four presumed descendants of Samuel have nearly identical DNA adds a great deal of confidence in the correctness of the documentary research which traced these lines of descent.
What is surprising is that a fifth person, descended from William Austin of Maryland, has the same haplotype as the Samuel line! There is no known relationship between William and Samuel, but from Walsh's graphs it can be estimated that their tested descendants probably have a most recent common male-line ancestor within 12 generations, with a probability of about 75%. Since Samuel and William are already 8-11 generations back from the subjects, there is a very good chance that they have a common ancestor only a few generations further back, probably in England. Samuel researchers have been at a dead end finding his ancestry, but it is possible that finding a William ancestor would solve the problem. (Or vice versa.)
Another grouping of interest is the four presumed descendants of Richard Austin of Charlestown, MA. Three of the subjects have the same haplotype and are therefore almost certainly descended from Richard, but the fourth has a distinctly different haplotype, differing at 19 of the 25 markers, often by more than one step. This person is almost certainly not descended from Richard, at least by the male line. Whether there is an error in the documentation or a non-paternity event remains to be seen. Testing other descendants of Nathaniel whose lines separated from this one at a later generation might give some information about where the break or error occurred, and this in turn might help in discovering the nature of the discrepancy and in finding the true male-line ancestry of the test subject.
It may be noted that a similar problem occurs among the three presumed descendants of Robert Austin of Kingstown, RI. It may also be noted that there are other groups with nearly identical haplotypes (the rows colored in blue) for whom no common male-line ancestor is presently known. This illustrates another outcome from the DNA project: the grouping of presently unattached Austin lines by haplotype into larger groups which may actually have a recent common male-line ancestor: tying up loose ends, so to speak.
Where can I go to learn more?
There are a number of general interest articles and DNA project reports available on the internet which may present an alternate explanation of topics covered briefly here, or present them in more detail, and may also cover many topics not included here. Some of these in turn will provide links to yet other reports. Here is a sampling of such articles.
(1) Blair DNA Project, DNA 101: Y-Chromosome Testing
(2) The Mumma Surname DNA Project
(4) Time to Most Recent Common Ancestry Calculator
(5) Contexo.Info (a long and informative presentation, with many interesting illustrations and graphics, well worth a look).