Cutting through the DNA gobbledegook

The problem with DNA is that the more you read, the more you get confused. Here is an attempt to get a perspective on the thing and to explain the terms that crop up again and again. We start on the assumption that you know that DNA is the substance that conveys the genetic blueprint from parents to children. Also, you probably know that it has the form of a double helix and there is a copy inside all our cells except red blood cells.

DNA Basics

DNA stands for deoxyribonucleic acid. In humans it is packaged into 23 pairs of chromosomes. Each chromosome carries a long sequence built from four bases: adenine, guanine, thymine and cytosine, known by their initial letters A, G, T and C. The bases occur as base pairs, C with G and T with A. The famous double helix is two chains of sugar and phosphate which hold the base pairs between them. A sequence of DNA can be written as a string of letters such as AATTGCCTTTT.
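The pairing rule is easy to show in a few lines of Python. This is only a sketch: the names PAIRS and complementary_strand are made up for illustration, and it simply reads off the base that sits opposite each letter of the example sequence.

```python
# Pairing rule from the text: C pairs with G, T pairs with A.
# PAIRS and complementary_strand are illustrative names, not standard ones.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return the base found opposite each base of `strand`, in the same order."""
    return "".join(PAIRS[base] for base in strand)


print(complementary_strand("AATTGCCTTTT"))  # TTAACGGAAAA
```

Because each base has exactly one partner, knowing one chain of the helix is enough to write down the other.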

Getting the thing into perspective

Each human has 3 thousand million bases, and trying to find out each individual's complete sequence would take years and years at best. Different applications need to look at different parts of this complete sequence. The part required to tell if the blood on the victim's carpet is from the defendant is not the same as the DNA that follows the female line and tells us that humankind is descended from 6 females who came out of Africa.

Y Chromosome DNA

The DNA that is the most useful for a single name study lies in the Y chromosome. Remember there are 23 pairs of chromosomes; the Y chromosome is half of the X-Y sex pair in a male, or 1/46 of the total by count. But the Y chromosome is smaller than the average (after all, just what does a father need to pass down to his son: how to operate a video remote control, the principles of selective deafness, how to re-assemble a kitchen appliance he has never seen before, how to accurately estimate bust measurement from a 10-millisecond glance and a fine sensitivity to the taste of fermented hops, and that is about it!). Judging by the illustrations, the Y chromosome is about 25% of the average size, so only about 1/46 x 1/4, or roughly 1/200, of the DNA is contained in the Y chromosome. That is still around 15 million bases.
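For anyone who wants to check the sums, here is the same back-of-envelope estimate in Python. The inputs are just the rough figures quoted in this article (3 thousand million bases, 46 chromosomes, a Y chromosome a quarter of the average size), not measured values, and the variable names are invented for the example.

```python
# Rough figures taken from the article, not measured values.
TOTAL_BASES = 3_000_000_000   # "3 thousand million" bases
CHROMOSOMES = 46              # 23 pairs
RELATIVE_SIZE = 0.25          # Y is "about 25% of the average size"

# Share of all the DNA that sits in the Y chromosome.
y_fraction = RELATIVE_SIZE / CHROMOSOMES
y_bases = TOTAL_BASES * y_fraction

# The article rounds the share to 1/200 to get a tidy number.
y_bases_rounded = TOTAL_BASES / 200

print(f"share of DNA: about 1/{round(1 / y_fraction)}")
print(f"bases in Y: about {y_bases / 1e6:.1f} million")
print(f"using 1/200: {y_bases_rounded / 1e6:.0f} million")
```

The unrounded share comes out at about 1/184, or a little over 16 million bases, so the 15 million quoted above is the same estimate with the fraction rounded to 1/200.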

Why don't men all have the same Y chromosome?

The Y chromosome is handed down from father to son, so why do all men not have the same Y chromosome as Adam? The answer to that is mutation. Variations in the Y chromosome from one individual to another are called polymorphisms. The Y chromosome, like all DNA, contains not only genic (coding) DNA but also about 97% that does not code for anything and does not seem to have any useful purpose. It is therefore called junk DNA or non-coding DNA.
Alec Jeffreys (1984) was the first to notice that at many points or loci (singular locus) on chromosomes, sequences of DNA would repeat but the number of repeats differed between individuals. In 1995 Mark Jobling found a type called microsatellites which were highly variable and could be used to distinguish between different males. These microsatellites are short sequences between two and five bases long which repeat over and over again, e.g. GATAGATAGATAGATAGATA, which is GATA repeated five times. The number returned in a genealogical DNA test is the number of repeats of the sequence.
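To make the idea of "number of repeats" concrete, here is a small Python sketch that counts the longest unbroken run of a repeated unit in a sequence. The function name count_repeats and the flanking letters in the second example are invented for illustration; real laboratories measure repeats by other means, so treat this only as a picture of what the number means.

```python
def count_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    step = len(motif)
    # Try every starting position and count consecutive copies from there.
    for start in range(len(sequence) - step + 1):
        run = 0
        pos = start
        while sequence.startswith(motif, pos):
            run += 1
            pos += step
        best = max(best, run)
    return best


print(count_repeats("GATAGATAGATAGATAGATA", "GATA"))  # 5
print(count_repeats("CCGATAGATAGATATT", "GATA"))      # 3
```

Two men whose results at the same marker are 5 and 3 would differ by two repeats at that locus.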

Quite a few sites of these microsatellites (sometimes referred to as markers) have been found, and each one mutates about once in every 500 father-to-son copies, although some are known to mutate faster. If results are obtained from 25 such sites then, down a line of descent from father to son, a mutation would be expected on average once every 500/25 = 20 generations. Statistics have a part to play here: this is only an average, and since the number of mutations is Poisson distributed, instances of TWO mutations from a father to son have been encountered. Each site of a microsatellite is numbered, but the numbers follow the order in which the sites were first identified and NOT their physical order along the DNA chain.
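The arithmetic above can be sketched in Python as follows. It assumes, as the text does, a rate of 1 in 500 per marker per generation and 25 markers, and it treats the number of mutations in one father-to-son step as Poisson distributed; the names used are made up for the example.

```python
import math

# Figures quoted in the text; real rates vary from marker to marker.
MUTATION_RATE = 1 / 500   # per marker, per father-to-son copy
MARKERS = 25

# Average number of mutations across all 25 markers in one generation.
expected_per_generation = MUTATION_RATE * MARKERS
generations_per_mutation = 1 / expected_per_generation


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when lam are expected on average."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


p_none = poisson_pmf(0, expected_per_generation)
p_one = poisson_pmf(1, expected_per_generation)
p_two = poisson_pmf(2, expected_per_generation)

print(f"one mutation every {generations_per_mutation:.0f} generations on average")
print(f"no mutation in a generation:   {p_none:.3f}")
print(f"exactly one in a generation:   {p_one:.4f}")
print(f"exactly two in a generation:   {p_two:.4f}")
```

On these assumptions a double mutation in a single generation has a probability of roughly 0.0012, which is rare but far from impossible once thousands of father-son pairs have been tested.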

Typical test result

The labels across the top identify the marker, or locus, of each microsatellite. DYS stands for DNA, Y chromosome, Segment. DYS393, for example, is widely used by all the firms who offer DNA testing for genealogy. The complete set of results is called a haplotype.
Typical 12-marker DNA test result

DYS393  DYS390  DYS394  DYS391  DYS385a  DYS385b  DYS426  DYS388  DYS439  DYS389i  DYS389ii  DYS392
13      24      14      12      11       15       12      12      12      13       29        13
Markers 385a and 385b are known to mutate faster than the average and therefore are helpful to split family groups into separate branches.
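As an illustration of how two results might be set side by side, here is a short Python sketch. The first haplotype is the typical result shown above; the second, called relative here, is entirely hypothetical and differs at DYS385b only. The function name differing_markers is likewise invented for the example.

```python
MARKERS = [
    "DYS393", "DYS390", "DYS394", "DYS391", "DYS385a", "DYS385b",
    "DYS426", "DYS388", "DYS439", "DYS389i", "DYS389ii", "DYS392",
]

# The typical 12-marker result from the article.
sample = dict(zip(MARKERS, [13, 24, 14, 12, 11, 15, 12, 12, 12, 13, 29, 13]))

# Hypothetical second result: identical except for one fast-moving marker.
relative = dict(sample, DYS385b=16)


def differing_markers(a: dict, b: dict) -> list:
    """List the markers at which two haplotypes have different repeat counts."""
    return [marker for marker in MARKERS if a[marker] != b[marker]]


print(differing_markers(sample, relative))  # ['DYS385b']
```

A difference confined to a fast-mutating marker such as DYS385b is the kind of result that can separate two branches of the same family.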