Materials & Methods


**Estimating the central TMRCA using the ASD method**

The formula above shows the **A**verage **S**quared **D**ifference approach used in this application to calculate the central TMRCA estimate for a given set of YDNA STR haplotypes. It draws its computational inspiration from the statistical analysis section of the Behar et al. (2003) paper and its derivative Matlab/Octave program Ytime, but with some differences/modifications:

Computing the unbiased estimate **T** (seen in square brackets above) by partitioning the ASD over each locus using the specific mutation rate for that particular locus (**m**), and then finding the mean across all the partitions, versus using a single mean mutation rate across all loci to divide into an overall ASD that was computed across all markers. Note that these two methods would have identical outcomes if all locus-specific mutation rates were set to be equivalent (for instance when using the Zhivotovsky rates).

Simultaneous calculation of TMRCA estimates from 6 different sources of mutation rate estimates (see below for specifics).

The use of both Median and Modal repeats to represent the ancestral haplotype.

The ability to explicitly specify any marker combination from the full marker set, and the automatic extraction of, and computation with, these specified markers on the given haplotype dataset.

The absence of confidence interval calculations.
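The difference between the partitioned and the pooled calculation can be illustrated with a small Python sketch. It is not the application's code: the function names, the haplotype values and the mutation rates are all hypothetical, the ASD is assumed to be measured against a median or modal ancestral haplotype, and any correction term that the formula in square brackets may contain is left out. Results are in generations.

```python
from statistics import mean, median, mode

def ancestral_haplotype(haplotypes, method="median"):
    """Per-locus ancestral repeat count, taken as the median or the mode."""
    pick = median if method == "median" else mode
    return [pick(locus_values) for locus_values in zip(*haplotypes)]

def asd_per_locus(haplotypes, ancestor):
    """Average squared difference from the ancestral repeat, one value per locus."""
    n = len(haplotypes)
    return [sum((h[j] - a) ** 2 for h in haplotypes) / n
            for j, a in enumerate(ancestor)]

def generations_partitioned(asd, rates):
    """Divide each locus ASD by its own rate, then average across loci."""
    return mean(a / m for a, m in zip(asd, rates))

def generations_pooled(asd, rates):
    """Divide the overall ASD by a single mean mutation rate."""
    return mean(asd) / mean(rates)

# Hypothetical data: 4 haplotypes typed at 3 loci (repeat counts).
haplotypes = [[13, 24, 14], [13, 25, 14], [14, 24, 15], [13, 23, 14]]
# Hypothetical per-locus mutation rates (per generation), not taken from any source.
rates = [0.002, 0.004, 0.001]

ancestor = ancestral_haplotype(haplotypes, method="median")
asd = asd_per_locus(haplotypes, ancestor)
print(generations_partitioned(asd, rates), generations_pooled(asd, rates))

# With one shared rate at every locus, the two estimates coincide.
equal_rates = [0.002] * 3
print(generations_partitioned(asd, equal_rates), generations_pooled(asd, equal_rates))
```

With unequal rates the two functions return different values, while with a single shared rate they agree, which is the equivalence noted for the Zhivotovsky rates.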

**Mutation Rate Sources**

Theoretically, any published mutation rate set could be used in the application. However, most mutation rate estimates cover only a limited set of markers, so a tradeoff has to be made between maximizing the number of mutation rate sources and maximizing the number of markers available for analysis. Therefore, the application uses marker-specific mutation rates* compiled only from the following sources:

Marko Heinila Mutation Rates: Developed by a genealogical community member, see this thread - Download

Stafford Bayesian Mutation Rates: Essentially a compilation of other mutation rates - Download

The effective mutation rate estimate of Zhivotovsky: - Download

*Note: markers not found in the publication were normalized using the pedigree relative rates, see procedure here.
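The tradeoff between the number of rate sources and the number of usable markers can be sketched in Python as a set intersection. The source names and rate values below are placeholders invented for illustration; only the idea (a marker is usable when every selected source provides a rate for it) comes from the text.

```python
def shared_markers(rate_sets, sources=None):
    """Markers that have a mutation rate in every selected source."""
    selected = sources if sources is not None else list(rate_sets)
    marker_sets = [set(rate_sets[name]) for name in selected]
    return sorted(set.intersection(*marker_sets))

# Hypothetical rate tables: marker name -> placeholder rate per generation.
rate_sets = {
    "source_a": {"DYS19": 0.002, "DYS390": 0.003, "DYS391": 0.002, "DYS393": 0.001},
    "source_b": {"DYS19": 0.002, "DYS390": 0.003, "DYS393": 0.001},
    "source_c": {"DYS19": 0.002, "DYS391": 0.002, "DYS393": 0.001},
}

# Requiring all three sources leaves fewer markers than requiring two.
print(shared_markers(rate_sets))
print(shared_markers(rate_sets, ["source_a", "source_b"]))
```

Adding a source can only keep the shared marker set the same size or shrink it, which is why the number of sources has to be limited.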

**Pre-defined Marker Lists**

While any marker combination drawn from the full marker list can be used with this application, the pre-defined marker lists included in the application are listed below. These lists come from different publications and are a convenient way to conduct TMRCA estimates for comparison:

Full_marker_list.txt – The maximum number of markers that overlap with all mutation rate sets.

8_Chiaroni_markerlist.txt - http://www.nature.com/ejhg/journal/v18/n3/full/ejhg2009166a.html

9_Buckova_markerlist.txt - http://onlinelibrary.wiley.com/doi/10.1002/ajpa.22236/abstract

10_Zhiv_markerlist.txt - http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1181912/

14_Plaster_markerlist.txt - http://discovery.ucl.ac.uk/1331901/

22_Bird_markerlist.txt – http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0048638

17_Applied_Biosystems_Y_Filer.txt – http://www.cstl.nist.gov/strbase/kits/Yfiler.htm

Where,

L = Average Branch Length (SNPs)

M = Substitution Mutation Rate (Per Site Per Generation)

G = Years Per Generation

S = Total Sequence Length (Base Pairs)

The simple linear formula above is used to estimate the TMRCA for any node on the Y chromosome phylogenetic tree. Given a total sequence length, S, of the Y chromosome, the average branch length, L, can be estimated by counting the total number of SNPs below the node of interest and dividing by the total number of branches under the node. The years per generation, G, is an assumption that usually varies between 20 and 40 years, while the substitution mutation rate, M, can be derived using several different methodologies, often with different outcomes. This application uses mutation rates from the sources shown below and will look to expand on them as more sources arise.
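As a worked illustration, the calculation can be sketched in Python. The formula itself is not reproduced in the text, so the sketch assumes the form TMRCA (years) = L × G / (M × S), which is what the definitions of L, M, G and S imply: M × S is the expected number of SNPs per generation over the sequenced region. The function names and the input values (SNP count, branch count, sequence length) are hypothetical.

```python
def average_branch_length(total_snps_below_node, branches_under_node):
    """L: total SNPs below the node divided by the number of branches under it."""
    return total_snps_below_node / branches_under_node

def tmrca_years(avg_branch_length, mutation_rate, years_per_generation, sequence_length):
    """Assumed form: L * G / (M * S)."""
    expected_snps_per_generation = mutation_rate * sequence_length
    generations = avg_branch_length / expected_snps_per_generation
    return generations * years_per_generation

# Hypothetical node: 120 SNPs spread over 4 branches, 10 million bp sequenced.
L = average_branch_length(120, 4)
estimate = tmrca_years(L, mutation_rate=2.46e-8, years_per_generation=30,
                       sequence_length=10_000_000)
print(round(estimate))
```

With these made-up inputs and the central rate of 2.46 × 10⁻⁸ from the table below, the estimate comes to roughly 3,660 years. Because the formula is linear, doubling L or G doubles the estimate, and doubling M or S halves it.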

Publication | Method | Central rate (per site per generation) | 95% CI lower | 95% CI upper
---|---|---|---|---
Kuroki et al. (2006) | Chimp Comparison | 4.50 × 10⁻⁸ | 2.30 × 10⁻⁸ | 6.30 × 10⁻⁸
Xue et al. (2009) | Deep-rooting Pedigree | 3.00 × 10⁻⁸ | 8.90 × 10⁻⁹ | 7.00 × 10⁻⁸
Poznik et al. (2013)* | Founding Migrations | 2.46 × 10⁻⁸ | 2.16 × 10⁻⁸ | 2.76 × 10⁻⁸
Francalacci et al. (2013)* | Founding Migrations | 1.59 × 10⁻⁸ | 1.26 × 10⁻⁸ | 2.10 × 10⁻⁸
Mendez et al. (2013)** | Autosomal Adjustment | 1.85 × 10⁻⁸ | 8.78 × 10⁻⁹ | 2.83 × 10⁻⁸
Scozzari et al. (2013)* | Autosomal Adjustment | 1.92 × 10⁻⁸ | 1.41 × 10⁻⁸ | 2.46 × 10⁻⁸
Fu et al. (2014)* | Ancient DNA | 2.40 × 10⁻⁸ | 2.10 × 10⁻⁸ | 2.70 × 10⁻⁸
Karmin et al. (2015)* | Ancient DNA | 2.22 × 10⁻⁸ | 1.89 × 10⁻⁸ | 2.85 × 10⁻⁸
Trombetta et al. (2015)* | Ancient DNA | 2.148 × 10⁻⁸ | N/A | N/A

** Used 30, 20 and 40 years/generation, respectively, to convert the central, lower and upper rate estimates.
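The conversion described in the ** note can be sketched in Python as follows. The per-year values are not quoted from the publication; they are back-calculated here by dividing the table's per-generation figures for the ** row by the stated generation lengths, and the function names are hypothetical.

```python
# Generation lengths from the ** note: central, lower and upper bounds.
YEARS_PER_GENERATION = {"central": 30, "lower": 20, "upper": 40}

def to_per_generation(per_year):
    """Per-site-per-year rates -> per-site-per-generation rates."""
    return {k: per_year[k] * YEARS_PER_GENERATION[k] for k in YEARS_PER_GENERATION}

def to_per_year(per_generation):
    """Inverse conversion, used here to recover the implied per-year rates."""
    return {k: per_generation[k] / YEARS_PER_GENERATION[k] for k in YEARS_PER_GENERATION}

# Per-generation values as listed in the table row marked **.
mendez_per_generation = {"central": 1.85e-8, "lower": 8.78e-9, "upper": 2.83e-8}

implied_per_year = to_per_year(mendez_per_generation)
round_trip = to_per_generation(implied_per_year)
print(implied_per_year)
```

Pairing the shortest generation length with the lower bound and the longest with the upper bound widens the per-generation interval relative to using 30 years throughout.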