If the chosen data type uploads successfully, a button captioned "Process TMRCA for *Your_Filename*" will appear below; click that button.
Assign a minimum sample size for each subset to be analyzed, or leave the field blank to include all subsets.
Choose a set of markers from the pre-defined marker list or upload your own marker list file**.
Calculations are performed on an uploaded marker list only if the check box is ticked.
Click "Compute TMRCA" and review the results, or click on the link to download the input data for Fluxus Network Software©,
or click on the link "Clear and Upload New Y-STR file" to restart the process with a new file.
*A CSV Y-STR input data file should first be prepared in the same format as any of the Example files shown here. More specifically:
The best nomenclature to use for DYS names with this application can be seen in this file. However, the application can also recognize other formats, such as those used by
NIST. If a marker is not included in the calculation but can be found in the master marker list, manually change
the unidentified marker(s) in your *.csv file to match the format of the corresponding marker(s) in the master marker list.
Each row (except the first) should represent one sample/haplotype.
Each column (except the first) should represent repeats for one marker/DYS#.
The first column should contain sample identifiers that can be used to group haplotypes together, e.g. geography, language, haplogroup, etc. It is this column that will be used to
filter subsets of haplotypes from the main dataset on request. Note: if any of the following characters are found in an identifier, they will be replaced with the "_" (underscore) character
before proceeding to the analysis section: ' ' (space), '*' (asterisk), '/' (forward slash), ',' (comma), '( )' (parentheses).
The cell in the first row and first column should contain the dataset's name; this name will be used throughout the analysis.
Any haplotype that contains null values or non-integers at any of the loci will be removed automatically before the analysis begins.
The file MUST be a *.csv file with commas used as field delimiters.
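The formatting rules above can be sketched in Python. This is only an illustration of the described behavior, not the application's own code; the function names `sanitize_identifier` and `load_ystr_csv` and the sample data are hypothetical.

```python
import csv
import io

# Characters the instructions say are replaced with "_" in identifiers.
REPLACED_CHARS = " */,()"

def sanitize_identifier(identifier):
    """Replace each listed character with an underscore."""
    return "".join("_" if ch in REPLACED_CHARS else ch for ch in identifier)

def load_ystr_csv(text):
    """Parse comma-delimited Y-STR text.

    Returns (dataset_name, markers, haplotypes), where haplotypes is a list
    of (identifier, [repeat counts]) tuples. Rows with a missing or
    non-integer repeat at any locus are dropped, as described above.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    dataset_name, markers = header[0], header[1:]
    haplotypes = []
    for row in data:
        cells = [c.strip() for c in row[1:]]
        # Rejects "", "23.5", "null" and any other non-integer cell.
        valid = all(c.isascii() and c.isdigit() for c in cells)
        if len(cells) != len(markers) or not valid:
            continue
        haplotypes.append((sanitize_identifier(row[0]), [int(c) for c in cells]))
    return dataset_name, markers, haplotypes

# Hypothetical input: two of the four haplotypes are invalid.
sample = (
    "MyDataset,DYS19,DYS390,DYS391\n"
    "North Group,14,24,10\n"
    "R1b*/West,14,,11\n"
    "South (coastal),15,23.5,10\n"
    "East,13,25,11\n"
)
name, markers, haplotypes = load_ystr_csv(sample)
print(name, markers, haplotypes)
```

Running a check like this on a file before uploading shows which haplotypes would be dropped and how the identifiers will appear in the results.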
*Up to 67 markers of FTDNA-style Y-STR haplotypes, in this order, can be correctly processed for analysis provided that the following steps are followed:
Paste haplotype repeats that are tab separated, for instance items copied from Excel spreadsheets or from 'Y-DNA Classic Chart' tables of FTDNA projects. Space-separated repeats are not accepted. When copying a set of Y-STR haplotypes, do not include the header with the locus names; simply highlight the repeats and up to five columns of IDs and paste them into the text box. A minimum of one ID column is required.
Specify the column where the repeat data starts. If 'Auto' is selected, the app will attempt to find the column itself, provided that at least 5 different haplotypes are supplied. It is advisable, however, to specify the starting column manually, because if this step is done incorrectly the analysis will either be erroneous or not be allowed to proceed.
If the pasted haplotypes include repeats in multi-copy format, i.e. (a-b) for DYS385, DYS459, YCAII, CDY, DYF395S1, DYS413 and (a-b-c-d) for DYS464, as the FTDNA project tables do, then the "Parse Multi-Copy Marker" check box needs to be checked. If the multi-copy loci have already been separated in the pasted text, there is no need to check this box. Note that only markers a, b, c and d of DYS464 will be parsed; if repeats exist for DYS464 e, f and g, they will be ignored. Conversely, if pasting text with the multi-copy repeats already parsed and with data that includes repeat values for markers 464 e, f and g, completely delete those columns before proceeding.
If you want to analyze a specific FTDNA marker panel then choose it here. If no choice is made then data for the default maximum number of markers (= 67) will be generated.
Choose an ID column that the calculator will use for filtering the data during analysis. If no choice is made, all available ID columns will be concatenated by default and used as a label for each haplotype; consequently, filtering will not be possible during analysis.
Give the dataset a name. If no name is given, the name 'Generic_YSTR_Data' will be assigned to the dataset.
Click on the Parse button to process the pasted data. If all goes well, a link entitled 'Review Generated File' will appear, which displays the formatted data that will be used to perform the TMRCA computations. It is good practice to review this file before proceeding to the computations.
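As a rough illustration of the multi-copy parsing step, the sketch below splits one pasted, tab-separated row and expands hyphenated cells. It is not the application's parser: the function `parse_pasted_row`, its parameters, and the `a`/`b`/`c`/`d` suffix naming of the output columns are assumptions made for this example.

```python
# Multi-copy loci named in the instructions and the number of copies kept.
MULTI_COPY = {"DYS385": 2, "DYS459": 2, "YCAII": 2, "CDY": 2,
              "DYF395S1": 2, "DYS413": 2, "DYS464": 4}

def parse_pasted_row(line, markers, repeat_start):
    """Split one tab-separated pasted row into (ids, {marker: repeat}).

    `markers` lists the locus name of each repeat column in order, and
    `repeat_start` is the 1-based column where the repeats begin. Both are
    hypothetical parameters for this sketch.
    """
    cells = line.rstrip("\n").split("\t")
    ids, repeats = cells[:repeat_start - 1], cells[repeat_start - 1:]
    parsed = {}
    for marker, cell in zip(markers, repeats):
        copies = MULTI_COPY.get(marker)
        if copies is None:
            parsed[marker] = cell  # single-copy locus, kept as pasted
            continue
        # "11-14" becomes DYS385a=11, DYS385b=14. zip() stops after the
        # allowed number of copies, so DYS464 e, f and g are ignored.
        for suffix, value in zip("abcd"[:copies], cell.split("-")):
            parsed[marker + suffix] = value
    return ids, parsed

# Hypothetical row: two ID columns, then DYS393, DYS385 and DYS464 (5 copies).
row = "K123\tIreland\t13\t11-14\t15-15-17-17-18"
ids, parsed = parse_pasted_row(row, ["DYS393", "DYS385", "DYS464"], 3)
print(ids, parsed)
```

The fifth DYS464 value (18) is discarded, which matches the rule that only copies a to d are parsed.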
Marker list files are simple text files (*.txt) with a single DYS
number written per row. Several examples of marker text files can be
found in the Pre-Defined Markers section of the Materials & Methods page.
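A marker list of this kind is straightforward to read and apply. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the exact marker spelling expected in the file (for example `DYS19`) should be checked against the pre-defined examples.

```python
def read_marker_list(text):
    """Return the marker names from a marker list, one name per row."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def restrict_to_markers(markers, repeats, wanted):
    """Keep only the repeats whose marker appears in the wanted list."""
    wanted_set = set(wanted)
    kept = [(m, r) for m, r in zip(markers, repeats) if m in wanted_set]
    return [m for m, _ in kept], [r for _, r in kept]

# Hypothetical marker file content; the blank row is skipped.
marker_file = "DYS19\nDYS390\n\nDYS393\n"
wanted = read_marker_list(marker_file)
kept_markers, kept_repeats = restrict_to_markers(
    ["DYS19", "DYS385a", "DYS390"], [14, 11, 24], wanted)
print(wanted, kept_markers, kept_repeats)
```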
The General results area below the submission form is divided into two parts:
The first part
establishes baseline information for the entire dataset including the active STR and marker files, the sample size, DYS#'s
used and the mean TMRCAs.
The second part tabulates the
results for each unique subset. The first column of the table simply shows the name of the subset as
assigned in the filter column of the Y-STR file. The second column is for the sample size of the subset. The third column shows the
ratio of the number of haplotypes in the subset, relative to the total number of haplotypes in the entire dataset. The fourth column,
Z-TMRCA, shows the mean TMRCA in generations using the Zhivotovsky rates. The last column, P-TMRCA, shows the mean TMRCA
in generations using all the available pedigree rates. All of the columns are sortable in ascending or descending order.
Clicking on any subset's name in the filter column will open a new page with the detailed analysis results for that selection.
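The first three columns of the subset table can be reproduced from the filter column alone. The following sketch assumes that the minimum sample size simply hides subsets below the threshold and that the ratio is always taken against the full dataset; `subset_summary` is a hypothetical name, not part of the application.

```python
from collections import Counter

def subset_summary(identifiers, min_size=None):
    """Return (subset name, sample size, ratio to whole dataset) rows.

    Subsets smaller than `min_size` are skipped. Rows are sorted by
    sample size, largest first.
    """
    counts = Counter(identifiers)
    total = len(identifiers)
    rows = [(name, n, n / total)
            for name, n in counts.items()
            if min_size is None or n >= min_size]
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Hypothetical filter column with three subsets and eight haplotypes.
filter_column = ["A", "A", "A", "B", "C", "C", "C", "C"]
rows = subset_summary(filter_column, min_size=2)
print(rows)
```

Here subset B has a single haplotype and is excluded, while the ratios of A and C are still computed against all eight haplotypes.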
The Detailed results page is divided into four sections:
The first part
outlines general information on the input files. The fields denoted
as Active are the files that were used to perform
the computations; the same files will be used again in the next round of computations if no
changes are requested through any of the fields in the input submission form.
The Dataset field comes from the Y-STR (*.csv) file and is the name the user gave to the dataset. If the user has specified a filter,
then the filter name will also be attached to this field.
The Sample Size field shows the total number of haplotypes used to carry out the analysis.
The second part outlines Marker information, specifically which DYS numbers were used in the analysis, which were excluded and why.
The next part gives
the central TMRCA estimate, T, for each unique mutation rate set in
units of generations. If the modal and median
repeats of the dataset are not equivalent, then T is reported under two assumptions: that the ancestral haplotype
can be represented by the modal repeats of the dataset, or that it can be represented by the median repeats.
If the median and modal repeats are equivalent, then only one estimate for T is printed.
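The modal-versus-median check can be illustrated as follows. This sketch only shows how the two candidate ancestral haplotypes might be derived and compared; it does not compute T. How the application breaks ties between equally common repeats is not stated, so taking the smallest mode is an assumption, as are the function names.

```python
from statistics import median, multimode

def modal_and_median(haplotypes):
    """Return the per-marker modal and median repeats of a dataset."""
    columns = list(zip(*haplotypes))  # one tuple of repeats per marker
    # Assumption: ties between equally common repeats go to the smallest.
    modal = [min(multimode(col)) for col in columns]
    med = [median(col) for col in columns]
    return modal, med

def ancestral_assumptions(haplotypes):
    """Return one candidate if modal and median agree, otherwise both."""
    modal, med = modal_and_median(haplotypes)
    if modal == med:
        return {"modal=median": modal}
    return {"modal": modal, "median": med}

# Hypothetical datasets: five haplotypes each.
agree = [[14, 24, 10], [14, 23, 10], [15, 24, 11], [14, 24, 11], [16, 25, 11]]
differ = [[12, 24], [12, 24], [13, 24], [14, 25], [15, 23]]
print(ancestral_assumptions(agree))
print(ancestral_assumptions(differ))
```

In the second dataset the first marker has a modal repeat of 12 but a median of 13, so two candidate ancestral haplotypes, and therefore two estimates of T, would be expected.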
Finally, the last
part summarizes TMRCAs in units of years before the present for an assumed length of a
generation and for
all the mutation rate sets used except the Zhivotovsky rates.
Additional Notes: The application does not maintain an active database,
all uploaded files, other than being kept briefly during a session
to accommodate analysis on the most recently uploaded dataset, are completely deleted on a routine basis. It is therefore recommended to
save the results somewhere else once each round of analysis has completed.
Enter the total sequence length in units of 1 million (Mb) or 1 thousand (Kb) base pairs
Enter the Years / Generation
Choose Analysis Type
If you know the Average Branch Length, L, and want to calculate the TMRCA, then
choose 'L to TMRCA' and input the Average Branch Length in SNPs.
If you know the TMRCA and want to calculate the Average Branch Length, L, or compute
the TMRCA using other mutation rate sources, then choose 'TMRCA to L' and input
the TMRCA in KYA and the corresponding rate source used to calculate it.
Submit and review the results
Results are given in a tabular format below the input submission form
All TMRCAs are given in KYA (= × 1000 years before the present)
The lower and upper bound TMRCA results are based on the 95% CIs that were derived in the original source publications for the substitution rates
The source, as well as the central TMRCA estimate columns are sortable
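The two analysis types can be understood through the usual relationship between branch length and time, TMRCA = L / (rate × sequence length). The sketch below assumes this relationship with a rate expressed per base pair per year; the page does not state the exact formula the application uses, and the rate and its bounds here are placeholder numbers chosen for easy arithmetic, not values from any publication.

```python
def l_to_tmrca_kya(branch_length_snps, seq_length_mb, rate_per_bp_per_year):
    """Convert an average branch length (SNPs) to a TMRCA in KYA."""
    seq_length_bp = seq_length_mb * 1_000_000
    years = branch_length_snps / (rate_per_bp_per_year * seq_length_bp)
    return years / 1000.0

def tmrca_kya_to_l(tmrca_kya, seq_length_mb, rate_per_bp_per_year):
    """Convert a TMRCA in KYA back to an average branch length (SNPs)."""
    seq_length_bp = seq_length_mb * 1_000_000
    return tmrca_kya * 1000.0 * rate_per_bp_per_year * seq_length_bp

# Placeholder rate with placeholder 95% CI bounds (illustrative only).
RATE, RATE_LOW, RATE_HIGH = 1.0e-9, 0.8e-9, 1.25e-9

central = l_to_tmrca_kya(50, 10, RATE)
# A faster rate gives a younger TMRCA, so the bounds swap over.
lower = l_to_tmrca_kya(50, 10, RATE_HIGH)
upper = l_to_tmrca_kya(50, 10, RATE_LOW)
print(lower, central, upper)
```

With 50 SNPs over 10 Mb, the placeholder rate gives a central estimate of about 5 KYA with bounds of about 4 and 6.25 KYA. Converting the central estimate back with `tmrca_kya_to_l` returns the original 50 SNPs, which is the 'TMRCA to L' direction.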