Output Details | David Reich Lab

6.2.2 Output Details

This section describes in detail what the program writes to standard output when details = YES and checkit = NO, with a finite number of burn-in and follow-on iterations. The output can be redirected to a file for easier viewing.

Input parameter file name
Values of all the parameters specified in this file
Genetic distance for each chromosome
Total genetic distance
Counts of individuals, cases, controls, and ignored samples used in the analysis, along with the number of real and fake markers

parameter file: param2

output: outscore.dat

### THE INPUT PARAMETERS

PARAMETER NAME: VALUE

risk: 1.5

indivname: indiv1.dat

snpname: snpcnts

genotypename: geno.dat

badsnpname: badsnps

OUTD: outfiles

tlreest: YES

seed: 1011

splittau: YES

fancyxtheta: YES

checkit: NO

details: YES

output: outscore.dat

numburn: 50

numiters: 100

emiter: 30

dotoysim: NO

cleaninit: YES

reestiter: 5

thetafilename: outfiles/theta.out

lambdafilename: outfiles/lambda.out

freqfilename: outfiles/freq.out

indivoutname: outfiles/ind.out

snpoutname: outfiles/snp.out

ethnicfilename: outfiles/ethnic.out

hiclip: 100

## ANCESTRYMAP version: 6210

###GENETIC DISTANCE FOR ALL CHROMOSOMES

##Chr_Num: chromosome num, First_SNP and Last_SNP: First and last markers, Gen_dist: Genetic distance

Chr_Num First_SNP Last_SNP Gen_dist

chrom: 1 first: 0 last: 429 distance: 2.834

chrom: 2 first: 430 last: 831 distance: 2.643

chrom: 3 first: 832 last: 1163 distance: 2.227

chrom: 4 first: 1164 last: 1481 distance: 2.131

chrom: 5 first: 1482 last: 1787 distance: 2.012

chrom: 6 first: 1788 last: 2071 distance: 1.914

chrom: 7 first: 2072 last: 2360 distance: 1.871

chrom: 8 first: 2361 last: 2619 distance: 1.670

chrom: 9 first: 2620 last: 2872 distance: 1.777

chrom: 10 first: 2873 last: 3141 distance: 1.809

chrom: 11 first: 3142 last: 3381 distance: 1.552

chrom: 12 first: 3382 last: 3638 distance: 1.723

chrom: 13 first: 3639 last: 3829 distance: 1.258

chrom: 14 first: 3830 last: 4006 distance: 1.159

chrom: 15 first: 4007 last: 4189 distance: 1.245

chrom: 16 first: 4190 last: 4382 distance: 1.340

chrom: 17 first: 4383 last: 4565 distance: 1.266

chrom: 18 first: 4566 last: 4738 distance: 1.160

chrom: 19 first: 4739 last: 4895 distance: 1.069

chrom: 20 first: 4896 last: 5051 distance: 1.067

chrom: 21 first: 5052 last: 5139 distance: 0.604

chrom: 22 first: 5140 last: 5244 distance: 0.710

chrom: 23 first: 5245 last: 5426 distance: 1.180

total distance: 36.224

calling setstatus

emiter: 30

reestiter: 5

###COUNTS

Num of fake Markers: 3622 Num of real Markers: 1805 Spacing between fake markers: 0.010

Num of Markers: 5427 Num of Samples: 1201

Num of Cases: 600 Num of Controls: 600 Num of Ignored Samples: 1
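
As a quick sanity check on this part of the output, the `chrom:` lines and the counts can be cross-checked against each other. The following is a minimal sketch and not part of ANCESTRYMAP: the function names are hypothetical, and the line format is assumed to match the sample shown above. It also assumes that marker indices are zero-based and contiguous across chromosomes, as they are in this example.

```python
import re

CHROM_RE = re.compile(
    r"chrom:\s*(\d+)\s+first:\s*(\d+)\s+last:\s*(\d+)\s+distance:\s*([\d.]+)"
)

def parse_chrom_lines(text):
    """Return (chrom, first, last, distance) tuples from 'chrom:' lines."""
    return [
        (int(c), int(f), int(l), float(d))
        for c, f, l, d in CHROM_RE.findall(text)
    ]

def is_contiguous(rows):
    """True if each chromosome starts right after the previous one ends."""
    return all(prev[2] + 1 == cur[1] for prev, cur in zip(rows, rows[1:]))

# Last three chromosome lines copied from the sample output.
sample = """\
chrom: 21 first: 5052 last: 5139 distance: 0.604
chrom: 22 first: 5140 last: 5244 distance: 0.710
chrom: 23 first: 5245 last: 5426 distance: 1.180
"""

rows = parse_chrom_lines(sample)
total_markers = rows[-1][2] + 1      # indices are zero-based
fake, real = 3622, 1805              # values from the COUNTS block

print(is_contiguous(rows))           # do the index ranges join up?
print(total_markers == fake + real)  # does last index + 1 equal fake + real?
```

In the sample run both checks hold: the last marker index is 5426, and 3622 + 1805 = 5427 markers. Likewise 600 cases + 600 controls + 1 ignored sample gives the 1201 samples reported.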

Score generated for each iteration by the expectation maximization algorithm. One should observe the score increasing with the number of iterations.
Results of the Markov Chain Monte Carlo iterations, which include estimation of θ and λ. Note that the iteration number runs from 1 - numburn to 0 for the burn-in iterations and from 1 to numiters for the follow-on iterations. The score is zero for the burn-in iterations, since it is calculated only for the follow-on iterations. The format of the output for the estimation of θ and λ is as follows:
estglob theta iter a1 b1 a2 b2 c2
estglob lambda iter p1 lambda1 p2 lambda2 lambdaave
The above parameters are "global parameters" (they affect every individual). See supplementary note 2 of the Patterson et al. 2004 paper for definitions.
lambdaave is the average λ across individuals.

The format of the output which begins with bigiter is as follows:

bigiter iter ylike LOD sc. tau(A) tau(E) thetaave lambdaave xtave(M) xlave(M) xtave(F) xlave(F)
ylike is the slowly mixing statistic of little intrinsic interest described in the above-mentioned supplementary note. xtave(M) and xtave(F) are the average θ on the X chromosome for males and females respectively; xlave(M) and xlave(F) are the average λ on the X chromosome for males and females respectively.

Posterior estimates for the mean and standard deviation of θ, θX, λ, λX and τ(Afr), τ(Eur). The user should look carefully at the values of τ(African) and τ(European), since these indicate how well the ancestral models fit the data. Values below 100 are a cause for concern.
Genome-wide scores for all the models

##SCORES FROM EXPECTATION_MAXIMIZATION ALGORITHM ITERATIONS

## Iteration_Num Score

emsimple iter: 1 0.000

emsimple iter: 2 78381.771

emsimple iter: 3 100906.993

emsimple iter: 4 108153.527

emsimple iter: 29 113979.572

emsimple iter: 30 114004.635
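
The expectation that the EM score increases with the iteration number can be checked mechanically. The sketch below is an illustration only: the helper names are hypothetical, the line format is taken from the sample above, and "increasing" is interpreted as non-decreasing so that ties do not raise a false alarm.

```python
import re

EM_RE = re.compile(r"emsimple iter:\s*(\d+)\s+(-?[\d.]+)")

def em_scores(text):
    """Return (iteration, score) pairs from 'emsimple iter:' lines."""
    return [(int(i), float(s)) for i, s in EM_RE.findall(text)]

def is_non_decreasing(pairs):
    """True if the score never drops from one listed iteration to the next."""
    return all(a[1] <= b[1] for a, b in zip(pairs, pairs[1:]))

# Lines copied from the sample output (iterations 5 to 28 are not shown there).
sample = """\
emsimple iter: 1 0.000
emsimple iter: 2 78381.771
emsimple iter: 3 100906.993
emsimple iter: 4 108153.527
emsimple iter: 29 113979.572
emsimple iter: 30 114004.635
"""

pairs = em_scores(sample)
print(is_non_decreasing(pairs))
```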

muval1: 0.000

neil0: 5.927 -3.414

domcm1 time: 25.150

##RESULTS FOR EACH MARKOV CHAIN MONTE CARLO ITERATION

##estglob theta: Iteration_Num thp1 thp2 thxp0 thxp1 thxp2

##thp1, thp2: Are parameters for the prior distribution of theta, and thxp0,thxp1,thxp2 are the same for theta on X chromosome

##estglob lambda: Iteration_Num lp1 lp2 lxp1 lxp2 ave_lambda

##lp1, lp2: Are parameters for the prior distribution of lambda, and lxp1,lxp2 are the same for lambda on X chromosome

estglob theta -49 1.894 7.204 1.054 9.168 40.026

estglob lambda -49 13.857 2.725 10.599 2.063 5.086

domcm1 time: 25.170

estglob theta -48 2.024 7.845 1.147 9.027 37.570

estglob lambda -48 16.798 3.267 10.370 2.029 5.141

estglob theta 98 2.080 8.646 2.372 12.278 51.136

estglob lambda 98 16.088 2.691 9.548 2.171 5.963

estglob theta 99 2.065 8.454 2.389 11.843 53.460

estglob lambda 99 18.687 3.146 9.201 2.041 5.985

estglob theta 100 2.005 7.963 2.270 12.302 51.769

estglob lambda 100 16.885 2.828 9.888 2.216 5.973
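
To separate burn-in from follow-on iterations when post-processing these lines, the sign of the iteration number is enough. This is a minimal sketch with hypothetical names, assuming the whitespace-separated `estglob` layout shown above; it does not interpret the individual parameter columns.

```python
def parse_estglob(line):
    """Parse one 'estglob theta|lambda iter v1 ... v5' line.

    Returns (kind, iteration, phase, values), or None for any other line.
    """
    fields = line.split()
    if len(fields) < 4 or fields[0] != "estglob":
        return None
    kind = fields[1]                       # "theta" or "lambda"
    iteration = int(fields[2])
    values = [float(v) for v in fields[3:]]
    # Burn-in iterations are numbered 1 - numburn .. 0, follow-on 1 .. numiters.
    phase = "burn-in" if iteration <= 0 else "follow-on"
    return kind, iteration, phase, values

numburn, numiters = 50, 100                # values from the parameter file above
iterations = list(range(1 - numburn, numiters + 1))

first = parse_estglob("estglob theta -49 1.894 7.204 1.054 9.168 40.026")
last = parse_estglob("estglob lambda 100 16.885 2.828 9.888 2.216 5.973")
print(iterations[0], iterations[-1], len(iterations))
print(first[2], last[2])
```

With numburn = 50 and numiters = 100 the iteration labels run from -49 to 100, which matches the first and last `estglob` lines in the sample.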

average thetax: 229.264

###POSTERIOR ESTIMATES

theta mean 0.1986

thetax mean 0.1911

theta var 0.0141 sdev: 0.1189

thetax var 0.0124 sdev: 0.1114

lambda mean 5.9800

lambdax mean 4.5341

lambda var 1.9976 sdev: 1.4134

lambdax var 23.0519 sdev: 4.8012

tau (PopA) 108.270

tau (PopB) 114.625
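
The tau check described earlier (values below 100 are worrisome) is easy to automate. The following sketch uses hypothetical names and assumes the `tau (Label) value` line format shown above; the threshold of 100 comes from the guidance in this section.

```python
import re

TAU_RE = re.compile(r"tau \((\w+)\)\s+([\d.]+)")

def parse_tau(text):
    """Map population label to tau from lines like 'tau (PopA) 108.270'."""
    return {pop: float(val) for pop, val in TAU_RE.findall(text)}

def low_tau(taus, threshold=100.0):
    """Return the labels whose tau falls below the threshold."""
    return [pop for pop, tau in taus.items() if tau < threshold]

sample = "tau (PopA) 108.270\ntau (PopB) 114.625\n"
taus = parse_tau(sample)
print(taus)
print(low_tau(taus))
```

For the sample run both values are above 100, so no population is flagged.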

###GENOME_WIDE SCORE FOR ALL THE MODELS

##risk1 and risk2 are the increased risk due to having one or two population A ancestry alleles, and crisk: risk for controls

risk1 risk2 crisk score

model: 1.500 2.250 1.000 13.633

θ/M and λ values for all individuals
Allele frequency estimates with standard error:
Lag and correlations
For a number of sample statistics we compute a correlation coefficient at small "lags". If the statistic at iteration i is S(i), we compute for 1 <= lag <= 10 (default) the correlation between S(i) and S(i+lag). Large values indicate that the MCMC is not mixing very well.
We report this for:
i) llike: a statistic of no intrinsic interest that mixes poorly.
ii) log10fac: Log_10 Bayes factor (genome wide)
iii) factor: Bayes factor = 10^log10fac
iv) log tauscal: log(τ(0)), where τ(0) is the τ value for population 0.

In our experience, ii) and iii) are the most important statistics and mix well; iv) mixes less well, and i) mixes quite poorly.
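
The lag correlation described above can be sketched as a plain Pearson correlation between the series and a shifted copy of itself. This is an illustration under stated assumptions: the function name is hypothetical, and the exact estimator and normalization used by the program are not documented here, so its numbers may differ slightly from this version.

```python
from math import sqrt

def lag_correlation(series, lag):
    """Pearson correlation between S(i) and S(i + lag)."""
    if lag < 1 or lag >= len(series) - 1:
        raise ValueError("lag must be >= 1 and leave at least two pairs")
    x = series[:-lag]          # S(i)
    y = series[lag:]           # S(i + lag)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Toy series: a perfectly alternating chain is anti-correlated at lag 1
# and positively correlated at lag 2.
toy = [1.0, -1.0] * 10
for lag in (1, 2):
    print(lag, round(lag_correlation(toy, lag), 3))
```

For a well-mixing chain the values at all lags should be close to zero; a slowly decaying sequence of large positive values is the pattern to watch for.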

###THETA or M, LAMBDA VALUES FOR ALL INDIVIDUALS

##Indiv_Index: individual's internal index num, tmean and txmean: average theta and thetax

##tsdev and txsdev: standard deviation for theta and thetax

##lmean and lxmean: average lambda and lambdax

##lsdev and lxsdev: standard deviation for lambda and lambdax

Num Indiv_ID Gender tmean tsdev txmean txsdev lmean lsdev lxmean lxsdev

0 toyindiv:0 M 0.214 0.028 0.206 0.063 5.470 0.750 5.156 1.469

1 toyindiv:1 F 0.112 0.016 0.127 0.042 7.202 0.705 4.195 1.431

2 toyindiv:2 M 0.221 0.027 0.218 0.054 5.712 0.532 5.901 1.877

3 toyindiv:3 F 0.183 0.032 0.201 0.053 3.656 0.539 5.388 1.554

4 toyindiv:4 M 0.289 0.033 0.258 0.056 5.018 0.760 4.308 1.549

###ALLELE FREQUENCY ESTIMATES WITH STANDARD ERROR

##SNP_Index: marker internal index num

##amean and bmean are the average reference allele frequency for population A and B

##asdev and bsdev are the corresponding standard deviation

SNP_Index Chr_Num SNP_ID amean asdev bmean bsdev

0 1 rs819980 0.948 0.006 0.030 0.016

1 1 rs10907185 0.252 0.010 0.680 0.024

4 1 rs897634 0.090 0.006 0.782 0.020

6 1 rs2817159 0.950 0.006 0.074 0.015

9 1 rs1181868 0.910 0.008 0.241 0.022

12 1 rs7548756 0.830 0.009 0.209 0.022

16 1 rs2012852 0.823 0.010 0.297 0.023

###LAG AND CORRELATIONS

llike mean: -37738.643 s.err: 1552.579

lag: 1 corr: 0.356 sig: 3.544

lag: 2 corr: 0.221 sig: 2.188

lag: 3 corr: 0.264 sig: 2.599

lag: 9 corr: -0.071 sig: -0.674

lag: 10 corr: -0.131 sig: -1.241

Scores for each marker
Scores for each chromosome. In the example below, the LGS_MAX and CCS_MAX scores are clearly highest for chromosome 2.
Bestscores: The maximum genome-wide score for the locus-genome statistic, and the maximum and minimum genome-wide scores for the case-control statistic.
Genome-log-factor: log-likelihood of the locus genome statistic averaged over all the markers in the genome. The genome-log-factor is the most important number that is produced by the program and should be the first number that the user looks at.

###SCORES FOR EACH MARKER (fakes used for global score)

##LGS: locus genome statistic score, CCS:case control statistic

###SNP_Index Chr SNP_ID Phys_Pos Gen_Pos LGS CCS G(case) G(control) rpower

0 1 rs819980 1510967 0.032 -3.574 0.647 0.208 0.197 0.800

1 1 rs10907185 1765381 0.035 -3.620 0.615 0.208 0.197 0.787

2 1 fake-1:0 2002021 0.040 -3.593 0.667 0.208 0.197 0.772

3 1 fake-1:1 2516231 0.050 -3.686 0.769 0.208 0.195 0.774

4 1 rs897634 2858849 0.057 -3.862 0.824 0.208 0.194 0.806

5 1 fake-1:2 2959423 0.060 -3.729 0.966 0.208 0.192 0.805

6 1 rs2817159 3172565 0.067 -3.562 1.253 0.209 0.189 0.831

5422 23 rs7054554 150935421 1.185 -3.943 0.232 0.186 0.186 0.720

5423 23 fake-23:3620 151499820 1.190 -3.918 0.204 0.185 0.187 0.701

5424 23 fake-23:3621 152532006 1.200 -3.919 0.152 0.185 0.187 0.683

5425 23 rs10127175 152805019 1.203 -3.929 0.138 0.185 0.187 0.681

5426 23 rs884840 153491735 1.206 -3.943 0.119 0.184 0.188 0.680
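
When post-processing the per-marker table, it is often useful to separate real markers from fake ones. The sketch below is an assumption-laden illustration: the function and key names are hypothetical, the ten-column layout follows the header line above, and fake markers are recognized by the `fake-` prefix seen in the sample output.

```python
def parse_marker_line(line):
    """Parse one row of the per-marker table, or return None for other lines."""
    f = line.split()
    if len(f) != 10 or not f[0].isdigit():
        return None                         # header, comment, or blank line
    return {
        "index": int(f[0]),
        "chrom": int(f[1]),
        "snp_id": f[2],
        "phys_pos": int(f[3]),
        "gen_pos": float(f[4]),
        "lgs": float(f[5]),
        "ccs": float(f[6]),
        "g_case": float(f[7]),
        "g_control": float(f[8]),
        "rpower": float(f[9]),
        "is_fake": f[2].startswith("fake-"),  # naming convention from the sample
    }

lines = [
    "###SNP_Index Chr SNP_ID Phys_Pos Gen_Pos LGS CCS G(case) G(control) rpower",
    "1 1 rs10907185 1765381 0.035 -3.620 0.615 0.208 0.197 0.787",
    "2 1 fake-1:0 2002021 0.040 -3.593 0.667 0.208 0.197 0.772",
]
markers = [m for m in map(parse_marker_line, lines) if m is not None]
real = [m["snp_id"] for m in markers if not m["is_fake"]]
print(real)
```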

###SCORES FOR EACH CHROMOSOME

##LGS_MAX: Maximum locus genome statistic score

##CCS_MAX and CCS_MIN are the maximum and minimum case control statistic scores

##LGS_LOCAL: log likelihood of the locus genome statistic score obtained by averaging over all the markers on that chromosome

Chr_Num LGS_MAX CCS_MAX CCS_MIN LGS_LOCAL

1 -0.35 1.79 -2.23 -2.06

2 16.60 7.31 -1.04 14.77

3 -2.94 1.69 -0.80 -4.21

4 -2.59 1.77 -1.83 -4.02

5 -3.22 1.28 -2.11 -4.56

6 -2.72 0.59 -2.23 -4.03

7 -0.97 2.29 -0.79 -2.29

8 -3.62 0.30 -2.46 -4.76

9 -1.64 1.17 -1.11 -3.05

10 -3.15 1.31 -1.93 -4.45

11 -1.99 2.34 -1.22 -3.18

12 1.32 2.25 -2.16 -0.09

13 -1.29 2.40 -1.27 -2.33

14 -1.08 1.42 -1.41 -2.30

15 -2.61 0.36 -2.07 -3.99

16 -2.72 0.27 -2.34 -3.77

17 -0.10 0.96 -0.96 -1.52

18 -1.12 2.24 -2.90 -2.32

19 -0.81 2.38 -0.98 -2.50

20 -0.20 1.01 -0.42 -1.92

21 -1.36 3.01 -0.19 -2.95

22 -3.39 2.14 -1.13 -4.36

23 -1.51 2.44 -1.67 -2.50

###BESTSCORES: Maximum genome-wide score for the locus-genome statistic (LGS_MAX), and the maximum and minimum genome-wide scores for the case-control statistic (CCS_MAX and CCS_MIN)

bestscores: 16.600 7.310 -2.900
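
The bestscores line can be reproduced from the per-chromosome table by taking the maximum of LGS_MAX, the maximum of CCS_MAX, and the minimum of CCS_MIN. The following is a minimal sketch with hypothetical function names, assuming the five-column layout shown above; only three of the 23 rows are included to keep it short.

```python
def parse_chrom_scores(text):
    """Return (chrom, lgs_max, ccs_max, ccs_min, lgs_local) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5 or not fields[0].isdigit():
            continue                        # skip header, comments, blanks
        chrom = int(fields[0])
        lgs_max, ccs_max, ccs_min, lgs_local = map(float, fields[1:])
        rows.append((chrom, lgs_max, ccs_max, ccs_min, lgs_local))
    return rows

def best_scores(rows):
    """Genome-wide (max LGS_MAX, max CCS_MAX, min CCS_MIN)."""
    return (
        max(r[1] for r in rows),
        max(r[2] for r in rows),
        min(r[3] for r in rows),
    )

# Rows for chromosomes 1, 2, and 18 copied from the sample table.
sample = """\
Chr_Num LGS_MAX CCS_MAX CCS_MIN LGS_LOCAL
1 -0.35 1.79 -2.23 -2.06
2 16.60 7.31 -1.04 14.77
18 -1.12 2.24 -2.90 -2.32
"""

rows = parse_chrom_scores(sample)
top_chrom = max(rows, key=lambda r: r[1])[0]   # chromosome with highest LGS_MAX
print(best_scores(rows), top_chrom)
```

These three rows already contain the extremes of the full table, so the result agrees with the reported bestscores of 16.600, 7.310, and -2.900.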

###GENOME LOG FACTOR: log-likelihood of the locus genome statistic averaged over all the markers in the genome

genome log-factor: 13.633

##end of run