Fine-Mapping Output

8.3 Fine-Mapping Output

 This section describes in detail the output written to standard output when details = NO and checkit = NO, with a finite number of burn-in and follow-on iterations. The output can be redirected to a file for easier viewing. It begins with:

  • Input parameter file name
  • Values of all the parameters specified in this file
  • Total genomic distance
  • Counts of individuals, cases, controls and ignored samples used in the analysis, and the numbers of real and fake markers

parameter file: par:8008

output: /home/at55/ancestrymap-rel/exampletry/outfiles/outlm:8008

### THE INPUT PARAMETERS

PARAMETER NAME: VALUE

risk: 1.5

DIR: /home/at55/ancestrymap-rel/exampletry

TAG: 8008

indivname: /home/at55/ancestrymap-rel/exampletry/indiv1.dat

snpname: /home/at55/ancestrymap-rel/exampletry/snpcnts

genotypename: /home/at55/ancestrymap-rel/exampletry/geno.dat

badsnpname: /home/at55/ancestrymap-rel/exampletry/badsnps

fakespacing: .01

tlreest: YES

OUTD: DIR/outfiles

seed: 8008

splittau: YES

fancyxtheta: YES

output: /home/at55/ancestrymap-rel/exampletry/outfiles/outlm:8008

trashdir: /home/at55/trashdir

checkit: NO

details: YES

numburn: 50

numiters: 100

emiter: 30

dotoysim: NO

cleaninit: YES

reestiter: 5

indoutfilename: NULL

snpoutfilename: /home/at55/ancestrymap-rel/exampletry/outfiles/snps:8008

localoutfilename: /home/at55/ancestrymap-rel/exampletry/outfiles/details:8008

lmmodel: YES

lmchrom: 2

lmnumx: 100

lmmax: 6.0

lmthresh: 0.0

lmdetails: YES

lmlobase: 122544286

lmhibase: 123098515

oldlmmode: NO

markername: rs6750983

pubxname: /home/at55/ancestrymap-rel/exampletry/outfiles/gams:8008

hiclip: 20
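When the output is redirected to a file, the echoed parameter block can be read back for later checks. The following is a minimal sketch, not part of ANCESTRYMAP; the function name `parse_parameter_block` is hypothetical. It assumes every parameter line has the form `name: value` and splits on the first colon only, because values such as `outlm:8008` contain colons themselves.

```python
def parse_parameter_block(lines):
    """Collect 'name: value' lines into a dict, splitting on the first colon only."""
    params = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#'-prefixed section headers.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        # A repeated name (e.g. the second 'output:' line) overwrites the first.
        params[name.strip()] = value.strip()
    return params


# A few lines copied from the parameter block above.
sample = [
    "### THE INPUT PARAMETERS",
    "risk: 1.5",
    "output: /home/at55/ancestrymap-rel/exampletry/outfiles/outlm:8008",
    "numburn: 50",
    "numiters: 100",
    "lmchrom: 2",
]
params = parse_parameter_block(sample)
print(params["output"])
print(int(params["numburn"]) + int(params["numiters"]))  # total MCMC iterations
```

All values are kept as strings; converting them (as with `numburn` above) is left to the caller.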

## ANCESTRYMAP version: 6210

###GENETIC DISTANCE FOR ALL CHROMOSOMES

##Chr_Num: chromosome num, First_SNP and Last_SNP: First and last markers, Gen_dist: Genetic distance

Chr_Num First_SNP  Last_SNP  Gen_dist

chrom:     1  first:     0  last:  429 distance:     2.834

chrom:     2  first:   430  last:  831 distance:     2.643

chrom:     3  first:   832  last:  1163 distance:     2.227

chrom:     4  first:  1164  last:  1481 distance:     2.131

chrom:     5  first:  1482  last:  1787 distance:     2.012

chrom:     6  first:  1788  last:  2071 distance:     1.914

chrom:     7  first:  2072  last:  2360 distance:     1.871

chrom:     8  first:  2361  last:  2619 distance:     1.670

chrom:     9  first:  2620  last:  2872 distance:     1.777

chrom:    10  first:  2873  last:  3141 distance:     1.809

chrom:    11  first:  3142  last:  3381 distance:     1.552

chrom:    12  first:  3382  last:  3638 distance:     1.723

chrom:    13  first:  3639  last:  3829 distance:     1.258

chrom:    14  first:  3830  last:  4006 distance:     1.159

chrom:    15  first:  4007  last:  4189 distance:     1.245

chrom:    16  first:  4190  last:  4382 distance:     1.340

chrom:    17  first:  4383  last:  4565 distance:     1.266

chrom:    18  first:  4566  last:  4738 distance:     1.160

chrom:    19  first:  4739  last:  4895 distance:     1.069

chrom:    20  first:  4896  last:  5051 distance:     1.067

chrom:    21  first:  5052  last:  5139 distance:     0.604

chrom:    22  first:  5140  last:  5244 distance:     0.710

chrom:    23  first:  5245  last:  5426 distance:     1.180

total distance:    36.224
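As a quick sanity check, the per-chromosome distances can be summed and compared with the printed total. The sketch below is illustrative only: the regular expression assumes the exact `chrom: ... distance:` layout shown above, and the helper name `summarize_chromosomes` is hypothetical. Summing all 23 printed rows gives about 36.22, which agrees with the reported total of 36.224 up to the rounding of the individual rows.

```python
import re

# Matches lines such as: "chrom:    21  first:  5052  last:  5139 distance:     0.604"
CHROM_LINE = re.compile(
    r"chrom:\s*(\d+)\s+first:\s*(\d+)\s+last:\s*(\d+)\s+distance:\s*([\d.]+)"
)


def summarize_chromosomes(text):
    """Return (rows, total genetic distance, total marker count)."""
    rows = [(int(c), int(f), int(l), float(d)) for c, f, l, d in CHROM_LINE.findall(text)]
    total_distance = sum(d for _, _, _, d in rows)
    # Marker indices are inclusive, so each chromosome holds last - first + 1 markers.
    total_markers = sum(l - f + 1 for _, f, l, _ in rows)
    return rows, total_distance, total_markers


# Last three rows of the table above.
sample = """
chrom:    21  first:  5052  last:  5139 distance:     0.604
chrom:    22  first:  5140  last:  5244 distance:     0.710
chrom:    23  first:  5245  last:  5426 distance:     1.180
"""
rows, total_distance, total_markers = summarize_chromosomes(sample)
print(len(rows), round(total_distance, 3), total_markers)
```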

calling setstatus

lmchrom: 2

lmchrom: 2

setlm:  lmnumx:  100  lmmax:      6.000

markername: rs6750983

emiter: 30

reestiter: 5

###COUNTS

Num of fake Markers: 3622  Num of real Markers: 1805 Spacing between fake markers:     0.010

Num of Markers: 5427   Num of Samples:  1201

Num of Cases:  600 Num of Controls:  600   Num of Ignored Samples: 1
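The counts above should be internally consistent: fake plus real markers should equal the total number of markers, and cases plus controls plus ignored samples should equal the number of samples. A small sketch of that check follows; the label patterns are copied from the output above, and `parse_counts` is a hypothetical helper name.

```python
import re

COUNT_PATTERNS = {
    "fake": r"Num of fake Markers:\s*(\d+)",
    "real": r"Num of real Markers:\s*(\d+)",
    "markers": r"Num of Markers:\s*(\d+)",
    "samples": r"Num of Samples:\s*(\d+)",
    "cases": r"Num of Cases:\s*(\d+)",
    "controls": r"Num of Controls:\s*(\d+)",
    "ignored": r"Num of Ignored Samples:\s*(\d+)",
}


def parse_counts(text):
    """Extract each labelled integer count from the ###COUNTS lines."""
    counts = {}
    for key, pattern in COUNT_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            counts[key] = int(match.group(1))
    return counts


sample = """
Num of fake Markers: 3622  Num of real Markers: 1805 Spacing between fake markers:     0.010
Num of Markers: 5427   Num of Samples:  1201
Num of Cases:  600 Num of Controls:  600   Num of Ignored Samples: 1
"""
counts = parse_counts(sample)
markers_ok = counts["fake"] + counts["real"] == counts["markers"]
samples_ok = counts["cases"] + counts["controls"] + counts["ignored"] == counts["samples"]
print(markers_ok, samples_ok)
```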

  • Score generated by the expectation maximization algorithm for each iteration. One should observe the score increasing with the number of iterations.
  • Results of the Markov Chain Monte Carlo iterations, which include estimates of θ and λ. The iteration number runs from 1 - numburn to 0 for the burn-in iterations (for example, from -49 to 0 when numburn is 50) and from 1 to numiters for the follow-on iterations. The score is zero for the burn-in iterations, since it is calculated only for the follow-on iterations. The format of the output is as follows:
    estglob theta   iter   a1   b1   a2   b2   c2
    estglob lambda   iter   p1    lambda1   p2   lambda2   lambdaave
    These are "global parameters" (they affect every individual).  See supplementary note 2 of Patterson et al. (2004) for definitions.
    lambdaave is the average λ across individuals.
  • Posterior estimates for the mean and standard deviation of θ, θX, λ, λX, and for τ(Afr), τ(Eur). The user should look at the values of τ(African) and τ(European) carefully, since they indicate how well the ancestral models fit the data. Values below 100 are a cause for concern.
  • Genome-wide scores for all the models
  • Theta and Lambda estimates with standard error for all the samples
  • Allele frequency estimates with standard error for all the markers

##SCORES FROM EXPECTATION_MAXIMIZATION ALGORITHM ITERATIONS

##         Iteration_Num    Score

emsimple iter:      1          0.000

emsimple iter:      2      76769.270

emsimple iter:      3      99251.402

emsimple iter:      4     106551.592

emsimple iter:      5     109118.490

emsimple iter:      6     110158.959

emsimple iter:      7     110664.478

emsimple iter:      8     110958.509

emsimple iter:      9     111157.265

emsimple iter:     10     111307.354

emsimple iter:     11     111429.499

emsimple iter:     12     111533.788

emsimple iter:     29     112392.785

emsimple iter:     30     112417.570

muval1:     0.000

neil0:     5.927    -3.414

domcm1 time:    25.670

neil1:     7.397    -2.749
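The expectation that the EM score increases with the iteration number can be checked mechanically. The sketch below assumes the `emsimple iter:` line layout shown above; the helper names are hypothetical.

```python
import re


def em_scores(text):
    """Return the scores from 'emsimple iter:' lines, in order of appearance."""
    return [float(s) for s in re.findall(r"emsimple iter:\s*\d+\s+([\d.]+)", text)]


def is_nondecreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


# First five iterations from the output above.
sample = """
emsimple iter:      1          0.000
emsimple iter:      2      76769.270
emsimple iter:      3      99251.402
emsimple iter:      4     106551.592
emsimple iter:      5     109118.490
"""
scores = em_scores(sample)
print(is_nondecreasing(scores))
# Shrinking increments suggest the algorithm is approaching convergence.
increments = [round(b - a, 3) for a, b in zip(scores, scores[1:])]
print(increments)
```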

###RESULTS FOR EACH MARKOV CHAIN MONTE CARLO ITERATION

##estglob theta: Iteration_Num  thp1  thp2  thxp0  thxp1  thxp2

##thp1, thp2: Are parameters for the prior distribution of theta, and thxp0,thxp1,thxp2 are the same for theta on X chromosome

##estglob lambda: Iteration_Num  lp1  lp2  lxp1  lxp2  ave_lambda

##lp1, lp2: Are parameters for the prior distribution of lambda, and lxp1,lxp2 are the same for lambda on X chromosome

estglob theta    -49     1.918     7.752       1.120     8.242    35.115

estglob lambda   -49    12.913     2.482      10.698     2.088     5.203

domcm1 time:    25.670

estglob theta    -48     1.977     7.792       1.282     9.432    36.396

estglob lambda   -48    15.138     2.884      11.301     2.209     5.249

estglob theta     99     2.052     8.444       3.307    26.913   207.078

estglob lambda    99    16.854     2.854      13.094     3.171     5.935

estglob theta    100     2.013     7.949       3.176    27.833   206.392

estglob lambda   100    17.012     2.851      13.486     3.261     5.932

average thetax:   223.978
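Because burn-in iterations are numbered from 1 - numburn to 0 and follow-on iterations from 1 to numiters, the sign of the iteration number is enough to separate the two phases when reading the `estglob` lines. The following is a sketch under that assumption; `parse_estglob` and `is_burn_in` are hypothetical names, and the values are kept as an unlabelled list in the order printed.

```python
def parse_estglob(lines):
    """Return {'theta': [(iteration, values)], 'lambda': [...]} from 'estglob' lines."""
    out = {"theta": [], "lambda": []}
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or fields[0] != "estglob" or fields[1] not in out:
            continue
        iteration = int(fields[2])
        values = [float(x) for x in fields[3:]]
        out[fields[1]].append((iteration, values))
    return out


def is_burn_in(iteration):
    # Burn-in iterations run from 1 - numburn to 0; follow-on iterations start at 1.
    return iteration <= 0


sample = [
    "estglob theta    -49     1.918     7.752       1.120     8.242    35.115",
    "estglob lambda   -49    12.913     2.482      10.698     2.088     5.203",
    "estglob theta    100     2.013     7.949       3.176    27.833   206.392",
    "estglob lambda   100    17.012     2.851      13.486     3.261     5.932",
]
parsed = parse_estglob(sample)
follow_on_lambda = [(i, v) for i, v in parsed["lambda"] if not is_burn_in(i)]
# The last value on each lambda line is the average lambda across individuals.
print(follow_on_lambda[0][0], follow_on_lambda[0][1][-1])
```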

###POSTERIOR ESTIMATES

theta  mean      0.1988

thetax mean      0.1866

theta  var      0.0141   sdev:     0.1187

thetax var      0.0110   sdev:     0.1049

lambda  mean      5.9529

lambdax mean      4.4653

lambda  var      2.0206   sdev:     1.4215

lambdax var     21.2929   sdev:     4.6144

tau (PopA)   109.716

tau (PopB)   117.473
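Two simple checks can be applied to the posterior estimates: each reported standard deviation should be the square root of the corresponding variance (up to rounding), and both tau values should be at least 100. The sketch below hard-codes the numbers printed above; the tolerance of 5e-4 is an assumption chosen to allow for the four-decimal rounding of the output.

```python
import math

# (variance, standard deviation) pairs as printed in the POSTERIOR ESTIMATES block.
posterior = {
    "theta": (0.0141, 0.1187),
    "thetax": (0.0110, 0.1049),
    "lambda": (2.0206, 1.4215),
    "lambdax": (21.2929, 4.6144),
}
tau = {"PopA": 109.716, "PopB": 117.473}

TAU_THRESHOLD = 100.0  # values below this are described above as a cause for concern

# sdev should equal sqrt(var) up to rounding of the printed values.
sdev_consistent = {
    name: abs(math.sqrt(var) - sdev) < 5e-4 for name, (var, sdev) in posterior.items()
}
low_tau = [pop for pop, value in tau.items() if value < TAU_THRESHOLD]

print(sdev_consistent)
print(low_tau)
```

In this run both tau values exceed 100, so nothing is flagged.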

###GENOME_WIDE SCORE FOR ALL THE MODELS

##risk1 and risk2 are the increased risk due to having one or two population A ancestry alleles, and  crisk: risk for controls

           risk1     risk2     crisk       score

model:     1.500     2.250     1.000      13.591

###THETA or M, LAMBDA VALUES FOR ALL INDIVIDUALS

##Indiv_Index: individual's internal index num, tmean and txmean: average theta and thetax

##tsdev and txsdev: standard deviation for theta and thetax

##lmean and lxmean: average lambda and lambdax

##lsdev and lxsdev: standard deviation for lambda and lambdax

 Num        Indiv_ID Gender    tmean     tsdev     txmean    txsdev        lmean     lsdev     lxmean    lxsdev

   0                toyindiv:0     M     0.217     0.027      0.210     0.046        5.845     0.496      4.715     1.128

   1                toyindiv:1     F     0.107     0.016      0.115     0.037        7.142     0.962      4.489     1.140

   2                toyindiv:2     M     0.223     0.023      0.204     0.043        5.332     0.548      5.040     1.221

   3                toyindiv:3     F     0.185     0.029      0.187     0.043        3.699     0.565      5.134     1.072

1198             toyindiv:1198     M     0.138     0.018      0.135     0.041        7.699     1.072      4.201     1.094

1199             toyindiv:1199     F     0.159     0.022      0.156     0.043        8.169     1.007      4.301     1.099

###ALLELE FREQUENCY ESTIMATES WITH STANDARD ERROR

##SNP_Index: marker internal index num

##amean and bmean are the average reference allele frequency for population A and B

##asdev and bsdev are the corresponding standard deviation

SNP_Index Chr_Num          SNP_ID   amean   asdev     bmean     bsdev

    0   1             rs819980     0.948     0.006        0.023     0.013

    1   1           rs10907185     0.249     0.011        0.683     0.026

    4   1             rs897634     0.090     0.007        0.785     0.023

    6   1            rs2817159     0.951     0.006        0.071     0.016

5425  23           rs10127175     0.104     0.008        0.049     0.013

 5426  23             rs884840     0.984     0.004        0.305     0.029

Here Mu is the genotype risk and lambda is the allelic risk. For a single copy of a chromosome with local ancestry a and b variant alleles, the risk is taken to be exp(a lambda) exp(b mu). In the table below, given Mu, lambda is chosen so that the ancestry risk when the allele is unknown equals the value specified by the risk parameter of the (coarse scan) model; here that risk is 1.5 (see the Overview section). The Log_Score column is a LOD score for the fine-mapping model against the model in which genotype has no effect on risk. It is clipped from below, so it does not fall under a fixed floor (-8.000 in the table below). A positive Log_Score is a hint of a causal allele. As a check on understanding, the reader should note that if Mu = 1 then the score must be 0, as is the case in row 15 of the table below.

lmbayes is a Bayes factor averaging over all fine mapping markers in the run. This really needs adjusting by a prior for whether there is a causal marker in the region.

### Iteration_Num         Mu  Log_Score Caltd_Lambda

lmdetails   0     0.333    -8.000       0.811

lmdetails   1     0.359    -8.000       0.839

lmdetails   2     0.386    -8.000       0.868

lmdetails   3     0.415    -8.000       0.900

lmdetails   4     0.447    -8.000       0.934

lmdetails   5     0.481    -8.000       0.970

lmdetails   6     0.517    -8.000       1.009

lmdetails   7     0.557    -8.000       1.050

lmdetails   8     0.599    -7.924       1.094

lmdetails   9     0.644    -6.300       1.141

lmdetails  10     0.693    -4.327       1.192

lmdetails  11     0.746    -2.758       1.246

lmdetails  12     0.803    -1.576       1.304

lmdetails  13     0.864    -0.746       1.365

lmdetails  14     0.929    -0.229       1.430

lmdetails  15     1.000     0.000       1.500

lmdetails  16     1.076    -0.057       1.574

lmdetails  17     1.158    -0.412       1.653

lmdetails  18     1.246    -1.075       1.737

lmdetails  19     1.340    -2.058       1.826

lmdetails  20     1.442    -3.369       1.920

lmdetails  21     1.552    -5.009       2.020

lmdetails  22     1.670    -6.941       2.126

lmdetails  23     1.797    -7.978       2.238

lmdetails  24     1.933    -8.000       2.356

lmdetails  25     2.080    -8.000       2.481

lmdetails  26     2.238    -8.000       2.612

lmdetails  27     2.408    -8.000       2.750

lmdetails  28     2.591    -8.000       2.895

lmdetails  29     2.788    -8.000       3.048

lmdetails  30     3.000    -8.000       3.207
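The `lmdetails` rows can be scanned for the Mu value with the highest Log_Score. The sketch below also tests an observation about this particular table: the printed Mu values are consistent with a geometric grid from 1/3 to 3 centred on row 15. That grid formula is inferred from the printed numbers only, not taken from the program, and the names `parse_lmdetails` and `mu_on_grid` are hypothetical.

```python
def parse_lmdetails(lines):
    """Return (index, mu, log_score, lambda) tuples from 'lmdetails' lines."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) == 5 and fields[0] == "lmdetails":
            rows.append((int(fields[1]), float(fields[2]), float(fields[3]), float(fields[4])))
    return rows


def mu_on_grid(index, centre=15, max_mu=3.0):
    # Assumption inferred from the table: Mu = max_mu ** ((index - centre) / centre).
    return max_mu ** ((index - centre) / centre)


# A subset of the rows printed above.
sample = [
    "lmdetails  10     0.693    -4.327       1.192",
    "lmdetails  14     0.929    -0.229       1.430",
    "lmdetails  15     1.000     0.000       1.500",
    "lmdetails  16     1.076    -0.057       1.574",
    "lmdetails  20     1.442    -3.369       1.920",
    "lmdetails  30     3.000    -8.000       3.207",
]
rows = parse_lmdetails(sample)
best = max(rows, key=lambda r: r[2])  # row with the highest Log_Score
grid_matches = all(abs(mu - mu_on_grid(i)) < 1e-3 for i, mu, _, _ in rows)
print(best, grid_matches)
```

Here the best row is Mu = 1 with a score of 0 and lambda equal to the risk parameter 1.5, so no Mu in the grid improves on the model without a genotype effect.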

###lmscore: Fine-mapping score in addition to the Admix_Score

  ##SNP_ID    LMScore    Chr_Num   Phys_Pos Admix_Score

lmscore:      rs11890727    -0.992  2       114383724    14.382

##lmscbest : Best lmscore in the run

lmscbest:    -0.992

##lmbayes: Bayes factor, averaging over all fine mapping markers in the run

lmbayes:     -0.992

  • Lag and correlations
    For a number of sample statistics we compute a correlation coefficient at small "lags". If the statistic at iteration i is S(i), we compute, for 1 <= lag <= 10 (default), the correlation between S(i) and S(i+lag). Large values indicate that the MCMC is not mixing well.
    We publish this for:
    i)   llike: a statistic of no intrinsic interest, but one that mixes poorly.
    ii)  log10fac: Log_10 Bayes factor (genome wide)
    iii) factor: Bayes factor = 10^log10fac
    iv)  log tauscal: log(τ(0)), the τ value for population 0.

    In our experience ii) and iii) are the most important statistics and mix well, iv) mixes less well, and i) mixes quite poorly.

  • Scores for each chromosome

As the example below shows, the LGS_MAX and CCS_MAX scores are highest for chromosome 2.

  • Bestscores: The maximum genome-wide score for the locus-genome statistic, and the maximum and minimum genome-wide scores for the case-control statistic.
  • Genome-log-factor: log-likelihood of the locus genome statistic averaged over all the markers in the genome.

The genome-log factor is the most important number that is produced by the program and should be the first number that the user looks at.

###LAG AND CORRELATIONS

               llike mean: -32402.892 s.err:  1965.995

lag:  1 corr:     0.629 sig:     6.258

lag:  2 corr:     0.528 sig:     5.224

lag:  3 corr:     0.462 sig:     4.550

lag:  4 corr:     0.440 sig:     4.312

lag:  5 corr:     0.365 sig:     3.560

lag:  6 corr:     0.332 sig:     3.221

lag:  9 corr:     0.068 sig:     0.647

lag: 10 corr:     0.230 sig:     2.178
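To make the lag correlation concrete, the sketch below computes an ordinary Pearson correlation between S(i) and S(i+lag) for a series of per-iteration values. This is an illustration of the idea only; the exact estimator and the definition of the `sig` column used by the program are not given here, so they are not reproduced. The function name `lag_correlation` is hypothetical.

```python
import math


def lag_correlation(series, lag):
    """Pearson correlation between series[i] and series[i + lag]."""
    if lag < 1 or len(series) - lag < 2:
        raise ValueError("need lag >= 1 and at least two overlapping pairs")
    x = series[:-lag]
    y = series[lag:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# A steadily drifting series is perfectly correlated with its lagged copy,
# which is the signature of a chain that is not mixing.
trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
# An alternating series is perfectly anti-correlated at lag 1.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(lag_correlation(trend, 1), lag_correlation(alternating, 1))
```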

###SCORES FOR EACH CHROMOSOME

##LGS_MAX: Maximum locus genome statistic score

##CCS_MAX and CCS_MIN are the maximum and minimum case control statistic scores

##LGS_LOCAL: log likelihood of the locus genome statistic score obtained by averaging over all the markers on that chromosome

   Chr_Num  LGS_MAX   CCS_MAX  CCS_MIN  LGS_LOCAL

    1  -0.49    1.79  -2.22   -2.20

    2  16.55    7.31  -1.05   14.73

    3  -2.94    1.68  -0.79   -4.27

    4  -2.89    1.78  -1.81   -4.25

    5  -3.31    1.27  -2.11   -4.59

    6  -2.86    0.59  -2.22   -4.15

    7  -1.01    2.29  -0.78   -2.37

    8  -3.82    0.30  -2.48   -4.90

    9  -1.65    1.17  -1.12   -3.04

   10  -3.24    1.31  -1.93   -4.56

   11  -2.15    2.34  -1.20   -3.24

   12   1.43    2.23  -2.17    0.02

   13  -1.31    2.40  -1.28   -2.30

   14  -1.16    1.42  -1.41   -2.41

   15  -2.78    0.37  -2.07   -4.13

   16  -2.65    0.31  -2.35   -3.67

   17  -0.28    0.96  -0.98   -1.69

   18  -1.40    2.23  -2.90   -2.61

   19  -0.83    2.38  -0.96   -2.51

   20  -0.34    1.00  -0.42   -2.00

   21  -1.65    3.00  -0.20   -3.20

   22  -3.56    2.16  -1.13   -4.41

   23  -1.01    2.46  -1.64   -1.96

###BESTSCORES: Maximum genome-wide score for the locus-genome statistic (LGS_MAX), and the maximum and minimum genome-wide scores for the case-control statistic (CCS_MAX and CCS_MIN)

bestscores:     16.554     7.308    -2.897
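The bestscores line can be cross-checked against the per-chromosome table: it should equal the largest LGS_MAX, the largest CCS_MAX and the smallest CCS_MIN over all chromosomes, up to the two-decimal rounding of the table. The sketch below copies the table rows from the output above; `parse_chromosome_scores` is a hypothetical helper name.

```python
CHROMOSOME_TABLE = """
    1  -0.49    1.79  -2.22   -2.20
    2  16.55    7.31  -1.05   14.73
    3  -2.94    1.68  -0.79   -4.27
    4  -2.89    1.78  -1.81   -4.25
    5  -3.31    1.27  -2.11   -4.59
    6  -2.86    0.59  -2.22   -4.15
    7  -1.01    2.29  -0.78   -2.37
    8  -3.82    0.30  -2.48   -4.90
    9  -1.65    1.17  -1.12   -3.04
   10  -3.24    1.31  -1.93   -4.56
   11  -2.15    2.34  -1.20   -3.24
   12   1.43    2.23  -2.17    0.02
   13  -1.31    2.40  -1.28   -2.30
   14  -1.16    1.42  -1.41   -2.41
   15  -2.78    0.37  -2.07   -4.13
   16  -2.65    0.31  -2.35   -3.67
   17  -0.28    0.96  -0.98   -1.69
   18  -1.40    2.23  -2.90   -2.61
   19  -0.83    2.38  -0.96   -2.51
   20  -0.34    1.00  -0.42   -2.00
   21  -1.65    3.00  -0.20   -3.20
   22  -3.56    2.16  -1.13   -4.41
   23  -1.01    2.46  -1.64   -1.96
"""


def parse_chromosome_scores(text):
    """Return (chrom, LGS_MAX, CCS_MAX, CCS_MIN, LGS_LOCAL) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have five fields and start with the chromosome number.
        if len(fields) == 5 and fields[0].isdigit():
            rows.append((int(fields[0]),) + tuple(float(x) for x in fields[1:]))
    return rows


rows = parse_chromosome_scores(CHROMOSOME_TABLE)
top_chrom = max(rows, key=lambda r: r[1])[0]  # chromosome with the largest LGS_MAX
best = (max(r[1] for r in rows), max(r[2] for r in rows), min(r[3] for r in rows))
reported = (16.554, 7.308, -2.897)  # values from the bestscores line
agrees = all(abs(b - r) < 0.006 for b, r in zip(best, reported))
print(top_chrom, best, agrees)
```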

###GENOME LOG FACTOR: log-likelihood of the locus genome statistic averaged over all the markers in the genome

genome log-factor:    13.591

##end of run