3. How to run the program
This section describes how to run the program from the command line and the input parameter file that it requires.
3.1 Command line arguments
To run the program, type one of the following on the command line:
>> ancestrymap -pv paramfile
>> ./ancestrymap -pv paramfile
p: a compulsory option; it must be followed by the name of the parameter file (here, paramfile).
v: reports the version number of the program in use. The user can change this number in the file ancestrymap.c.
To redirect the output to a file and run the program in the background, type:
>> ./ancestrymap -pv paramfile > out.dat &
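When the program is launched from a script rather than typed by hand, the same invocation can be wrapped in a few lines of Python. This is only a sketch: the helper names build_command and run_ancestrymap are hypothetical, and the binary is assumed to sit in the current directory as ./ancestrymap.

```python
import subprocess


def build_command(paramfile, binary="./ancestrymap"):
    """Return the argument list for the command line shown above.

    binary is an assumed location for the executable; adjust as needed.
    """
    return [binary, "-pv", paramfile]


def run_ancestrymap(paramfile, outfile="out.dat", binary="./ancestrymap"):
    """Start the program with its standard output written to outfile.

    Popen returns immediately, much like the trailing "&" in the shell;
    call .wait() on the returned object to block until the run finishes.
    """
    with open(outfile, "w") as out:
        return subprocess.Popen(build_command(paramfile, binary), stdout=out)
```

Calling run_ancestrymap("paramfile").wait() would then behave like the redirected command above, except that the script waits for the run to finish.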
3.2 Description of the parameter file
Each line of this file has the following format:
parname: parvalue
For example:
>>seed: 200
Note: All parameter names must be in lowercase, and there must be no white space between parname and the colon. The compulsory parameters are the names of the files that contain the marker, individual and genotype data, and the risk model. Parameters of array type take space-separated values. A sample parameter file is included as part of the download, and a detailed description of the parameters is as follows:
Parameter Name | Data type | Description | Possible and Default values
INPUT FILE NAMES | | |
indivname (MANDATORY) | String | Individual data |
badsnpname | String | List of markers to delete from analysis |
genotypename (MANDATORY) | String | Genotype data for all the samples |
snpname (MANDATORY) | String | Marker data |
ANCESTRYMAP PARAMETERS | | |
risk (MANDATORY) | Double array | Risks for the various models | Default: 2.0
numiters | Int | Number of follow-on iterations | Integer >= 0. Default: 5
numburn | Int | Number of burn-in iterations | Integer >= 0. Default: 1
reestiter | Int | Controls the number of iterations inside ancestrymap for allele frequency sampling | Integer >= 1. Default: 1
details | Boolean | If YES, generate additional output | NO, YES. Default: NO
tlreest | Int | Always set to YES; there is no need to specify it | 0, 1
noxdata | Boolean | Set to YES if there is no X chromosome data or if it should be ignored | NO, YES. Default: NO
fakespacing | Double | Spacing between fake markers, in Morgans | Positive. Default: 0.01 (in Morgans)
seed | Int | Random number seed for the run | Positive integer
checkit | Boolean | If YES, runs many checks (mostly done initially) | NO, YES. Default: NO
thxpars | Double array of size 3 | Sets the initial parameters for the prior distribution for θX | Default: 40.0 1.0 10.0
thpars | Double array of size 2 | Sets the initial parameters for the prior distribution for θ | Default: 1.0 5.0
lampars | Double array of size 2 | Sets the initial parameters for the prior distribution for λ | Default: 1.0 0.1
lamxpars | Double array of size 2 | Sets the initial parameters for the prior distribution for λX | Default: 1.0 0.1
dotoysim | Boolean | If YES, run simulations | NO, YES. Default: NO
markersim | Int | Marker number of the disease allele; -1 means none | -1 or positive integer. Default: -1
simnumindivs | Int | Generate toy data with simnumindivs individuals; half are cases and half are controls, and half are female and half are male | Positive integer. Default: -1
risksim | Double | In simulation mode, the risk used to generate data | Default: 1.0
tauscal | Double array | Initial values of τ(African) and τ(European) | Default: 100 100 (Note: this is lower than the value we expect; however, we prefer to bias the initial value to be low)
wrisk | Double array | Allows the models to have weights, which are normalized to sum to 1 | Default: 1.0
lrisk | Double | In checkit mode, each marker is left out in turn and this is the risk that is used (in checkit mode only one model risk is used) | Default: -1.0
controlrisk | Double array | Control risks for the various models | Default: 1.0
risk2 | Double array | Risk for ethnic homozygotes for the various models. controlrisk and risk2 are optional, but if they are specified they must have the same number of values as risk | Default: -1.0
taulsdev | Double | Prior standard deviation for the African and European τ values | Default: 0.5
taulmean | Double | Prior mean for log10(τ) for both African and European | Default: 2.0
allmale | Boolean | Used in simulation mode. If YES, all the simulated individuals are male. The parameter simnumindivs must be specified for this parameter to take effect | 0, 1. Default: NO
allcases | Boolean | If YES, all the samples are cases | NO, YES. Default: NO
usecontrols | Boolean | If NO, controls are ignored | NO, YES. Default: YES
pubfmodern | Boolean | Publish ancestral allele frequency estimates; if YES, allows publication of modern allele frequencies | NO, YES. Default: NO
OUTPUT FILE NAMES (Note: the directory in which the output files are to be generated must already exist; otherwise the program will fail) | | |
trashdir | String | Used only in checkit mode: directory to store HMM output |
thetafilename | String | Ancestry information for all individuals |
output | String | Parameter values at every iteration |
pubxname | String | Debug file for a particular marker |
ethnicfilename | String | Average ethnicity (γ) for each marker, averaged over all individuals and iterations |
snpoutfilename | String | Detailed marker information |
indoutfilename | String | Detailed individual information |
freqfilename | String | Allele frequency information for all markers |
lambdafilename | String | λ information for all individuals |
genotoyoutfilename | String | Genotype data generated in simulation mode |
indtoyoutfilename | String | Individual data generated in simulation mode |
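The format rules of section 3.2 (lowercase names, no white space before the colon, space-separated array values, four mandatory parameters) can be checked before a run. The following Python sketch is not part of the ANCESTRYMAP distribution: the function name parse_paramfile and the file names in the sample are hypothetical, and skipping blank lines is an assumption about what the program tolerates.

```python
# Mandatory parameters as listed in the table above.
MANDATORY = ("indivname", "genotypename", "snpname", "risk")


def parse_paramfile(text):
    """Parse 'parname: parvalue' lines into a dict and collect format errors.

    Every value is stored as a list of strings, so array parameters
    keep their space-separated entries.
    """
    params, errors = {}, []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are harmless
        name, sep, value = line.partition(":")
        if not sep:
            errors.append(f"line {lineno}: expected 'parname: parvalue'")
            continue
        if name != name.rstrip():
            errors.append(f"line {lineno}: white space before the colon")
        name = name.strip()
        if name != name.lower():
            errors.append(f"line {lineno}: '{name}' is not lowercase")
        params[name] = value.split()
    for name in MANDATORY:
        if name not in params:
            errors.append(f"missing mandatory parameter '{name}'")
    return params, errors


# Hypothetical parameter file; the file names are placeholders.
sample = """indivname: example.ind
genotypename: example.geno
snpname: example.snp
risk: 0.8 1.2 1.3
seed: 200
"""
params, errors = parse_paramfile(sample)
```

For the sample above the error list is empty and params["risk"] holds the three risk values as strings; a line such as "Seed : 200" would be reported twice, once for the upper-case letter and once for the space before the colon.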
The software makes it possible to test several disease models simultaneously. If there is an epidemiological reason to believe that the genetic risk for a disease is higher in population A, one might want to test several models of increased risk due to population A ancestry and, at the same time, one model in which population B ancestry confers greater risk. This is done by giving the parameter risk as an array with values both greater and less than 1, for example:
>>risk: 0.8 1.2 1.3 1.4 1.5 1.6
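To make the structure of such an array explicit, the Python sketch below splits the example values into those above and below 1 and checks the rule from the parameter table that controlrisk and risk2, when given, must have as many values as risk. The helper names are hypothetical, and the sketch makes no assumption about which population corresponds to which direction.

```python
def parse_array(value):
    """Convert a space-separated array value into a list of floats."""
    return [float(v) for v in value.split()]


def split_risk_models(risk):
    """Separate the model risks above 1 from those below 1."""
    above = [r for r in risk if r > 1.0]
    below = [r for r in risk if r < 1.0]
    return above, below


def check_lengths(risk, controlrisk=None, risk2=None):
    """Report optional arrays whose length differs from that of risk."""
    problems = []
    for name, arr in (("controlrisk", controlrisk), ("risk2", risk2)):
        if arr is not None and len(arr) != len(risk):
            problems.append(
                f"{name} has {len(arr)} values but risk has {len(risk)}"
            )
    return problems


risk = parse_array("0.8 1.2 1.3 1.4 1.5 1.6")
above, below = split_risk_models(risk)
```

Here the example line defines six models in one run: five with risk greater than 1 and one with risk less than 1.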