BADGER

Bayesian Analysis to Describe Genomic Evolution by Rearrangement

Version 1.02 beta, June 11, 2004

Copyright © 2004 by Bret Larget & Don Simon


Control File


The run control file contains all of the run characteristics, along with the tree and parameter initialization information. Below is a run control file that lists every possible setting. Because it sets every control at once, BADGER will issue warnings about run controls that are inappropriate if the file is used as is.


################# BADGER All Controls Run Control File ##################
# Last updated June 29, 2004   Change values as necessary.
# Any text on a line after the symbol `#' is a comment. 
#########################################################################
#
# General controls
#
seed = 194024933                      # Seed for random number generator.
pre-burn = 0                          # Number of pre-burn cycles.
cycles = 10000                        # Number of cycles.
window-interval = 1000                # Interval to print info to screen.
#				      
# Input controls		      
#				      
data-file = infile                    # Name of file with the data.
initial-tree-type = random            # How to create initial tree.
tree-file = treefile                  # File containing initial tree
use-grouping = false                  # Group taxa?
group-list = 1*                       # How to group the taxa.
#				      
# Output controls		      
#				      
sample-interval = 100                 # Number of cycles between tree saves.
file-root = run1                      # Root name of all output files.
outgroup = 1                          # Outgroup for printing trees.
print-all-trees = true                # Print all trees?
print-permutation = false             # Print ancestor's permutation?
#				      
# Parameter controls		      
#				      
estimate-hyper-parameters = true      # Estimate hyper-parameters?
mu = 2.0                              # Expected # of inversions per edge.
psi = 5.0                             # Variance of # of invs. per edge / mean.
#				      
# Algorithm controls		      
#                                     
use-updates = 1,1x,2,2x,3,4,5,5x,6,6x # List of update methods to use.
prob-stop = 0.99                      # Prob. random reversal sequence stops.
prob-good = 0.95                      # Prob. of proposing a good move.
prob-neutral = 0.90                   # Prob. of proposing a neutral move
use-mcmcmc = false                    # Whether or not to use MCMCMC
chains = 4                            # Number of chains to use in MCMCMC.
swap-frequency = 1                    # Frequency for swapping chains 
temperature = 0.05                    # Temperature parameter for MCMCMC.
#				      
# Debugging controls                                  
#				      
check-trees = false                   # Perform tree consistency check?
debug = false                         # Print debugging statements?
debug-cycle = 0                       # When to start printing debugging info.

In addition to this example, the distribution contains several other sample run control files.

Options may be entered into the run control file in any order. Each option is written as its name, the symbol `=', and then its value. White space may separate the run control name, the equal sign, and the value. Any text after the symbol `#' on a line is a comment and will be ignored. All options have default values, which are given in the following table.
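The syntax rules above can be sketched in a few lines of Python. This is an illustration only, not BADGER's own parser: the function name parse_control_line is hypothetical, and raising an error on a line without `=' is an assumption about how malformed lines might be handled.

```python
def parse_control_line(line):
    """Return (name, value) for a run control line, or None for blank/comment lines."""
    # Everything from the first '#' onward is a comment.
    content = line.split("#", 1)[0].strip()
    if not content:
        return None
    if "=" not in content:
        # Assumption: a non-empty line without '=' is treated as an error.
        raise ValueError(f"expected 'name = value', got: {content!r}")
    name, value = content.split("=", 1)
    # White space around the name, the equal sign, and the value is ignored.
    return name.strip(), value.strip()


sample = """\
# General controls
seed = 194024933                      # Seed for random number generator.
cycles=10000
   file-root   =   run1
"""
settings = [pair for pair in map(parse_control_line, sample.splitlines()) if pair]
print(settings)
# [('seed', '194024933'), ('cycles', '10000'), ('file-root', 'run1')]
```

Values are kept as strings here; converting them to integers, numbers, or true/false according to the table below is left out of the sketch.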

Run control file options. Each entry gives the option name, its choices and default, and a description.

seed
    Choices: Any integer greater than 1 and less than 2147483647. Default is 194024933.
    Description: Seed for the random number generator.

pre-burn
    Choices: Any nonnegative integer. Default is 0.
    Description: Number of cycles to run before recording trees. The Hastings ratio is ignored during pre-burn.

cycles
    Choices: Any nonnegative integer. Default is 10000.
    Description: Number of cycles to update the tree(s).

window-interval
    Choices: Any nonnegative integer. Default is 1000.
    Description: Interval at which tree topology and parameter values are printed to screen.

data-file
    Choices: Any valid pathname. Default is infile.
    Description: Pathname to the data file. The data file consists of one line per taxon. Each line is the name of the taxon followed by a comma, and then a comma-separated list of the genes in order. Gene names are any strings.

initial-tree-type
    Choices: random, good, or file. Default is random.
    Description:
        random selects a tree from the prior;
        good performs an algorithm similar to neighbor-joining;
        file reads in an initial tree topology from the file specified in tree-file.

tree-file
    Choices: Any valid pathname. Default is treefile.
    Description: Pathname for the file which contains the initial tree. This is only used if initial-tree-type is file.

use-grouping
    Choices: true or false. Default is false.
    Description: Whether or not to group taxa.

group-list
    Choices: A valid group list (a comma-separated list of groups with optional repeat counts and optionally one repetitive asterisk). Default is 1*.
    Description: See the section on group lists for more details. This is only used when use-grouping is true.

sample-interval
    Choices: Any positive integer. Default is 100.
    Description: After the pre-burn, the tree topology, log likelihoods, and parameters are written to files at each cycle divisible by this value.

file-root
    Choices: Any valid file name. Default is run1.
    Description: Prefix for the names of all the output files.

outgroup
    Choices: Positive integer giving the position of the outgroup taxon in the data file. Default is 1.
    Description: Trees and tree topologies are printed with the outgroup emerging directly from the root.

print-all-trees
    Choices: true or false. Default is true.
    Description: Whether or not to print all sampled trees (not just tree topologies).

print-permutation
    Choices: true or false. Default is false.
    Description: Whether or not to print out the permutation of an internal node of the tree. This makes most sense when there is only one internal node, i.e., the data consists of three taxa.

estimate-hyper-parameters
    Choices: true or false. Default is true.
    Description: Whether or not to estimate the hyper-parameters mu and psi. To estimate the hyper-parameters, a neighbor-joining tree is created. Mu is set to the mean branch length in terms of the number of inversions, and psi is set to the variance divided by mu, with a minimum value of 1.1.

mu
    Choices: Any positive number. Default is 2.0.
    Description: The expected number of inversions per edge of the tree. This is only used when estimate-hyper-parameters is set to false.

psi
    Choices: Any number greater than 1. Default is 5.0.
    Description: The variance of the number of inversions per edge divided by the mean. This is only used when estimate-hyper-parameters is set to false.

use-updates
    Choices: A comma-separated list of update algorithms. Default is 1,1x,2,2x,3,4,5,5x,6,6x.
    Description: The list of tree proposal algorithms, or updates, to be used in each cycle.

prob-stop
    Choices: Any number between 0 and 1, exclusive. Default is 0.99.
    Description: When choosing a new reversal list, the probability of stopping when the beginning and ending permutations are identical. See the section on finding new reversal lists for more details.

prob-good
    Choices: Any number between 0 and 1, exclusive. Default is 0.95.
    Description: When choosing a new reversal list, the probability the next reversal is a good one. See the section on finding new reversal lists for more details.

prob-neutral
    Choices: Any number between 0 and 1, exclusive. Default is 0.95.
    Description: When choosing a new reversal list, the probability the next reversal is a neutral one. See the section on finding new reversal lists for more details.

use-mcmcmc
    Choices: true or false. Default is false.
    Description: Whether or not to use Metropolis-coupled Markov chain Monte Carlo.

chains
    Choices: Any integer between 2 and 20, inclusive. Default is 4.
    Description: The number of chains to use in MCMCMC.

swap-frequency
    Choices: Any number greater than or equal to 1. Default is 1.
    Description: The frequency with which to swap trees between two chains in MCMCMC.

temperature
    Choices: Any positive number. Default is 0.005.
    Description: Parameter used to determine the temperature of the chains in MCMCMC. The actual temperature used is 1/(1+i*temperature).

check-trees
    Choices: true or false. Default is false.
    Description: Whether or not to perform tree consistency checks. This is useful for debugging new code. Checks are done after each update algorithm and are done again if the proposed tree is rejected.

debug
    Choices: true or false. Default is false.
    Description: Whether or not to turn on debugging code at cycle debug-cycle. Note that the distribution of BADGER should not contain any debugging code that can be turned on, so all this does is set the global variable debug to true at the proper cycle.

debug-cycle
    Choices: Any nonnegative integer. Default is 0.
    Description: The cycle at which to turn on debugging code (if any is present) when debug is set to true.
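The data file layout described under data-file (taxon name, a comma, then the genes in order) can be read with a short Python sketch like the one below. It is not BADGER's reader: the function name parse_data_line is hypothetical, the taxon and gene names are made up for illustration, and trimming white space around each field is an assumption.

```python
def parse_data_line(line):
    """Split one data-file line into (taxon_name, [gene, ...])."""
    # Assumption: white space around each comma-separated field is not significant.
    fields = [field.strip() for field in line.split(",")]
    if len(fields) < 2 or not fields[0]:
        raise ValueError(f"expected 'taxon, gene, gene, ...', got: {line!r}")
    return fields[0], fields[1:]


# Hypothetical taxa and gene names, for illustration only.
example = """\
taxonA, cox1, cox2, atp8, atp6
taxonB, cox1, atp8, cox2, atp6
taxonC, atp6, cox1, cox2, atp8
"""
genomes = dict(parse_data_line(row) for row in example.splitlines() if row.strip())
print(genomes["taxonB"])
# ['cox1', 'atp8', 'cox2', 'atp6']
```

With the taxa stored in file order, the outgroup option would refer to the position of a line in this file (1 for taxonA in the made-up example).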


If the same run control appears on more than one line, only the first value is used, and a warning message is issued.
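The first-value-wins rule can be illustrated with a small Python sketch. The function name collect_controls and the wording of the warning are hypothetical; BADGER's actual message is not shown in this document.

```python
import warnings


def collect_controls(pairs):
    """Build a dict from (name, value) pairs, keeping the first value seen per name."""
    controls = {}
    for name, value in pairs:
        if name in controls:
            # Later duplicates are ignored, with a warning.
            warnings.warn(
                f"duplicate run control {name!r}: "
                f"keeping {controls[name]!r}, ignoring {value!r}"
            )
            continue
        controls[name] = value
    return controls


pairs = [("cycles", "10000"), ("seed", "42"), ("cycles", "500")]
controls = collect_controls(pairs)
print(controls)
# {'cycles': '10000', 'seed': '42'}
```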

Include File

In addition to containing run controls, a run control file may also have a statement of the form:

include-file=<filename>

which causes the contents of the file <filename> to be read as run controls at that point. Included files may also include other files. Be careful not to create an infinite loop of inclusions, as BADGER does not check for this.
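Since BADGER does not detect inclusion loops itself, a run control file can be checked beforehand with a small helper such as the Python sketch below. The function name find_include_cycle is hypothetical, and the sketch assumes that relative file names are resolved against the current working directory, which may not match BADGER's behavior.

```python
import os


def find_include_cycle(path, stack=()):
    """Return the chain of files that forms an include loop, or None if there is none."""
    path = os.path.abspath(path)
    if path in stack:
        # The file is already being read further up the chain: a loop.
        return list(stack[stack.index(path):]) + [path]
    with open(path) as handle:
        for line in handle:
            content = line.split("#", 1)[0]          # drop comments
            name, sep, value = content.partition("=")
            if sep and name.strip() == "include-file":
                # Assumption: relative names are resolved against the
                # current working directory.
                cycle = find_include_cycle(value.strip(), stack + (path,))
                if cycle:
                    return cycle
    return None


# Example call, with a hypothetical file name:
# cycle = find_include_cycle("run1.ctl")
# if cycle:
#     print("include loop:", " -> ".join(cycle))
```

The helper follows each include-file statement depth first and reports the first loop it finds; a file that is included twice without looping back on itself is not reported.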




This page was most recently updated on June 29, 2004.

badger@badger.duq.edu