BADGER
Bayesian Analysis to Describe Genomic Evolution by Rearrangement
Version 1.02 beta, June 11, 2004
Copyright © 2004 by Bret Larget & Don Simon
Our model of genome rearrangement has the following characteristics.
- Tree topology is uniformly distributed on a set of (possibly constrained) unrooted tree topologies.
- Each edge draws its length, which represents the expected number of gene inversions,
independently from a gamma distribution whose two hyperparameters,
mu (the mean)
and psi (the ratio of the variance to the mean),
can either be selected or estimated from the neighbor-joining tree.
- Given the edge lengths,
the numbers of realized inversions on the edges are mutually independent and have Poisson distributions
whose means are the respective edge lengths.
The gamma prior is conjugate for the Poisson mean,
and the unconditional distribution on the number of inversions per edge is negative binomial.
- Given the realized number of inversions on an edge,
the times of those inversions are distributed independently and uniformly at random.
- Each inversion is equally likely to be any of the possible inversions,
and the inversions are mutually independent.
- An arbitrary labeling of genes may be assigned to the genome arrangement
at any node of the tree.
The arrangement at every other node of the tree is then determined by the complete
history of realized inversions.
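The per-edge part of the model above can be sketched as a small simulation. This is an illustrative sketch, not BADGER's code: the function names are hypothetical, the genome is assumed to be a linear signed permutation (so a "possible inversion" is any non-empty contiguous segment, reversed with signs flipped), and inversion times are expressed as fractions of the edge in [0, 1). The gamma distribution with mean mu and variance/mean ratio psi corresponds to shape mu/psi and scale psi.

```python
import random


def gamma_shape_scale(mu, psi):
    """Convert (mean, variance/mean) into the (shape, scale) gamma parameters."""
    return mu / psi, psi


def sample_poisson(rng, mean):
    """Draw a Poisson count by counting rate-1 exponential arrivals before `mean`."""
    count = 0
    total = rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def apply_inversion(genome, i, j):
    """Reverse the segment genome[i:j] and flip the sign of each gene in it."""
    segment = [-g for g in reversed(genome[i:j])]
    return genome[:i] + segment + genome[j:]


def simulate_edge(genome, mu, psi, rng):
    """Simulate one edge: length, inversion count, times, and the inversions.

    Assumes a linear signed genome; this is an assumption of the sketch,
    not a statement about the genomes BADGER supports.
    """
    shape, scale = gamma_shape_scale(mu, psi)
    length = rng.gammavariate(shape, scale)      # edge length ~ gamma prior
    count = sample_poisson(rng, length)          # realized inversions ~ Poisson(length)
    times = sorted(rng.random() for _ in range(count))  # uniform times on the edge
    history = []
    for t in times:
        # Two distinct cut points among the n + 1 gene boundaries give every
        # non-empty contiguous segment the same probability.
        i, j = sorted(rng.sample(range(len(genome) + 1), 2))
        genome = apply_inversion(genome, i, j)
        history.append((t, i, j))
    return length, history, genome
```

Calling `simulate_edge([1, 2, 3, 4, 5], 2.0, 0.5, random.Random(0))` returns the sampled edge length, the time-ordered list of inversions, and the arrangement at the far end of the edge.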
The expression for the unnormalized posterior of a complete history
includes a discrete component (tree topology, gene inversion counts,
and an ordered list of specific gene inversions on each edge)
as well as a continuous component (edge lengths and times of inversions).
The unnormalized posterior for the discrete component
may be computed analytically by integrating out the continuous component.
The discrete component is the state space of the Markov chain for MCMC computation.
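As an illustration of integrating out a continuous quantity, the following worked derivation shows how the edge length drops out of the inversion count for a single edge. It is only one factor of the calculation, not the full posterior expression. The symbols are assumptions made here for the sketch: lambda is the edge length, k is the number of inversions on the edge, and the gamma prior is written with shape alpha = mu/psi and scale psi so that its mean is mu and its variance/mean ratio is psi.

```latex
% Assumed notation (not from the original page):
%   \lambda = edge length, k = inversion count on the edge,
%   gamma prior with shape \alpha = \mu/\psi and scale \psi.
\begin{align*}
\Pr(k)
  &= \int_0^\infty \frac{\lambda^{k} e^{-\lambda}}{k!}\,
     \frac{\lambda^{\alpha-1} e^{-\lambda/\psi}}{\Gamma(\alpha)\,\psi^{\alpha}}\, d\lambda \\
  &= \frac{1}{k!\,\Gamma(\alpha)\,\psi^{\alpha}}
     \int_0^\infty \lambda^{\alpha+k-1} e^{-\lambda(1+\psi)/\psi}\, d\lambda \\
  &= \frac{\Gamma(\alpha+k)}{k!\,\Gamma(\alpha)}
     \left(\frac{1}{1+\psi}\right)^{\alpha}
     \left(\frac{\psi}{1+\psi}\right)^{k},
  \qquad \alpha = \frac{\mu}{\psi}.
\end{align*}
```

The last line is a negative binomial probability with mean mu and variance mu(1 + psi), which matches the unconditional distribution of the number of inversions per edge stated in the model.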
A more thorough description of the model is here [PDF].
This page was most recently updated on May 25, 2004.
badger@badger.duq.edu