Mathematics of evolution and phylogeny
"This book considers evolution at different scales. The focus is on the mathematical and computational tools and concepts, which form an essential basis of evolutionary studies, indicate their limitations, and give them orientation"--Provided by publisher
Classification: | Electronic Book |
---|---|
Other Authors: | |
Format: | Electronic eBook |
Language: | English |
Published: | Oxford ; New York : Oxford University Press, 2005. |
Subjects: | |
Online Access: | Full text |
Table of Contents:
- List of Contributors
- 1 The minimum evolution distance-based approach of phylogenetic inference
- 1.1 Introduction
- 1.2 Tree metrics
- 1.2.1 Notation and basics
- 1.2.2 Three-point and four-point conditions
- 1.2.3 Linear decomposition into split metrics
- 1.2.4 Topological matrices
- 1.2.5 Unweighted and balanced averages
- 1.2.6 Alternate balanced basis for tree metrics
- 1.2.7 Tree metric inference in phylogenetics
- 1.3 Edge and tree length estimation
- 1.3.1 The LS approach
- 1.3.2 Edge length formulae
- 1.3.3 Tree length formulae
- 1.3.4 The positivity constraint
- 1.3.5 The balanced scheme of Pauplin
- 1.3.6 Semple and Steel combinatorial interpretation
- 1.3.7 BME: a WLS interpretation
- 1.4 The agglomerative approach
- 1.4.1 UPGMA and WPGMA
- 1.4.2 NJ as a balanced minimum evolution algorithm
- 1.4.3 Other agglomerative algorithms
- 1.5 Iterative topology searching and tree building
- 1.5.1 Topology transformations
- 1.5.2 A fast algorithm for NNIs with OLS
- 1.5.3 A fast algorithm for NNIs with BME
- 1.5.4 Iterative tree building with OLS
- 1.5.5 From OLS to BME
- 1.6 Statistical consistency
- 1.6.1 Positive results
- 1.6.2 Negative results
- 1.6.3 Atteson's safety radius analysis
- 1.7 Discussion
- Acknowledgements
- 2 Likelihood calculation in molecular phylogenetics
- 2.1 Introduction
- 2.2 Markov models of sequence evolution
- 2.2.1 Independence of sites
- 2.2.2 Setting up the basic model
- 2.2.3 Stationary distribution
- 2.2.4 Time reversibility
- 2.2.5 Rate of mutation
- 2.2.6 Probability of sequence evolution on a tree
- 2.3 Likelihood calculation: the basic algorithm
- 2.4 Likelihood calculation: improved models
- 2.4.1 Choosing the rate matrix
- 2.4.2 Among site rate variation
- 2.4.3 Site-specific rate variation
- 2.4.4 Correlated evolution between sites
- 2.5 Optimizing parameters
- 2.5.1 Optimizing continuous parameters
- 2.5.2 Searching for the optimal tree
- 2.5.3 Alternative search strategies
- 2.6 Consistency of the likelihood approach
- 2.6.1 Statistical consistency
- 2.6.2 Identifiability of the phylogenetic models
- 2.6.3 Coping with errors in the model
- 2.7 Likelihood ratio tests
- 2.7.1 When to use the asymptotic χ² distribution
- 2.7.2 Testing a subset of real parameters
- 2.7.3 Testing parameters with boundary conditions
- 2.7.4 Testing trees
- 2.8 Concluding remarks
- Acknowledgements
- 3 Bayesian inference in molecular phylogenetics
- 3.1 The likelihood function and maximum likelihood estimates
- 3.2 The Bayesian paradigm
- 3.3 Prior
- 3.4 Markov chain Monte Carlo
- 3.4.1 Metropolis-Hastings algorithm
- 3.4.2 Single-component Metropolis-Hastings algorithm
- 3.4.3 Gibbs sampler
- 3.4.4 Metropolis-coupled MCMC
- 3.5 Simple moves and their proposal ratios
- 3.5.1 Sliding window using uniform proposal
- 3.5.2 Sliding window using normally distributed proposal
- 3.5.3 Sliding window using normal proposal in multidimensions
- 3.5.4 Proportional shrinking and expanding
- 3.6 Monitoring Markov chains and processing output
- 3.6.1 Diagnosing and validating MCMC algorithms
- 3.6.2 Gelman and Rubin's potential scale reduction statistic
- 3.6.3 Processing output
- 3.7 Applications to molecular phylogenetics
- 3.7.1 Estimation of phylogenies
- 3.7.2 Estimation of species divergence times
- 3.8 Conclusions and perspectives
- Acknowledgements
- 4 Statistical approach to tests involving phylogenies
- 4.1 The statistical approach to phylogenetic inference
- 4.2 Hypotheses testing
- 4.2.1 Null and alternative hypotheses
- 4.2.2 Test statistics
- 4.2.3 Significance and power
- 4.2.4 Bayesian hypothesis testing
- 4.2.5 Questions posed as function of the tree parameter
- 4.2.6 Topology of treespace
- 4.2.7 The data
- 4.2.8 Statistical paradigms
- 4.2.9 Distributions on treespace
- 4.3 Different types of tests involving phylogenies
- 4.3.1 Testing t1 versus t2
- 4.3.2 Conditional tests
- 4.3.3 Modern Bayesian hypothesis testing
- 4.3.4 Bootstrap tests
- 4.4 Non-parametric multivariate hypothesis testing
- 4.4.1 Multivariate confidence regions
- 4.5 Conclusions: there are many open problems
- Acknowledgements
- 5 Mixture models in phylogenetic inference
- 5.1 Introduction: models of gene-sequence evolution
- 5.2 Mixture models
- 5.3 Defining mixture models
- 5.3.1 Partitioning and mixture models
- 5.3.2 Discrete-gamma model as a mixture model
- 5.3.3 Combining rate and pattern-heterogeneity
- 5.4 Digression: Bayesian phylogenetic inference
- 5.4.1 Bayesian inference of trees via MCMC
- 5.5 A mixture model combining rate and pattern-heterogeneity
- 5.5.1 Selected simulation results
- 5.6 Application of the mixture model to inferring the phylogeny of the mammals
- 5.6.1 Model testing
- 5.7 Results
- 5.7.1 How many rate matrices to include in the mixture model?
- 5.7.2 Inferring the tree of mammals
- 5.7.3 Tree lengths
- 5.8 Discussion
- Acknowledgements
- 6 Hadamard conjugation: an analytic tool for phylogenetics
- 6.1 Introduction
- 6.2 Hadamard conjugation for two sequences
- 6.2.1 Hadamard matrices: a brief introduction
- 6.3 Some symmetric models of nucleotide substitution
- 6.3.1 Kimura's 3-substitution types model
- 6.3.2 Other symmetric models
- 6.4 Hadamard conjugation: Neyman model
- 6.4.1 Neyman model on three sequences
- 6.4.2 Neyman model on four sequences
- 6.4.3 Neyman model on n + 1 sequences
- 6.5 Applications: using the Neyman model
- 6.5.1 Rate variation
- 6.5.2 Invertibility
- 6.5.3 Invariants
- 6.5.4 Closest tree
- 6.5.5 Maximum parsimony
- 6.5.6 Parsimony inconsistency, Felsenstein's example
- 6.5.7 Parsimony inconsistency, molecular clock
- 6.5.8 Maximum likelihood under the Neyman model
- 6.6 Kimura's 3-substitution types model
- 6.6.1 One edge
- 6.6.2 K3ST for n + 1 sequences
- 6.7 Other applications and perspectives
- 7 Phylogenetic networks
- 7.1 Introduction
- 7.2 Median networks
- 7.3 Visual complexity of median networks
- 7.4 Consensus networks
- 7.5 Treelikeness
- 7.6 Deriving phylogenetic networks from distances
- 7.7 Neighbour-net
- 7.8 Discussion
- Acknowledgements
- 8 Reconstructing the duplication history of tandemly repeated sequences
- 8.1 Introduction
- 8.2 Repeated sequences and duplication model
- 8.2.1 Different categories of repeated sequences
- 8.2.2 Biological model and assumptions
- 8.2.3 Duplication events, duplication histories, and duplication trees
- 8.2.4 The human T-cell receptor Gamma genes
- 8.2.5 Other data sets, applicability of the model
- 8.3 Mathematical model and properties
- 8.3.1 Notation
- 8.3.2 Root position
- 8.3.3 Recursive definition of rooted and unrooted duplication trees
- 8.3.4 From phylogenies with ordered leaves to duplication trees
- 8.3.5 Top-down approach and left-right properties of rooted duplication trees
- 8.3.6 Counting duplication histories
- 8.3.7 Counting simple event duplication trees
- 8.3.8 Counting (unrestricted) duplication trees
- 8.4 Inferring duplication trees from sequence data
- 8.4.1 Preamble
- 8.4.2 Computational hardness of duplication tree inference
- 8.4.3 Distance-based inference of simple event duplication trees
- 8.4.4 A simple parsimony heuristic to infer unrestricted duplication trees
- 8.4.5 Simple distance-based heuristic to infer unrestricted duplication trees
- 8.5 Simulation comparison and prospects
- Acknowledgements
- 9 Conserved segment statistics and rearrangement inferences in comparative genomics
- 9.1 Introduction
- 9.2 Genetic (recombinational) distance
- 9.3 Gene counts
- 9.4 The inference problem
- 9.5 What can we infer from conserved segments?
- 9.6 Rearrangement algorithms
- 9.7 Loss of signal
- 9.8 From gene order to genomic sequence
- 9.8.1 The Pevzner-Tesler approach
- 9.8.2 The re-use statistic r
- 9.8.3 Simulating rearrangement inference with a block-size threshold
- 9.8.4 A model for breakpoint re-use
- 9.8.5 A measure of noise?
- 9.9 Between the blocks
- 9.9.1 Fragments
- 9.10 Conclusions
- Acknowledgements
- 10 The inversion distance problem
- 10.1 Introduction and biological background
- 10.2 Definitions and examples
- 10.3 Anatomy of a signed permutation
- 10.3.1 Elementary intervals and cycles
- 10.3.2 Effects of an inversion on elementary intervals and cycles
- 10.3.3 Components
- 10.3.4 Effects of an inversion on components
- 10.4 The Hannenhalli-Pevzner duality theorem
- 10.4.1 Sorting oriented components
- 10.4.2 Computing the inversion distance
- 10.5 Algorithms
- 10.6 Conclusion
- Glossary
- 11 Genome rearrangements with gene families
- 11.1 Introduction
- 11.2 The formal representation of the genome
- 11.3 Genome rearrangement
- 11.4 Multigene families
- 11.5 Algorithms and models
- 11.5.1 Exemplar distance
- 11.5.2 Phylogenetic analysis
- 11.6 Genome duplication
- 11.6.1 Formalizing the problem
- 11.6.2 Methodology
- 11.6.3 Analysing the yeast genome
- 11.6.4 An application on a circular genome
- 11.7 Duplication of chromosomal segments
- 11.7.1 Formalizing the problem
- 11.7.2 Recovering an ancestor of a semi-ambiguous genome
- 11.7.3 Recovering an ancestor of an ambiguous genome
- 11.7.4 Recovering the ancestral nodes of a species tree
- 11.8 Conclusion
- 12 Reconstructing phylogenies from gene-content and gene-order data
- 12.1 Introduction: phylogenies and phylogenetic data
- 12.1.1 Phylogenies
- 12.1.2 Phylogenetic reconstruction
- 12.2 Computing with gene-order data
- 12.2.1 Genomic distances
- 12.2.2 Evolutionary models and distance corrections
- 12.2.3 Reconstructing ancestral genomes
- 12.3 Reconstruction from gene-order data
- 12.3.1 Encoding gene-order data into sequences
- 12.3.2 Direct optimization
- 12.3.3 Direct optimization with a metamethod: DCM-GRAPPA
- 12.3.4 Handling unequal gene content in reconstruction
- 12.4 Experimentation in phylogeny
- 12.4.1 How to test?
- 12.4.2 Phylogenetic considerations
- 12.5 Conclusion and open problems
- 13 Distance-based genome rearrangement phylogeny
- 13.1 Introduction
- 13.2 Whole genomes and events that change gene orders
- 13.2.1 Inversions and transpositions
- 13.2.2 Representations of genomes
- 13.2.3 Edit distances between genomes: inversion and breakpoint distances
- 13.2.4 The Nadeau-Taylor model and its generalization
- 13.3 Distance-based phylogeny reconstruction
- 13.3.1 Additive and near-additive matrices
- 13.3.2 The two steps of a distance-based method
- 13.3.3 Method of moments estimators
- 13.4 Empirically Derived Estimator
- 13.4.1 The method of moments estimator: EDE
- 13.4.2 The variance of the inversion and EDE distances
- 13.5 IEBP: "Inverting the expected breakpoint distance"
- 13.5.1 The method of moments estimator, Exact-IEBP
- 13.5.2 The method of moments estimator, Approx-IEBP
- 13.5.3 The variance of the breakpoint and IEBP distances
- 13.6 Simulation studies
- 13.6.1 Accuracy of the evolutionary distance estimators
- 13.6.2 Accuracy of NJ and Weighbor using IEBP and EDE
- 13.7 Summary
- Acknowledgements
- 14 How much can evolved characters tell us about the tree that generated them?
- 14.1 Introduction
- 14.2 Preliminaries
- 14.2.1 Phylogenetic trees
- 14.2.2 Markov processes on trees
- 14.3 Information-theoretic bounds: ancestral states and deep divergences
- 14.3.1 Reconstructing deep divergences
- 14.3.2 Connection with information theory
- 14.4 Phase transitions in ancestral state and tree reconstruction
- 14.4.1 The logarithmic conjecture
- 14.4.2 Reconstructing forests
- 14.5 Processes on an unbounded state space: the random cluster model
- 14.6 Large but finite state spaces
- 14.7 Concluding comments