An introduction to computational methods and tools for analyzing evolutionary relationships
Karen Dowell Math 500 Fall 2008
Molecular phylogenetics applies a combination of molecular and statistical techniques to infer evolutionary relationships among organisms or genes. This review paper provides a general introduction to phylogenetics and phylogenetic trees, describes some of the most common computational methods used to infer phylogenetic information from molecular data, and provides an overview of some of the many different online tools available for phylogenetic analysis. In addition, several phylogenetic case studies are summarized to illustrate how researchers in different biological disciplines are applying molecular phylogenetics in their work.
Introduction to Molecular Phylogenetics
The similarity of biological functions and molecular mechanisms in living organisms strongly suggests that species descended from a common ancestor. Molecular phylogenetics uses the structure and function of molecules and how they change over time to infer these evolutionary relationships. This branch of study emerged in the early 20th century but didn’t begin in earnest until the 1960s, with the advent of protein sequencing, electrophoresis, and other molecular biology techniques, later joined by PCR in the 1980s. Over the past 30 years, as computers have become more powerful and more generally accessible, and computer algorithms more sophisticated, researchers have been able to tackle more effectively the immensely complicated stochastic and probabilistic problems that define evolution at the molecular level. Within the past decade, this field has been further reenergized and redefined as whole genome sequencing for complex organisms has become faster and less expensive. As mounds of genomic data become publicly available, molecular phylogenetics is continuing to grow and find new applications. [4, 10, 17, 20, 22]
The primary objective of molecular phylogenetic studies is to recover the order of evolutionary events and represent them in evolutionary trees that graphically depict relationships among species or genes over time. This is an extremely complex process, further complicated by the fact that there is no one right way to approach all phylogenetic problems. Phylogenetic data sets can consist of hundreds of different species, each of which may have varying mutation rates and patterns that influence evolutionary change. Consequently, there are numerous different evolutionary models and stochastic methods available. The optimal methods for a phylogenetic analysis depend on the nature of the study and data used. [5, 19, 20]
Molecular Evolution: Beyond Darwin
Evolution is a process by which the traits of a population change from one generation to another.
In On the Origin of Species by Means of Natural Selection, Darwin argued, citing overwhelming evidence from his extensive comparative analysis of living specimens and fossils, that all living organisms descended from a common ancestor. The book’s only illustration (see Figure 1) is a tree-like structure that suggests how slow and successive modifications could lead to the extreme variations seen in species today. [11, 27]
Figure 1. Evolution Defined Graphically. The sole illustration in Darwin’s On the Origin of Species uses a tree-like structure to describe evolution. This drawing shows ancestors at the limbs and branches of the tree, more recent ancestors at its twigs, and contemporary organisms at its buds.
Darwin’s theory of evolution is based on three underlying principles: variation in traits exists among individuals within a population, these variations can be passed from one generation to the next via inheritance, and some forms of inherited traits give individuals a higher chance of survival and reproduction than others. Although Darwin...