Postdoctoral Fellow
The rapid growth of available sequence data provides unprecedented opportunities for building large, well-supported phylogenies. These comprehensive phylogenies can be an invaluable resource for comparative biology or for examining macroevolutionary patterns and processes. Yet the published phylogenetic trees represent a small fraction of the available sequence data and taxonomic coverage in GenBank. The goal of this study is to build a database that represents a clade-based organization of phylogenetic information for land plants in GenBank. This will be accomplished by organizing the DNA sequence data from GenBank for land plants into alignments of all potentially phylogenetically informative clusters of homologous sequences. These clusters will be filtered to remove paralogous sequences, and then the putative clusters of orthologs will be combined into supermatrices in order to build the largest possible phylogenetic trees of land plants. The database will allow comparative biologists to easily access benchmark sequence alignments for all phylogenetically informative genes available for taxa in a clade, as well as the most comprehensive phylogenetic trees available for that clade. The sequence and tree databases will be designed so that comparative biologists can easily integrate their own data sets of morphological characters, natural history, or geographic distributions with the available sequence data from GenBank or with large phylogenetic trees for large-scale macroevolutionary analyses.
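The supermatrix step described above can be illustrated with a minimal sketch. It assumes each ortholog cluster has already been aligned and is represented as a mapping from taxon name to aligned sequence; the function name `build_supermatrix`, the example taxa, and the use of `?` for missing data are illustrative assumptions, not details of the project's actual pipeline.

```python
from typing import Dict, List


def build_supermatrix(alignments: List[Dict[str, str]], missing: str = "?") -> Dict[str, str]:
    """Concatenate per-cluster alignments into one supermatrix.

    Taxa absent from a cluster are padded with the `missing` character
    so that every row of the result has the same length.
    """
    # Union of all taxa seen in any cluster, in a stable order.
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    matrix: Dict[str, List[str]] = {taxon: [] for taxon in taxa}

    for aln in alignments:
        # Every sequence within one alignment must have the same width.
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in an alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            matrix[taxon].append(aln.get(taxon, missing * width))

    return {taxon: "".join(parts) for taxon, parts in matrix.items()}


# Hypothetical toy clusters; real alignments would be far longer.
cluster_a = {"Arabidopsis": "ATGC", "Oryza": "ATGA"}
cluster_b = {"Oryza": "GGT", "Pinus": "GCT"}
supermatrix = build_supermatrix([cluster_a, cluster_b])
```

Padding absent taxa with a missing-data symbol keeps sparsely sampled taxa in the matrix, which is what lets a supermatrix cover more taxa than any single gene.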
A phylogenetic database for comparative biology in land plants
PI(s): | Gordon Burleigh |
Start Date: | 1-Nov-2006 |
End Date: | 15-Aug-2008 |
Keywords: | phylogenetics, database, comparative methods |

Related products
Software and Datasets
- Wehe, A., M.S. Bansal, J.G. Burleigh, and O. Eulenstein. 2008. DupTree: a program for large-scale gene tree parsimony phylogenetic analyses. This paper describes DupTree, a new program for inferring phylogenies by gene tree parsimony. The program is publicly available. Other software used to build supertrees and identify ancient large-scale duplication events in the above manuscripts will eventually be released to the public.
- Chen, D., J.G. Burleigh, M.S. Bansal, and D. Fernandez-Baca. 2008. PhyloFinder: an intelligent search engine for phylogenetic tree databases. BMC Evol. Biology 8:90. This paper describes a new software tool for accessing data from tree databases. The software home page at http://pilin.cs.iastate.edu/phylofinder/ has become unavailable, and no other location is known.
- Burleigh, J.G., M.S. Bansal, O. Eulenstein, S. Hartmann, A. Wehe, and T.J. Vision (2011). Genome-scale phylogenetics: inferring the plant tree of life from 18,896 discordant gene trees. Systematic Biology 60(2): 117-125.
- Burleigh, J.G., M.S. Bansal, A. Wehe, and O. Eulenstein (2008). Locating Multiple Gene Duplications Through Reconciled Trees. Research In Computational Molecular Biology, Proceedings. Lecture Notes In Bioinformatics 4955: 273-284.
- Wehe, A., M.S. Bansal, J.G. Burleigh, and O. Eulenstein (2008). DupTree: a program for large-scale gene tree parsimony phylogenetic analyses. Bioinformatics 24: 1540-1541.
- Burleigh, J.G., K.W. Hilu, and D.S. Soltis (2008). Inferring phylogenies with incomplete data sets: a 5-gene, 567-taxon analysis of angiosperms. BMC Evol. Biol. 9: 61.
- Chen, D., J.G. Burleigh, M.S. Bansal, and D. Fernandez-Baca (2008). PhyloFinder: an intelligent search engine for phylogenetic tree databases. BMC Evol. Biology 8: 90.
- Hilu, K., C. Black, D. Diouf, and J.G. Burleigh (2008). Phylogenetic signal in matK vs. trnK: a case study. Molecular Phylogenetics and Evolution 48(3): 1120-1130.
- Lin, H.T., J.G. Burleigh, and O. Eulenstein (2008). Triplet supertree heuristics for the tree of life. BMC Bioinformatics 10(Suppl 1): S8.
- Chen, D.H., J.G. Burleigh, and D. Fernandez-Baca (2007). Spectral Partitioning Of Phylogenetic Data Sets Based On Compatibility. Systematic Biology 56(4): 623-632.