Short-term Visitor

The evoText Project

PI(s): Charles Pence (University of Notre Dame, Notre Dame, IN)
Start Date: 16-Sep-2013
End Date: 15-Dec-2013
Keywords: software, database, evolutionary theory

There is currently no tool that researchers can use to investigate the nature and history of the evolutionary sciences in a comprehensive, objective, and quantitative way. Just as scientists are facing new problems stemming from an inundation of information – giving rise to the problem of “big data” in biology and increasing the need for theoretical synthesis – those who are attempting to understand the biological sciences must grapple with the vastness of the biological literature. The volume of journal articles published daily means that one simply cannot keep up with current research, much less analyze enough of the literature to gain a rich understanding of the past.

To help researchers understand contemporary evolutionary biology and its history, the proposed project will build a publicly accessible, web-based tool called evoText. EvoText will provide a database containing the entire corpus of evolutionary biology journal articles, which can be searched and analyzed using sophisticated text-mining tools. With evoText, researchers will be able to study the nature and history of evolutionary biology in a comprehensive, synthetic, and systematic way and to pursue questions like these: How are key concepts like fitness, complexity, chance, and progress used now, and how has their use changed over time? Can we extract information about the meanings of biological terms by observing how they are used in the literature?