Regression and Classification Tree Workshop

Date: 12-Apr-2010 ~ 14-Apr-2010
Summary: Regression trees (for continuous response data) and classification trees (for categorical response data) are types of decision tree learning that identify patterns in complex datasets. They are becoming increasingly popular methods in evolutionary and ecological research, e.g. in studies of extinction risk and body size evolution. The ultimate goal of these approaches is to create a model that predicts the value of a target variable based on several input variables. These approaches differ from traditional multivariate analyses and clustering. A tree functions as a hierarchical arrangement: data flowing "down" a tree encounter one decision at a time until a terminal node is reached. Only one variable at a time enters the calculation, and only when it is required at a particular decision node. In contrast, in multivariate analyses and clustering all critical variables are input at once, often yielding complex and uninterpretable results.

“Decision trees are popular because they represent information in a way that is intuitive and easy to visualize, and have several other advantageous properties. Preparation of candidate predictors is simplified because predictor variables can be of any type (numeric, binary, categorical, etc.), model outcomes are unaffected by monotone transformations and differing scales of measurement among predictors, and irrelevant predictors are seldom selected. Trees are insensitive to outliers, and can accommodate missing data in predictor variables” (Elith et al. 2008, J. of Animal Ecology).

The 3-day workshop will be held at NESCent in the spring and led by Richard Cutler (Utah State University). The course will provide both theory and applied analytical/software training. Participants should leave with a toolkit for using regression/classification trees in their own research.
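
For readers new to the method, the way data flow "down" a tree, with one variable consulted per decision node, can be illustrated with a minimal Python sketch. Everything in it is hypothetical and for illustration only: the variable names (body_mass_kg, range_km2), the thresholds, and the risk labels are made up, and the tree is written by hand rather than fitted to data as it would be in practice.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """Terminal node: holds the predicted class label (or value)."""
    prediction: Union[float, str]


@dataclass
class Split:
    """Decision node: tests exactly one variable against a threshold."""
    variable: str
    threshold: float
    left: "Node"   # branch followed when value <= threshold
    right: "Node"  # branch followed when value > threshold


Node = Union[Leaf, Split]


def predict(node, record):
    """Walk a record down the tree until a terminal node is reached.

    Returns the prediction and the list of variables that were
    actually consulted along the way.
    """
    consulted = []
    while isinstance(node, Split):
        consulted.append(node.variable)
        value = record[node.variable]
        node = node.left if value <= node.threshold else node.right
    return node.prediction, consulted


# Hypothetical hand-built tree; thresholds and labels are invented.
tree = Split(
    "body_mass_kg", 10.0,
    left=Leaf("low risk"),
    right=Split(
        "range_km2", 5000.0,
        left=Leaf("high risk"),
        right=Leaf("moderate risk"),
    ),
)

small_species = {"body_mass_kg": 2.5, "range_km2": 100.0}
large_species = {"body_mass_kg": 50.0, "range_km2": 1200.0}

print(predict(tree, small_species))  # range_km2 is never consulted here
print(predict(tree, large_species))
```

In this sketch the first record reaches a terminal node after a single decision, so its second variable never enters the calculation; the second record passes through both decision nodes.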