* Decision Tree
* Neural Network
4.1 Decision Tree
The Decision Tree procedure creates a tree-based classification model. It classifies cases into groups or predicts values of a dependent (target) variable based on values of independent (predictor) variables. The procedure provides validation tools for exploratory and confirmatory classification analysis.
The procedure can be used for:
Segmentation. Identify persons who are likely to be members of a particular group.
Stratification. Assign cases into one of several categories, such as high-, medium-, and low-risk groups.
Prediction. Create rules and use them to predict future events, such as the likelihood that someone will default on a loan or the potential resale value of a vehicle or home.
Data reduction and variable screening. Select a useful subset of predictors from a large set of variables for use in building a formal parametric model.
Interaction identification. Identify relationships that pertain only to specific subgroups and specify these in a formal parametric model.
Category merging and discretizing continuous variables. Recode group predictor categories and continuous variables with minimal loss of information.
Example. A bank wants to categorize credit applicants according to whether or not they represent a reasonable credit risk. Based on various factors, including the known credit ratings of past customers, you can build a model to predict whether future customers are likely to default on their loans.
A tree-based analysis provides some attractive features:
* It allows you to identify homogeneous groups with high or low risk.
* It makes it easy to construct rules for making predictions about individual cases.
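To make the idea of homogeneous groups and prediction rules concrete, the following minimal sketch finds a single split on one predictor for a credit-risk setting like the bank example. It is an illustration under stated assumptions, not the procedure's own algorithm: the data are invented, the predictor name (income) is hypothetical, and Gini impurity is used only because it is one common splitting criterion; the text does not say which growing method the procedure applies.

```python
# Hypothetical toy data: (income in thousands, known credit rating) for past customers.
# All values are invented for illustration only.
CASES = [
    (18, "bad"), (22, "bad"), (25, "bad"), (31, "good"),
    (40, "good"), (47, "good"), (52, "good"), (28, "bad"),
]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly homogeneous)."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(cases):
    """Return (threshold, weighted impurity) of the best 'income <= threshold' split."""
    values = sorted({x for x, _ in cases})
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in cases if x <= t]
        right = [y for x, y in cases if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(cases)
        if best is None or score < best[1]:
            best = (t, score)
    return best


def majority(labels):
    """Most frequent label in a non-empty list."""
    return max(set(labels), key=labels.count)


def make_rule(cases):
    """Turn the best split into a one-level prediction rule."""
    t, _ = best_split(cases)
    left_label = majority([y for x, y in cases if x <= t])
    right_label = majority([y for x, y in cases if x > t])

    def predict(income):
        return left_label if income <= t else right_label

    return t, predict


threshold, predict = make_rule(CASES)
print(f"Rule: income <= {threshold} -> {predict(threshold)}, otherwise {predict(threshold + 1)}")
```

A full tree repeats this search within each resulting group and over all predictors. The single split shown here is only the first step.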
4.1.1 Data Considerations
Data. The dependent and independent variables can be:
Nominal. A variable can be treated as nominal when its values represent categories with no intrinsic ranking (for example, the department of the company in which an employee works). Examples of nominal variables include region, zip code, and religious affiliation.
Ordinal. A variable can be treated as ordinal when its values represent categories with some intrinsic ranking (for example, levels of service satisfaction from highly dissatisfied to highly satisfied). Examples of ordinal variables include attitude scores representing degree of satisfaction or confidence and preference rating scores.
Scale. A variable can be treated as scale when its values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. Examples of scale variables include age in years and income in thousands of dollars.
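The practical consequence of the three measurement levels is that each one supports different operations. The sketch below illustrates this with one summary statistic per level, using Python's standard `statistics` module. The sample values and the category order for the ordinal variable are assumptions made up for the example.

```python
from statistics import mean, median_low, mode

# Hypothetical example values, invented for illustration.
region = ["north", "south", "south", "east"]                 # nominal
satisfaction = ["low", "high", "medium", "medium", "high"]   # ordinal
age_years = [23, 35, 41, 58]                                 # scale

# An ordinal variable needs an explicit category order; this order is an assumption.
SATISFACTION_ORDER = ["low", "medium", "high"]


def ordinal_median(values, order):
    """Median category of an ordinal variable, computed on category ranks."""
    ranks = [order.index(v) for v in values]
    return order[median_low(ranks)]


# Nominal: categories can only be counted, so the mode is a sensible summary.
nominal_summary = mode(region)

# Ordinal: categories can be ranked, so a median category is meaningful.
ordinal_summary = ordinal_median(satisfaction, SATISFACTION_ORDER)

# Scale: distances between values are meaningful, so averaging is too.
scale_summary = mean(age_years)

print(nominal_summary, ordinal_summary, scale_summary)
```

Averaging the ranks of an ordinal variable, or the codes of a nominal one, would run without error but would assume distances that the measurement level does not guarantee.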
4.2 Neural Network
4.2.1 Introduction to Neural Networks
Neural networks are the preferred tool for many predictive data mining applications because of their power, flexibility, and ease of use. Predictive neural networks are particularly useful in applications where the underlying process is complex, such as:
* Forecasting consumer demand to streamline production and delivery costs.
* Predicting the probability of response to direct mail marketing to determine which households on a mailing list should be sent an offer.
* Scoring an applicant to determine the risk of extending credit to the applicant.
* Detecting fraudulent transactions in an insurance claims database.
4.2.2 What Is a Neural Network?
The term neural network applies to a loosely related family of models, characterized by a large parameter space and flexible structure, descending from studies of brain functioning. As the family grew, most of the new models were designed for nonbiological applications, though much of the associated terminology reflects its origin.
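As a rough illustration of what "a large parameter space and flexible structure" means, the sketch below evaluates one member of this family, a feedforward network with a single hidden layer, and counts its parameters. The architecture, the activation functions (tanh and sigmoid), and the weight values are all assumptions chosen for the example; a real network would estimate its weights from data.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: inputs -> tanh hidden units -> one sigmoid output unit."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)


def parameter_count(n_inputs, n_hidden):
    """Weights plus biases of an n_inputs -> n_hidden -> 1 network."""
    hidden_layer = n_inputs * n_hidden + n_hidden
    output_layer = n_hidden + 1
    return hidden_layer + output_layer


# Hypothetical, hand-picked weights for a 2-input, 2-hidden-unit network.
HIDDEN_W = [[0.5, -0.2], [0.1, 0.4]]
HIDDEN_B = [0.0, 0.0]
OUT_W = [1.0, -1.0]
OUT_B = 0.0

score = forward([0.0, 0.0], HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
print(score)                      # output for an all-zero input
print(parameter_count(2, 2))      # parameters in this toy network
print(parameter_count(10, 20))    # the count grows quickly with layer sizes
```

Changing the number of hidden units or layers changes the shape of the function the network can represent, which is the flexibility the definition refers to.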
A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects: Knowledge...