Decision

Lab 1: Decision Trees and Decision Rules

Evgueni N. Smirnov smirnov@cs.unimaas.nl August 21, 2010

1. Introduction

Given a data-mining problem, you need data that represent the problem, models that are suitable for the data, and a data-mining environment that contains algorithms capable of learning these models. In this lab you will study two well-known classification problems. You will try to find classification models for these problems using decision trees and decision rules. The algorithms that learn these models are provided in Weka, the data-mining environment that accompanies our course. You will study the Explorer part of Weka to learn how to call decision-tree and decision-rule algorithms, how to evaluate the accuracy of the learned models, and how to use reduced error pruning.
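To make "accuracy of the learned models" concrete, here is a minimal sketch in Python. It is not Weka code: the function name and the example labels are assumptions made up for illustration. Accuracy is taken here to be the fraction of test instances whose predicted class equals the actual class.

```python
def accuracy(predicted, actual):
    """Return the fraction of instances whose predicted label matches the actual one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical class labels, invented for this example only.
actual = ["good", "bad", "good", "good"]
predicted = ["good", "good", "good", "bad"]
print(accuracy(predicted, actual))
```

In practice the labels in `actual` would come from instances that were held out from training, so that the estimate is not biased toward the data the model has already seen.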

2. Concept-Learning Problems

In this lab you are expected to build classification models for two classification problems:

• Labor-negotiation problem;
• Soybean classification problem.

The data files for both problems are provided at:

http://www.unimaas.nl/datamining/UCI/datasets-UCI.zip

3. Environment

As stated above, you will use Weka to build the desired classification models. Weka is a data-mining environment that contains a collection of machine-learning algorithms for solving real-world data-mining problems. The algorithms can either be applied directly or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well suited for developing new machine-learning schemes. Weka is open-source software issued under the GNU General Public License.

4. Algorithms

To build the classifiers you will use four learning algorithms provided in Weka:

1. ZeroR is a majority/average predictor. It assigns to each instance the classification of the
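
As a rough illustration of the majority-predictor idea behind ZeroR, the sketch below ignores all attributes and always predicts the most frequent class seen in training. It is written in plain Python rather than against the Weka API, and the class name `MajorityClassifier` and the example labels are hypothetical. It covers only nominal classes; for a numeric class, an average predictor would return the mean of the training values instead.

```python
from collections import Counter


class MajorityClassifier:
    """Baseline that always predicts the most frequent training label."""

    def fit(self, labels):
        if not labels:
            raise ValueError("need at least one training label")
        # most_common(1) returns a list holding one (label, count) pair.
        self.prediction_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, instances):
        # Attribute values are ignored; every instance gets the same label.
        return [self.prediction_ for _ in instances]


# Hypothetical training labels, invented for this example only.
model = MajorityClassifier().fit(["good", "bad", "good"])
print(model.predict([{"wage": "low"}, {"wage": "high"}]))
```

A baseline like this is useful mainly as a reference point: a decision tree or rule set that does not beat it has learned little from the attributes.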
