data mining hw 3

Introduction to Data Mining
Summer, 2012
Homework 3
Due Monday, June 11, 11:59pm
May 22, 2012

In homework 3, you are asked to compare four methods on three different data sets. The four methods are:

• Indicator Response Matrix
Linear regression to the indicator response matrix. You need to implement ridge regression and tune the regularization parameter.
The material for this algorithm can be found on pages 103 to 106 of the book "The Elements of Statistical Learning"
(http://www-stat.stanford.edu/~tibs/ElemStatLearn/).
• Naive Bayes
You need to try Naive Bayes both without smoothing and with smoothing.
• k-Nearest Neighbor
For kNN, k is a parameter. You need to report two results, k = 1 and k = p. You can choose an appropriate p for each dataset.
• Support Vector Machine
Use both LibSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) and LibLinear (http://www.csie.ntu.edu.tw/~cjlin/liblinear/).
Use LibSVM with a linear kernel and with a Gaussian kernel (tune the parameters).
LibLinear is always linear; you need to compare the speed of LibSVM and LibLinear.
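
As an illustration of the first method, here is a minimal sketch of ridge regression to the indicator response matrix, written in Python with NumPy. The function names, the toy data, and the value lam = 1.0 are made up for this example. Leaving the intercept unpenalized is a common convention but an assumption here, not something this handout specifies.

```python
import numpy as np

def fit_indicator_ridge(X, y, lam):
    """Ridge regression onto a one-hot (indicator) response matrix."""
    classes = np.unique(y)
    # Indicator response matrix: Y[i, k] = 1 if sample i has label classes[k].
    Y = (y[:, None] == classes[None, :]).astype(float)
    # Prepend a column of ones so the model has an intercept.
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    # Penalize every coefficient except the intercept (assumed convention).
    penalty = lam * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0
    # Solve (Xb^T Xb + penalty) B = Xb^T Y for the coefficient matrix B.
    B = np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ Y)
    return classes, B

def predict_indicator_ridge(X, classes, B):
    """Assign each row of X to the class with the largest fitted response."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return classes[np.argmax(Xb @ B, axis=1)]

# Toy data (hypothetical): two well-separated clusters in the plane.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

classes, B = fit_indicator_ridge(X_train, y_train, lam=1.0)
pred = predict_indicator_ridge(np.array([[0.5, 0.5], [5.5, 5.5]]), classes, B)
print(pred)
```

One way to tune the regularization parameter is to repeat the fit for several values of lam and keep the one with the best accuracy on a held-out part of the training set.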
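
The k-nearest-neighbor rule can be sketched in a few lines of plain Python. This is only an illustration under assumptions: Euclidean distance, majority voting, the function name knn_predict, the toy points, and the choice p = 3 are not prescribed by the assignment.

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    dists = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Sort by distance only, then keep the k closest neighbors.
    nearest = sorted(dists, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    # Labels with equal counts are returned in first-seen order,
    # so a tie goes to the label of the nearer neighbor.
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): class "a" near the origin with one "b" outlier at (1, 1).
train_X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
train_y = ["a", "a", "a", "b", "b", "b"]
query = (0.9, 0.9)

label_k1 = knn_predict(train_X, train_y, query, k=1)  # follows the single nearest point
label_kp = knn_predict(train_X, train_y, query, k=3)  # p = 3 chosen arbitrarily here
print(label_k1, label_kp)
```

In this toy example the two settings disagree because the single nearest point is the outlier, which is the kind of difference the k = 1 versus k = p comparison is meant to expose.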

The test datasets are as follows:

• ORL database
Ten different images of each of 40 distinct subjects. For some subjects, the images were taken at different times, varying the lighting, facial expressions (open / closed eyes, smiling / not smiling) and facial details (glasses / no glasses). All the images were taken against a dark homogeneous background with the subjects in an upright, frontal position (with tolerance for some side movement).
A random subset with 7 images per individual was taken with labels to form the training set, and the rest of the database was considered to be the test set.
You will be given ORL train.mat and ORL test.mat.
• USPS database
The USPS handwritten digit database. We provide here a popular subset that contains 9298 16x16 handwritten digit images in total, split into 7291 training images and 2007 test images.
You will be given
