Data Mining - Chapter 2 questions

2.1 Assuming that data mining techniques are to be used in the following cases, identify whether the task required is supervised or unsupervised learning.
a. Supervised-Deciding whether to issue a loan to an applicant based on demographic and financial data (with reference to a database of similar data on prior customers).
b. Unsupervised-In an online bookstore, making recommendations to customers concerning additional items to buy based on the buying patterns in prior transactions.
c. Supervised-Identifying a network data packet as dangerous (virus, hacker attack) based on comparison to other packets whose threat status is known.
d. Unsupervised-Identifying segments of similar customers.
e. Supervised-Predicting whether a company will go bankrupt based on comparing its financial data to those of similar bankrupt and non-bankrupt firms.
f. Supervised-Estimating the repair time required for an aircraft based on a trouble ticket.
g. Supervised-Automated sorting of mail by zip code scanning.
h. Unsupervised-Printing of custom discount coupons at the conclusion of a grocery store checkout based on what you just bought and what others have bought previously.
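
The distinction used in 2.1 can be illustrated with a minimal sketch. It is not taken from the textbook: the numbers, the function names, and the choice of a 1-nearest-neighbour rule and a largest-gap split are all assumptions made only for illustration. The supervised function relies on prior records whose outcome is known; the unsupervised one only looks for structure in unlabeled values.

```python
# Supervised: each prior record carries a known outcome (a label).
# The incomes (in $1000s) and outcomes below are made-up values.
prior_applicants = [
    (20, "default"), (25, "default"), (60, "repaid"), (75, "repaid"),
]

def predict_outcome(income):
    """1-nearest-neighbour: copy the label of the most similar prior record."""
    nearest = min(prior_applicants, key=lambda rec: abs(rec[0] - income))
    return nearest[1]

# Unsupervised: there are no labels, only the values themselves.
customer_spend = [12, 15, 14, 80, 85, 90]  # made-up values

def split_into_two_segments(values):
    """Split the sorted values at the largest gap between neighbours."""
    ordered = sorted(values)
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]

print(predict_outcome(70))                      # repaid
print(split_into_two_segments(customer_spend))  # ([12, 14, 15], [80, 85, 90])
```

Cases a, c, e and g resemble the first function because a known outcome exists for past records; cases b, d and h resemble the second because no outcome variable is being predicted.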

2.3 Consider the sample from a database of credit applicants in Figure 2.13. Comment on the likelihood that it was sampled randomly, and whether it is likely to be a useful sample.
I don’t think the sample was random, because the records appear to have been taken from every 8th person. A random sample would show more variation in which records were selected. I don’t think the sample would be useful either, because of the type of variables that are being used.
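
One way to check the "every 8th person" observation is to compare the gaps between the selected record IDs. The sketch below is only an illustration: the population of 1000 IDs, the seed, and the helper names are assumptions, and the IDs are not the ones shown in Figure 2.13.

```python
import random

def gaps(ids):
    """Differences between consecutive selected record IDs."""
    ordered = sorted(ids)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def looks_systematic(ids):
    """True when every gap between selected IDs is identical."""
    return len(set(gaps(ids))) == 1

population = list(range(1, 1001))       # hypothetical record IDs 1..1000

every_8th = population[7::8][:10]       # 8, 16, 24, ... (systematic sample)
rng = random.Random(0)                  # fixed seed so the run is repeatable
simple_random = rng.sample(population, 10)

print(gaps(every_8th))                  # nine gaps, all equal to 8
print(looks_systematic(every_8th))      # True
print(sorted(simple_random))
print(gaps(simple_random))              # gaps typically vary widely
```

Identical gaps point to systematic selection rather than simple random sampling, which is the pattern the answer above describes.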
2.5 Using the concept of overfitting, explain why when a model is fit to training data, zero error with those data is not necessarily good.
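It’s not necessarily good, because a model should capture the underlying relationship in the data. If there is zero error on the training data, the model has probably fit the noise in those records as well as the real relationship, so what it tells you is skewed and may not be a true reflection of how it will perform on new data.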
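
The point in 2.5 can be seen in a small sketch. Everything here is an assumption for illustration: the data are made up (a true relationship y = 2x with alternating noise of +3 and -3 in the training set), the "memorizer" is a 1-nearest-neighbour rule, and the simpler model is a least-squares line.

```python
def mse(model, data):
    """Mean squared error of a model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def fit_memorizer(train):
    """Predict the y of the closest training x (reproduces training data)."""
    def predict(x):
        nearest = min(train, key=lambda p: abs(p[0] - x))
        return nearest[1]
    return predict

def fit_line(train):
    """Ordinary least-squares straight line."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Made-up data: y = 2x, with noise only in the training records.
train = [(1, 5), (2, 1), (3, 9), (4, 5), (5, 13), (6, 9)]
holdout = [(1.2, 2.4), (2.2, 4.4), (3.2, 6.4),
           (4.2, 8.4), (5.2, 10.4), (6.2, 12.4)]

memorizer = fit_memorizer(train)
line = fit_line(train)

print("memorizer train/holdout MSE:", mse(memorizer, train), mse(memorizer, holdout))
print("line      train/holdout MSE:", mse(line, train), mse(line, holdout))
```

The memorizer has zero error on the training records but a larger error on the holdout records than the line, which has nonzero training error. That gap is what overfitting looks like in practice.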
2.7 A dataset has 1000 records and 50
