SAS Regression 1

MIS 6324
Business Intelligence

3. Classification using SAS Enterprise Miner
In this question you will analyze the JUNKMAIL dataset found in the SASHELP library. Follow the procedure we used for analyzing the HMEQ dataset. Detailed instructions for the HMEQ analysis are given in the emcs.pdf document.

You will need to create and execute the process flow diagram shown above. Further requirements for analyzing JUNKMAIL are as given below:
This data will be used to classify emails as junk mail or not. Create the data source and set the role for all variables, including the target variable appropriately.
You can use the default values for everything else when creating the Data Source.
Partition the data into a 60/40 split with no data being used for Testing.
Follow the steps shown in the process diagram.
You will try out four different models as described below:
Regression: The default regression model, run on the original data.
Regression – No Model Selection: The default regression model, run after transforming the variables as described below.
Regression – Stepwise: The regression model with stepwise selection, run on the transformed data.
Decision Tree: The default decision tree model, run on the transformed data.
Transform Variables:
Transform all variables using the log transformation.
Model Comparison: Run with Selection Statistic set to Misclassification Rate
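The partition, transformation, and comparison steps above are all configured through nodes in Enterprise Miner, but the arithmetic behind them can be sketched in a few lines of Python. This is only an illustrative sketch, not what Enterprise Miner runs internally: the function names are hypothetical, the seed is arbitrary, the log transform is assumed to be log(x + 1) so that zero values stay defined, and the toy rows and labels are made up purely to show the calculation.

```python
import math
import random


def partition(rows, train_fraction=0.6, seed=12345):
    """Shuffle rows and split them into training and validation sets.

    No rows are held back for testing, matching the 60/40 split above.
    The seed value is arbitrary (an assumption for reproducibility).
    """
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def log_transform(value):
    """Return log(value + 1); the +1 offset is an assumption to handle zeros."""
    return math.log1p(value)


def misclassification_rate(actual, predicted):
    """Fraction of cases whose predicted class differs from the actual class."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


def select_best(validation_rates):
    """Return the model name with the lowest validation misclassification rate."""
    return min(validation_rates, key=validation_rates.get)


# Hypothetical toy data: 100 row identifiers, not the JUNKMAIL dataset.
train, valid = partition(list(range(100)))
print(len(train), len(valid))  # 60 40

# Hypothetical labels: 2 of the 5 predictions are wrong.
print(misclassification_rate([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))  # 0.4
```

Given a dictionary of validation misclassification rates keyed by model name, select_best returns the name with the smallest rate, which is the same rule the Model Comparison node applies when Selection Statistic is set to Misclassification Rate.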
Now answer the following questions:
1. Which model is selected as the best one by the Model Comparison Node?
Regression on the original data.
2. What is the training misclassification rate for this model? What is the validation misclassification rate?
Training misclassification rate: 0.064879
Validation misclassification rate: 0.077090
3. What are the four most important variables used?
Exclamation
CapAvg
Remove
HP
4. What is the
