Data Mining Bankruptcy Case

Tzu Han Hung (Vivian)
CASE 2

1. Estimated profit by random selection
Expected spending per catalog mailed = 0.053 * $103 = $5.46
Expected gross profit by random selection = ($5.46 - $2) * 180,000 = $622,800

2. a) We partitioned the data in the “All_data” sheet; the partition output is shown in “Data_Partition1”.
b) The logistic regression output is in “LR_Output1”. The target variable is “purchase”. We selected every variable except sequence_number (a meaningless identifier), source_w (one of the “source” variables, removed because it is redundant), and spending (which carries no meaning for the target, the probability of purchase).
We chose the subset with 7 coefficients because its Cp value of 7.4 is close to 7 and its probability is greater than 10%. We applied the regression model to the testing and validation datasets (output is in “LR_Output2”, “LR_Testscore2”, and “LR_ValidLiftChart2”). The “LR_Testscore2” sheet shows the probability we generated for each row of the test data. The regression model and scoring summary are shown below.
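Separately from those output sheets, the subset choice described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated (keep subsets whose probability exceeds 10%, then pick the one whose Cp is closest to its number of coefficients). The function name, the dictionary layout, and every number except the 7-coefficient Cp of 7.4 are assumptions made for the example, not values from the case output.

```python
def pick_subset(candidates, min_probability=0.10):
    """Pick the subset whose Cp is closest to its coefficient count.

    Only subsets whose reported probability exceeds min_probability
    are considered, mirroring the two conditions named in the write-up.
    """
    eligible = [c for c in candidates if c["probability"] > min_probability]
    if not eligible:
        raise ValueError("no subset passes the probability threshold")
    return min(eligible, key=lambda c: abs(c["cp"] - c["n_coeffs"]))


# Hypothetical best-subset table. Only the row with 7 coefficients and
# Cp = 7.4 comes from the write-up; the other rows are made up.
candidates = [
    {"n_coeffs": 5, "cp": 19.2, "probability": 0.00},
    {"n_coeffs": 6, "cp": 11.8, "probability": 0.02},
    {"n_coeffs": 7, "cp": 7.4, "probability": 0.31},
    {"n_coeffs": 8, "cp": 9.9, "probability": 0.12},
]

chosen = pick_subset(candidates)
print(chosen["n_coeffs"], chosen["cp"])
```

With these illustrative rows, the 5- and 6-coefficient subsets fail the probability filter, and the 7-coefficient subset has the smallest gap between Cp and its coefficient count.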

3. a) The data for purchasers only is in the “Purchasers_only” sheet.
b) The partition is shown in the “Data_Partition2” sheet.
c) The multiple linear regression output is in “MLR_Output1”. The target variable is “spending”. We selected every variable except sequence_number (a meaningless identifier), source_w (one of the “source” variables, removed because it is redundant), and purchase (always 1 in this subset).
d) To select the best subset, our first criterion was adjusted R-squared: we looked for the point where it stops improving, which is at around 8 coefficients. Our second criterion was Cp. Because Cp does not approach the number of coefficients until there are more than 20 coefficients, we chose the 8-coefficient model, which keeps the model simple and avoids over-fitting. We applied the regression model to the testing and validation datasets (output is
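The first criterion in 3 d), stopping where adjusted R-squared no longer improves, can be sketched roughly as below. The formula is the standard adjusted R-squared, written here with the coefficient count including the intercept (an assumption about how the output counts coefficients). The tolerance and all R-squared values are hypothetical and were chosen only so that the plateau falls at 8 coefficients, as in the write-up.

```python
def adjusted_r2(r2, n_rows, n_coeffs):
    """Standard adjusted R-squared; n_coeffs is assumed to include the intercept."""
    return 1.0 - (1.0 - r2) * (n_rows - 1) / (n_rows - n_coeffs)


def plateau_size(adj_r2_by_size, tolerance=0.005):
    """Return the smallest coefficient count after which adjusted
    R-squared improves by less than tolerance."""
    sizes = sorted(adj_r2_by_size)
    for current, following in zip(sizes, sizes[1:]):
        gain = adj_r2_by_size[following] - adj_r2_by_size[current]
        if gain < tolerance:
            return current
    return sizes[-1]  # never plateaued: fall back to the largest subset


# Hypothetical adjusted R-squared values per coefficient count.
adj_r2_by_size = {6: 0.500, 7: 0.520, 8: 0.535, 9: 0.536, 10: 0.5365}

print(plateau_size(adj_r2_by_size))
```

The tolerance is a judgment call; a looser threshold stops at a smaller model, which is the trade-off between fit and simplicity that the write-up describes.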
