Statistics - Final Project

Table of contents

1.0 Introduction

1.1 The aim of the study

2.0 Methodology

2.1 Correlation

2.2 Independent Samples Test

2.3 Two Way Anova

2.4 One Sample T-Test

2.5 Regression

2.6 Histogram

3.0 Data Analysis

3.1 Question 2

3.2 Question 3

3.3 Question 4

3.4 Question 5

3.5 Question 6

3.6 Question 7

4.0 Conclusions

5.0 References

1.0 Introduction

A company with factories located in New York and Michigan produces yarn using two different types of machines: JC980 and VH80. Each factory uses only one machine type. The sample size is 1 391 factories. The data were collected on different days and from different factories in both cities. The variables are as follows:

• City: The city where the factory is located (New York = 1, Michigan = 0).
• Machine: The type of machine used in the factory to produce yarn (JC980 = 0, VH80 = 1).
• Worker: Number of workers in the factory.
• Product: Amount of yarn produced by the factory (kg).
• Resource: Amount of resources consumed to produce the yarn (kg).
• Detective: Amount of defective (damaged) production (kg).
• Cost: General cost (USD).
• Sale: Sales amount (USD).

1.1 The aim of the study

The purpose of the study is to analyze the data, that is, to draw conclusions about the variables and work out how to address the problems posed. One of the problems I am going to look at is whether the machine type affects the amount of damaged yarn. Another is whether the machine type, the city, or the interaction between the city and the machine type used in the factory affects the amount of yarn produced.

2.0 Methodology

To analyze the data I have been given, I use several methods and tests in SPSS. The following tests are used throughout the assignment:

2.1 Correlation

Correlation is a statistic that quantifies the linear relation between two variables. The correlation coefficient always lies between -1 and +1. A value of +1 indicates a perfect positive linear relation, -1 indicates a perfect negative linear relation, and 0 indicates no linear relation (Nolan 2008).

To decide whether the correlation is significant, the following hypotheses are used. Reject H0 if p-value < α.

H0: There is no significant relation between the variables (retained if α ≤ p-value).

H1: There is a significant relation between the variables (accepted if α > p-value).
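As a rough illustration of what lies behind the correlation output, the sketch below computes the Pearson coefficient by hand, the t statistic that is compared against a t distribution with n - 2 degrees of freedom, and the decision rule stated above. The function names and the numbers are made up for illustration; they are not taken from the factory data, and the p-value itself is assumed to come from the statistics package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0 (no linear relation); undefined when |r| = 1."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def reject_h0(p_value, alpha=0.05):
    """Decision rule from the text: reject H0 when the p-value is below alpha."""
    return p_value < alpha


# Hypothetical values, NOT from the factory data set.
workers = [40, 42, 45, 47, 50, 52]
product = [410, 400, 455, 460, 500, 530]

r = pearson_r(workers, product)
t = t_statistic(r, len(workers))
print(f"r = {r:.3f}, t = {t:.3f}, df = {len(workers) - 2}")
```

The significance level of 0.05 is only a default chosen for the sketch; any other α can be passed to `reject_h0`.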

2.2 Independent Samples Test

The independent-samples t-test is a method used to compare two means in a between-groups design, a situation in which each participant is assigned to only one condition. Here I examine whether the machine type has any effect on the amount of damaged yarn by comparing the mean damaged production of the two machine types.

H0: There is no significant difference between the means.

H1: There is a significant difference between the means.
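The test statistic behind these hypotheses can be sketched as follows. This is a minimal version that assumes equal variances in the two groups (the pooled form); SPSS also reports a version that does not make this assumption. The function name and the sample values are hypothetical and are not taken from the factory data.

```python
import math
import statistics


def pooled_t(sample_a, sample_b):
    """Return (t, df) for the equal-variance independent-samples t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# Hypothetical damaged-yarn amounts (kg) per machine type, NOT real data.
jc980 = [12, 15, 11, 14, 13]
vh80 = [18, 17, 20, 16, 19]

t, df = pooled_t(jc980, vh80)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t is compared against a t distribution with df degrees of freedom to obtain the p-value.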

2.3 Two Way Anova

Two-way ANOVA is a hypothesis test that includes two nominal independent variables, regardless of their number of levels, and one interval dependent variable.

H0: There is no significant interaction between the two variables.

H1: There is a significant interaction between the two variables.
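One common way to write the model behind a two-way ANOVA with interaction is sketched below. The notation is an assumption introduced here for illustration: i indexes the levels of the first factor (for example City), j the levels of the second factor (for example Machine), and k the observations within each combination.

```latex
\begin{align*}
Y_{ijk} &= \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk} \\
H_0 &: (\alpha\beta)_{ij} = 0 \ \text{for all } i, j \\
H_1 &: (\alpha\beta)_{ij} \neq 0 \ \text{for at least one pair } (i, j)
\end{align*}
```

In this notation the main effects of the two factors are tested in the same way, with H0 stating that all α_i (or all β_j) are zero.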

2.4 One Sample T-Test

The one-sample t-test is used when we have one sample to compare with a known population value. This method is used for Question 5, where we want to find out whether the managers at the different factories have followed their instruction to employ 45 workers on average each day.

H0: The mean number of workers is 45.

H1: The mean number of workers is not 45.
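The statistic for this test can be sketched in a few lines. The function name and the worker counts below are hypothetical and only illustrate the calculation; the hypothesised mean of 45 is the value from the text.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of H0: mean == mu0."""
    n = len(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / std_error, n - 1


# Hypothetical daily worker counts, NOT taken from the factory data.
workers = [44, 46, 45, 47, 43, 48, 45, 46]

t, df = one_sample_t(workers, mu0=45)
print(f"t = {t:.3f}, df = {df}")
```

A t value far from zero, judged through its two-sided p-value, is evidence against H0.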

2.5 Regression

Regression is a method for modelling the relationship between variables. In this model we look at how the different variables affect profit and whether their effects are significant. Profit is the dependent variable, and the other variables are the independent variables.
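In general form, a multiple linear regression of this kind can be written as below. The symbols are assumptions introduced for illustration: x_{i1}, ..., x_{ik} stand for whichever independent variables enter the model for observation i, and ε_i is the error term.

```latex
\begin{align*}
\text{Profit}_i &= \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i \\
H_0 &: \beta_j = 0 \qquad H_1 : \beta_j \neq 0 \qquad (j = 1, \dots, k)
\end{align*}
```

Each coefficient is tested separately, and a variable is treated as significant when the p-value for its coefficient is below the chosen α.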

2.6 Histogram

A histogram is used to depict interval data, with the values of the variable on the x-axis and the frequencies on the y-axis.
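The counting step behind a histogram can be sketched as follows. The bin width, the function name and the values are assumptions chosen for illustration, and the text rendering is rotated (bins run down the page instead of along the x-axis).

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count values per bin; each bin covers [start, start + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical worker counts, NOT taken from the factory data.
workers = [38, 41, 44, 45, 45, 47, 52, 53]
bin_width = 5

counts = histogram_counts(workers, bin_width)
for start, count in counts.items():
    # One row per bin: the bin range, then one '#' per observation.
    print(f"{start:>3}-{start + bin_width - 1:<3} {'#' * count}")
```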

3.0 Data Analysis

3.1 Question 2...