While watching National Hockey League (NHL) games, I often heard the play-by-play announcer mention at the start of the third and final period how tough it would be for a team to come back from a one-goal deficit. This led me to wonder just how difficult it was mathematically, and how much the earlier periods affected the final one. In this project, I will investigate whether the scores at the end of the first period affect the final score of NHL games. I will gather the scores of 200 hockey games played between 2005 and 2008 from the nhl.com website. I chose these years because the type of hockey before and after the new Collective Bargaining Agreement is different in terms of goals scored per game, with more goals scored per game on average before than after. After gathering this data, I will analyze and compare it. I will make scatterplots, plotting the scores of both the losing and winning teams at the end of the first period on the Y-axis and the final score on the X-axis. This will let me see visually whether earlier scores affect the number of goals scored by the end of the third period. I will also use Pearson’s correlation coefficient to see whether there appears to be a correlation between the score at the end of the first period and the final score, and I will describe the strength of the association using the coefficient of determination. I will then create a bar graph comparing teams that were winning at the end of the first period and won at the end of the third with teams that were losing at the end of the first period and won at the end of the third. Using the counts from the bar graph, I will use the chi-squared test of independence to see whether the result at the end of the first period and the result at the end of the third period are independent.
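As a rough illustration of how a chi-squared test of independence works, here is a minimal Python sketch. The counts in `observed` are made up purely for illustration and are not the data gathered for this project, and the function name `chi_squared_statistic` is my own. The sketch computes the expected count for each cell from the row and column totals, then sums (observed - expected)² / expected over all cells.

```python
def chi_squared_statistic(observed):
    """Return the chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat


# HYPOTHETICAL counts, not the project's data.
# Rows: ahead after the first period, behind after the first period
# Columns: won the game, lost the game
observed = [
    [30, 10],
    [10, 30],
]

stat = chi_squared_statistic(observed)

# Degrees of freedom = (rows - 1) * (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Critical value for 1 degree of freedom at the 5% significance level
CRITICAL_VALUE_5_PERCENT_DF1 = 3.841

print("chi-squared statistic:", stat)
print("degrees of freedom:", df)
print("reject independence at 5%:", stat > CRITICAL_VALUE_5_PERCENT_DF1)
```

If the statistic is larger than the critical value, the null hypothesis that the two results are independent is rejected at that significance level.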
1. For the dates of the 200 games, I placed slips of paper into three boxes. In one box were four slips of paper, one for each year from 2005 to 2008; in another box were 31 slips with the numbers 1-31; and in the final box were 9 slips with the months January-June and October-December
2. Took one slip of paper from each box and noted the scores of all games played that day. I put the slips of paper back into their boxes after I noted the scores
3. Repeated this process until I reached 200 games
4. Made two scatterplots, one of the score of the winning team at the end of the first period and the score at the end of the third, and one of the score of the losing team at the end of the first period and at the end of the third
5. Found the line of best fit
6. Used Pearson’s correlation coefficient on the numbers used in the scatterplots
7. Used the coefficient of determination on the numbers used in step 5
8. Made a bar graph of teams that were winning at the end of the first period and won at the end of the third versus teams that were losing at the end of the first period and won at the end of the third
9. Concluded whether there appears to be a correlation between the score at the end of the first and the score at the end of the third
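Steps 5 to 7 can be sketched in a few lines of Python. This is only a sketch under assumptions: the two short lists are made-up numbers for illustration (not the scores I collected), and the function names are my own. As in the scatterplots, the final score is treated as x and the first-period score as y.

```python
from math import sqrt


def _sums_of_squares(xs, ys):
    """Return (Sxy, Sxx, Syy, mean_x, mean_y) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy, mean_x, mean_y


def best_fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept."""
    sxy, sxx, _, mean_x, mean_y = _sums_of_squares(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(xs, ys):
    """Pearson's correlation coefficient r."""
    sxy, sxx, syy, _, _ = _sums_of_squares(xs, ys)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL scores, not the project's data.
final_score = [1, 2, 3, 4, 5]          # x-axis
first_period_score = [0, 1, 1, 2, 2]   # y-axis

slope, intercept = best_fit_line(final_score, first_period_score)
r = pearson_r(final_score, first_period_score)
r_squared = r ** 2  # coefficient of determination

print("line of best fit: y =", slope, "* x +", intercept)
print("Pearson's r:", r)
print("coefficient of determination:", r_squared)
```

The coefficient of determination is just r squared, so it can be read as the fraction of the variation in one score that is explained by the linear relationship with the other.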
Uh, I had 100+ trials, I put the number into tables, you guys don't need to see this.
After collecting the data, two scatter plots were created using Microsoft Excel, one of the scores of Team A, the winning team at the end of the first period, and one of the scores of Team B, the losing team at the end of the third period.
[GRAPH] [GRAPH...if you really want to see them, send me an email/leave a comment or something and I can send it over]
To make these graphs, I placed the data into Microsoft Excel and chose the XY (scatter) chart type in the Chart Wizard. I made sure the maximums, minimums and scales were the same for the two graphs so that they could be compared more easily. I then added a linear trend line and, under the Add Trendline options, elected to display the equation on the chart. --> NOTE: I don't think you're supposed to use Microsoft Excel, but I had lost my calculator, I was...