
How to Change the Language in Minitab 17 Statistical Software

While most of us work in Minitab Statistical Software using our preferred language, some need to share Minitab project files or present the results in a different language. Others among us just want to play around with the languages because playing around with Minitab is fun!

Thankfully, Minitab offers our statistical software in eight languages: English, French, German, Japanese, Korean, Portuguese, Simplified Chinese, and Spanish.

Changing the language in Minitab 17 is a breeze! Just follow the instructions below.

Step 1: Update Minitab to the current version

Make sure you have the newest version of Minitab 17. You can check for updates within Minitab by choosing Help > Check for Updates.


If updates are available, proceed with the installation. If no updates are available, then the newest version is already installed and the first step is complete. (Note: if you encounter an error message that says that the Minitab License Update Manager is not installed, you can contact technical support for help.)

This is an important first step, because the language pack that will be installed is only compatible with the newest version of the software.

Step 2: Download and install the language pack

You can install a single language pack, or you can install all of the available languages. The language packs for Minitab 17 can be downloaded from Minitab.com on this page.


Simply select one of the languages from the drop-down list, or choose "All Languages," then download, save the language pack installer to your computer, and launch the setup by double-clicking on the installer.

Step 3: Change the language within Minitab

After installing the language pack, open Minitab and choose Tools > Options. Use the Language drop-down list to select a language, and then click OK.


Now close and re-open Minitab, and the software will be in the selected language. Going forward, the language can be changed as needed from the Tools menu by following the same procedure: Change the language, click OK, then close and re-open Minitab.

What will change, exactly?

When the language is changed (for example, to Korean, as shown below), the menus and any output you generate from then on will be in the new language:

[Image: Minitab menus in Korean]

It’s important to note that Minitab cannot translate output that was generated in a different language.  In other words, if I generate my graphs or other output in English, changing the language will not change the language on any of the output that was already generated. If your output needs to be in Korean, that output must be generated while the software is in Korean.

For more Minitab tips & tricks, you may want to check out the posts Minitab Tips and Tricks: Top 10 Countdown Finale and the Tips and Tricks from Minitab’s Technical Support Team.


Trouble Starting an Analysis? Graph Your Data with an Individual Value Plot

You've collected a bunch of data. It wasn't easy, but you did it. Yep, there it is, right there...just look at all those numbers, right there in neat columns and rows. Congratulations.

I hate to ask...but what are you going to do with your data?

If you're not sure precisely what to do with the data you've got, graphing it is a great way to get some valuable insight and direction. And a good graph to start with is an individual value plot, which you can create in Minitab Statistical Software by going to Graph > Individual Value Plot.

How can individual value plots help me?

There are other graphs you could start with, so what makes the individual value plot such a strong contender? The fact that it lets you view important data features, find miscoded values, and identify unusual cases.

In other words, taking a look at an individual value plot can help you to choose the appropriate direction for your analysis and to avoid wasted time and frustration.

IDENTIFY INDIVIDUAL VALUES

Many people like to look at their data in boxplots, and you can learn many valuable things from those graphs. Individual value plots, however, display every data value, and may be more informative than boxplots for small amounts of data.

[Image: boxplot of length]

The boxplots for the two variables look identical.

[Image: individual value plot]

The individual value plot of the same data shows that there are many more values for Batch 1 than for Batch 2.

You can use individual value plots to identify possible outliers and other values of interest. Hover the cursor over any point to see its exact value and position in the worksheet.

[Image: clustered data distribution]

Individual value plots can also clearly illustrate characteristics of the data distribution. In this graph, most values are in a cluster between 4 and 10. Minitab can jitter (randomly nudge) the points horizontally, so that one value doesn’t obscure another. You can edit the plot to turn on or turn off jitter.

MAKE GROUP COMPARISONS

Because individual value plots display all values for all groups at the same time, they are especially helpful when you compare variables, groups, and even subgroups.

[Image: time vs. shift plot]

This plot shows the diameter of pipes from two lines over four shifts. You can see that the diameters of pipes produced by Line 1 seem to increase in variability across shifts, while the diameters of pipes from Line 2 appear more stable.

SUPPORT OTHER ANALYSES

An individual value plot is one of the built-in graphs that are available with many Minitab statistical analyses. You can easily display an individual value plot while you perform these analyses. In the analysis dialog box, simply click Graphs and check Individual Value Plot.

Some built-in individual value plots include specific analysis information. For example, the plot that accompanies a 1-sample t-test displays the 95% confidence interval for the mean and the reference value for the null hypothesis mean. These plots give you a graphical representation of the analysis results.

[Image: horizontal plot]

This plot accompanies a 1-sample t-test. All of the data values are between 4.5 and 5.75. The reference mean lies outside of the confidence interval, which suggests that the population mean differs from the hypothesized value.
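As a rough non-Minitab companion, the same test can be sketched in Python with SciPy. The measurements and the reference (null hypothesis) mean below are hypothetical, chosen only so the data fall between 4.5 and 5.75 and the reference mean lies outside the confidence interval, as in the plot described above:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements, all between 4.5 and 5.75 (illustrative values)
data = np.array([4.6, 4.8, 5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7])
mu0 = 6.0  # hypothetical reference value for the null hypothesis mean

# 1-sample t-test against the reference mean
t_stat, p_value = stats.ttest_1samp(data, popmean=mu0)

# 95% confidence interval for the population mean
mean = data.mean()
sem = stats.sem(data)
ci_low, ci_high = stats.t.interval(0.95, df=len(data) - 1, loc=mean, scale=sem)

print(f"mean = {mean:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f}), p = {p_value:.4f}")
```

Because the reference mean falls outside the interval, the small p-value and the graphical display tell the same story.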

Individual Value Plot: A Case Study

Suppose that salad dressing is bottled by four different machines and that you want to make sure that the bottles are filled correctly to 16 ounces. You weigh 30 samples from each machine. You plan to run an ANOVA to see if the means of the samples from each machine are equal. But, first, you display an individual value plot of the samples to get a better understanding of the data.


  1. Choose Graph > Individual Value Plot.
  2. Under One Y, choose With Groups, then click OK.
  3. In Graph variables, enter Weight.
  4. In Categorical variables for grouping, enter Machine.
  5. Click Data View.
  6. Under Data Display, check Interval bar and Mean symbol.
  7. Click OK in each dialog box.
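For readers without Minitab, roughly the same display can be sketched in Python with Matplotlib. The fill-weight data below are simulated to mimic the pattern described in this case study (Fill1 running high with one low outlier); the machine names, jitter width, and random seed are all illustrative choices:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)

# Hypothetical fill weights (oz) for four machines; Fill1 runs high
# and includes one low outlier at 15.5 oz
machines = ["Fill1", "Fill2", "Fill3", "Fill4"]
weights = {
    "Fill1": np.append(rng.normal(16.3, 0.15, 29), 15.5),
    "Fill2": rng.normal(16.0, 0.15, 30),
    "Fill3": rng.normal(16.0, 0.15, 30),
    "Fill4": rng.normal(16.0, 0.15, 30),
}

fig, ax = plt.subplots()
for i, m in enumerate(machines, start=1):
    y = weights[m]
    x = i + rng.uniform(-0.08, 0.08, len(y))  # horizontal jitter
    ax.plot(x, y, "o", alpha=0.5)             # individual values
    ax.plot(i, y.mean(), "D", color="black")  # mean symbol
ax.set_xticks(range(1, 5), machines)
ax.set_ylabel("Weight (oz)")
fig.savefig("individual_value_plot.png")
```

The jitter keeps overlapping points visible, which is the same trick Minitab applies by default.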

[Image: individual value plot of weight]

The mean fill weight is about 16 ounces for Fill2, Fill3, and Fill4, with no suspicious data points. For Fill1, however, the mean appears higher, with a possible outlier at the lower end.

Before you continue with the analysis, you may want to investigate problems with the Fill1 machine.

Putting individual value plots to use

Use Minitab’s individual value plot to get a quick overview of your data before you begin your analysis—especially if you have a small data set or if you want to compare groups. The insight that you gain can help you to decide what to do next and may save you time exploring other paths.

For more information on individual value plots and other Minitab graphs, see Minitab Help.

The Longest Drive: Golf and Design of Experiments, Part 4

Step 3 in our DOE problem-solving methodology is to determine how many times to replicate the base experiment plan. The discussion in Part 3 ended with the conclusion that our 4 factors could best be studied using all 16 combinations of the high and low settings for each factor, a full factorial. Each golfer will perform half of the sixteen possible combinations, and each golfer's data could stand as a complete half-fractional study of our four research variables.

But how many times should each golfer replicate their runs to produce a complete data set for that golfer?

The data analysis for a DOE is multiple linear regression, which produces an equation defining the response(s) as a function of the experimental factors. Run replicates serve four functions in this analysis:

  1. Higher replication produces more precise regression equation coefficients.
  2. Replicates are needed to estimate the error term for statistical tests on factor effects. 
  3. Replicates allow the study of variation in our response.
  4. Replication provides replacement insurance against botched runs or measurements.

Measuring the variation at each run condition (log(standard deviation) is a common response) allows the experimenter to study the effects of the experimental factors on the response variation in addition to the mean. This often leads to run conditions that reduce variability, which is almost a universal goal in manufacturing. Function 4 (above) just makes good sense. In fact, I always recommend at least two replicate runs so that one can be used as an outlier check against the other, as well as a replacement in case a true outlier is found.

Averaging Replicates and Variation

The first two functions shown above work together and require a power and sample size calculation to determine a reasonable estimate of the number of run replicates needed to meet both. Higher replication results in more precise estimates of the regression coefficients by averaging away the variation, according to the simple formula below:

                                    standard deviation of an average of n values = σ / √n

This is illustrated in the diagram below, showing individual data compared to the same data grouped into samples of 4 and then averaged. The variation is reduced by a factor of sqrt(4) = 2.
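This averaging effect is easy to verify with a quick simulation. The sketch below uses NumPy; the 100,000 draws and the standard deviation of 10 are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

sigma = 10.0
raw = rng.normal(0, sigma, 100_000)  # individual observations

# Group into samples of 4 and average each sample
means_of_4 = raw.reshape(-1, 4).mean(axis=1)

print(raw.std())         # close to sigma = 10
print(means_of_4.std())  # close to sigma / sqrt(4) = 5
```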


Averaging away variation allows you to detect effects despite the noise. This is taken to an extreme in a study which showed an effect of tee height by taking 27 random golfers and having them all drive 10 balls at each of 3 tee heights. All this just to study one variable! In DOE, it is best to take the opposite approach and control as many sources of variation as possible and minimize the number of replicates.

Power and Sample Size Calculations

The probability that an experiment will detect an effect of a given size on the response (the power) depends on 4 factors:

  1. The size of the effect being estimated.
  2. The amount of variability in the data (standard deviation).
  3. Number of replications.
  4. Alpha = probability of type I error.

Using Minitab for our calculations, we set out to learn the number of replications required. The four golfers in our experiment estimated that their drives typically vary over a range of 50 to 60 yards. Assuming that this range spans about 5 to 6 standard deviations, we estimate the standard deviation of a replicated drive to be about 10 yards. Our calculation also requires an estimate from the golfers of the improvement in drive distance that would translate into an improved overall golf score. Their estimate was about 10 yards as well.

Finally, an effective data collection should have at least an 80% chance of detecting this ten-yard change if one of our factors does cause it. We enter this data into the power and sample size calculator in Minitab for a two-level factorial design and obtain the following results:

[Minitab output: power and sample size for replicates of the 16-run full factorial]
Each golfer would need to replicate the full 16-run factorial 3 times, for a total of 48 runs, for the experiment to reach a power of at least 80% (the actual power achieved is about 92%). This is not an unreasonable amount of work for one morning per golfer. On the other hand, if we planned on having each golfer execute only half of the full experiment, 8 runs per replicate, our results would look like this:

[Minitab output: power and sample size for replicates of the 8-run half fraction]
We could get away with 5 replications of the 8-run half fraction, for a total of only 40 drives, resulting in a power of about 87%. This is a good deal! We still have reasonable power and 8 fewer drives per golfer.

Based on these results, we select 5 replications of the half fraction for each golfer as a sample size large enough to detect an effect that would impact our golfers' final scores.
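For readers who want to reproduce the power numbers outside Minitab, here is a sketch using SciPy's noncentral t distribution. It treats the effect's regression coefficient (effect/2, with standard error σ/√N) as the tested quantity and assumes the model fits all available terms, so the error degrees of freedom are N minus the number of model terms. Treat it as an approximation, not a replica of Minitab's routine:

```python
from scipy import stats

def factorial_power(effect, sigma, runs_per_rep, reps, model_terms, alpha=0.05):
    """Approximate power for one effect in a 2-level factorial design.

    The coefficient being tested is effect/2 with standard error
    sigma/sqrt(N); power comes from a two-sided t-test via the
    noncentral t distribution.
    """
    n_total = runs_per_rep * reps
    df_error = n_total - model_terms
    ncp = (effect / 2) / (sigma / n_total**0.5)
    t_crit = stats.t.ppf(1 - alpha / 2, df_error)
    return (1 - stats.nct.cdf(t_crit, df_error, ncp)
            + stats.nct.cdf(-t_crit, df_error, ncp))

# 16-run full factorial, 3 replicates (48 drives), full 16-term model
p48 = factorial_power(effect=10, sigma=10, runs_per_rep=16, reps=3, model_terms=16)
# 8-run half fraction, 5 replicates (40 drives), 8-term model
p40 = factorial_power(effect=10, sigma=10, runs_per_rep=8, reps=5, model_terms=8)
print(round(p48, 2), round(p40, 2))  # roughly 0.92 and 0.87
```

These match the roughly 92% and 87% figures quoted above for the 48-drive and 40-drive plans.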

A Final Note – Avoid a Standardized Effect Size

In the last calculation, we requested that our golfers estimate the range of variability in their drive and the size of the drive distance change that would have a significant impact on their golf score. These are two important requirements (standard deviation and effect size) for the calculation of the required replications.

Dr. Russ Lenth, a professor of statistics at the University of Iowa, discourages a common practice that assumes a standard deviation of 1 and thus expresses the effect size in units of standard deviations (a standardized effect). In a 2001 article on the subject, he questions the value of knowing how many standard deviations of effect size we can detect if we don't know the standard deviation. This approach does not allow the researcher to size the experiment so that it can detect the smallest change in the response that is large enough to have a business impact, which is exactly what we are trying to do! I agree with Dr. Lenth on this point and, frankly, I am amazed at how often I see this approximate, and potentially misleading, calculation carried out.

Your experiment is much better served by taking the time to estimate the standard deviation and determining the appropriate effect size when running these calculations. 

So now we have our sample size. In our next post we will review the calculations for incorporating blocking variables and covariates in the analysis of the final data. See you at the driving range!

Many thanks to Toftrees Golf Resort and Tussey Mountain for use of their facilities to conduct our golf experiment. 

References

Lenth, R. V. (2001). Some Practical Guidelines for Effective Sample Size Determination. The American Statistician, 55(3), 187–193.

Previous Golf DOE Posts

The Longest Drive: Golf and Design of Experiments, Part 1

The Longest Drive: Golf and Design of Experiments, Part 2

The Longest Drive: Golf and Design of Experiments, Part 3 

Big Ten 4th Down Calculator: Week 3

Every single Big Ten team played a conference game this week, giving us the most 4th downs to analyze yet. Last week, 4 of the 6 games were decided by one possession. This week only 2 of the 7 games were decided by one possession, so let's see if the losing teams missed opportunities to keep the game close! But first, a quick refresher on what this is. 

I've used Minitab Statistical Software to create a model to determine the correct 4th down decision. And for the rest of the college football season, I'll use that model to track every 4th down decision in Big Ten Conference games. However, the decision the calculator gives isn’t meant to be written in stone. In hypothesis testing, it’s important to understand the difference between statistical and practical significance. A test that concludes there is a statistically significant result doesn’t imply that your result has practical consequences. You should use your specialized knowledge to determine whether the difference is practically significant.

The same line of thought should be applied to the 4th down calculator. Coaches should also consider other factors, like the game situation, their kicker's range, and the strengths and weaknesses of their team. But the 4th down calculator still provides a very strong starting place for the decision making! In fact, we can use the model to create a handy-dandy chart that gives a general idea of what your 4th down decision should be!

[Image: 4th Down Decision Chart]

Also, for each game I'll break the analysis into two sections: 4th down decisions in the first 3 quarters, and 4th down decisions in the 4th quarter. The reason to separate the two is that in the first 3 quarters, coaches should be trying to maximize the number of points they score, while in the 4th quarter they should be maximizing their win probability. To calculate win probability, I'm using this formula from Pro Football Reference.

Okay, enough of the pregame show, let’s get to the games!

Ohio State 49 - Maryland 21

Last week, Maryland had 8 fourth down decisions in the first 3 quarters, all of which ended in punts. And the 4th down calculator didn't disagree with a single one. The same thing couldn't happen again this week, could it?

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with the Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
| --- | --- | --- | --- | --- | --- | --- |
| Ohio State | 5 | 1 | 3 | 1 | 1 | 0.19 |
| Maryland | 6 | 0 | 6 | 0 | 0 | 0 |

Well! For the 2nd straight week, every single one of Maryland's 4th down decisions resulted in a punt, and the 4th down calculator agreed with every single one of them. As a 33-point underdog, Maryland should have played very aggressively. However, that's kind of hard to do when your average distance to go on 4th down is 12.5 yards. Luckily for Maryland, they were scoring touchdowns on the possessions where they weren't punting, so they were able to keep this close for a while. But eventually, Ohio State was able to pull away.

Speaking of the Buckeyes, they called a pretty good game on 4th down. The calculator disagreed with their decision on the opening drive, but it was a close call either way. With a 4th and 2 on their own 28 yard line, Ohio State punted when the calculator says to go for it. But you'll see that the difference in expected points is only 0.19, so punting wasn't a terrible decision. And when you factor in the fact that Ohio State was a 33 point favorite and this was the first drive of the game, there is no reason to criticize Ohio State for taking the low variance option here. 

Now the difference between punting and going for it in your own territory is a close call on 4th and 2, but not so much on 4th and 1. If you punt on 4th and 1, you're giving up over half a point on average, and it's probably even more than that if your offense is as good as Ohio State's. And it looks like Buckeye coach Urban Meyer understands the statistics. For the 2nd straight week, he went for it on 4th and 1 in his own territory when most coaches always punt. And for the 2nd straight week he was rewarded, as Ohio State successfully converted the 4th down and scored a touchdown on the drive. Now if Meyer continues this the rest of the season (and he should!) eventually they're going to get stopped on the 4th and 1. And most likely, the announcers and/or media will announce how stupid of a decision it was. But I'm guessing they're not going to point out these two cases, where going for it directly led to a touchdown. You're not going to convert every single 4th and 1. But you'll convert most, which in the long run will result in more points for your team, as Ohio State is clearly demonstrating. 

4th Down Decisions in the 4th Quarter

| Team | Time Left | 4th Down Distance | Yards to End Zone | Calculator Decision | Coach Decision | Win Prob. (Go For It) | Win Prob. (Kick) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Maryland | 10:26 | 8 | 47 | Go for it | Go for it | 0.24% | 0.12% (Punt) |
| Maryland | 7:47 | 14 | 36 | Go for it | Go for it | 0.25% | 0.16% (FG) |

In the first 3 quarters, Maryland never had a 4th down distance that was small enough to warrant going for it. Well, the distance part of that didn't change in the 4th quarter. But when you're making decisions based on win probability instead of expected points, sometimes you have to go for it no matter what the distance. And that's exactly what Maryland did. Down 14 points in the 4th quarter, they went for it twice on 4th and long. Although their chance of winning was slim either way, Maryland doubled their win probability by going for it on 4th and 8, and increased it by 56% on the 4th and 14. And with 10:26 left, I feel that most coaches would have punted on 4th and 8 from midfield. So props to the Terps for doing what they could to try and win.

Michigan 38 - Northwestern 0

Last week Michigan got a slow start against Maryland before breaking the game open in the 2nd half. This week, they jumped on Northwestern before the Wildcats even knew what happened.

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with the Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
| --- | --- | --- | --- | --- | --- | --- |
| Northwestern | 8 | 1 | 7 | 1 | 0 | 0.88 |
| Michigan | 5 | 1 | 4 | 1 | 0 | 0.23 |

Last week Northwestern made 4 different 4th down decisions that the calculator disagreed with, costing them over 2 points. There was only one disagreement this week, but it was a really bad one. Down 14-0 in the 1st quarter, the Wildcats had a 4th and 1 on the Michigan 25. Northwestern coach Pat Fitzgerald decided to kick the field goal instead of going for it. Big Ten kickers make a 42-yard field goal about 67% of the time. Big Ten offenses convert on 4th and 1 about 68% of the time. So not only does going for it have the higher success rate, but you still have the opportunity to score a touchdown. The decision to kick was bad to begin with, but it was made worse when they missed the field goal. I thought Northwestern was supposed to be the smart school.

Michigan's disagreement wasn't really bad at all. They punted on 4th and 9 when the calculator would have kicked a 51-yard field goal. The difference is only 0.23 points to begin with, but the calculator also acknowledges that coaches know more about their kicker's range than it does. Wolverine kicker Kenny Allen has never attempted a 50+ yard field goal, so it's possible that distance just isn't in his range. If you rule out a field goal, the decision to punt is slightly better than going for it, so no problem with Michigan's decision.

This game was 31-0 Michigan going into the 4th quarter, so we won't bother analyzing any of those 4th down decisions.

Minnesota 41 - Purdue 13

Purdue followed up its strong showing against Michigan State with a dud against Minnesota. Minnesota followed up its dud against Northwestern with a strong showing against Purdue. 

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with the Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
| --- | --- | --- | --- | --- | --- | --- |
| Minnesota | 5 | 2 | 3 | 2 | 0 | 0.94 |
| Purdue | 6 | 1 | 5 | 0 | 1 | 0.57 |

It didn't matter in the end, but Minnesota left some points on the board on their first two possessions. On their first drive, they punted on 4th and 1 from their own 34 yard line. Ask Ohio State if they think that's the correct decision. Then the next time they got the ball, they punted on 4th and 4 from the Minnesota 39. You would have to guarantee that the ball would be downed inside the 5 yard line to make punting the correct decision. And since there is never a guarantee with punting, the correct decision is to go for it.

Speaking of punting, Purdue's "incorrect" decision was to go for it when the calculator suggested punting. I put "incorrect" in quotes, because I think this was a decision where going against the calculator was the right call. The situation was a 4th and 10 for Purdue at the Minnesota 40 with 5 minutes left in the 3rd quarter. And at the time, Purdue was down 24-6. I said earlier that teams should maximize points until the 4th quarter, at which point they should maximize win probability. But I don't think that's a hard and fast rule. If we use win probability, going for it is the correct call. And this assumes that Purdue would down the punt at the 10 yard line. If they sail it into the end zone for a touchback, then going for it is correct in both expected points and win probability. Considering they were down 18 points and needed to score, Purdue was correct to go for it here. So don't worry Darrell Hazell, I'm not going to count this decision against you in the team summary. Cause, you know, I'm sure you read this and were very concerned.

Unfortunately for Hazell, Purdue did not convert and Minnesota scored a touchdown on the next possession. That means it's time to move on to the next game.    

Penn State 29 - Indiana 7

Apparently, the week of the Indiana-Penn State game has been dubbed as punt week. And it did not disappoint. 

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with the Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
| --- | --- | --- | --- | --- | --- | --- |
| Indiana | 10 | 1 | 9 | 0 | 1 | 0.56 |
| Penn State | 7 | 0 | 7 | 0 | 0 | 0 |

Sixteen punts, and this is just through the first 3 quarters! And the 4th down calculator agreed with 15 of them. That's B1G! As for that one disagreement...oh Indiana. You're so close, but yet so far. The Hoosiers are consistently one of the worst teams in the Big Ten, and it's usually a result of a poor defense. So since they have nothing to lose and the other team will likely score no matter where their starting field position is, Indiana should always implement an aggressive, high-variance strategy. And so far this year they have...kind of. 

Last week Indiana went for a 4th and 1 from midfield against Ohio State. Great! That's exactly the aggressive 4th down decision Indiana should be making. They converted, but 4 plays later they punted on 4th and 2 from the Ohio State 39. That...that is not the type of decision they should be making.

Well, the same thing happened against Penn State. After scoring their only touchdown, Indiana followed it up by recovering a surprise onside kick. A team playing on the road should onside kick if they can recover it 44% of the time or better. I couldn't find college data, but in the NFL the surprise onside kick rate is close to 60%. I imagine it would be similar in college, making the decision to onside kick a great call for Indiana. But yet again, 4 plays later they punted on 4th and 4 from the Penn State 45. Now, the decision to punt or go for it is very close. Assuming you down the punt at the 10 yard line, the calculator says to punt. But the difference in expected points is 0.07, so either decision is really fine. But the kicker is that we just saw Indiana attempt an onside kick! So Indiana was willing to risk giving Penn State the ball in Indiana territory. But risk giving it to them at their own 45 yard line? I guess it was too rich for their blood.

As for the Hoosier decision the calculator actually disagreed with, they punted on 4th and 1 from their own 34 yard line. Again, ask Ohio State about the dangers of actually going for it there. And at the time Indiana was down 19-7. I know it was punt week and all, but they needed points. 

Penn State didn't make an incorrect 4th down decision, and they had this game in control in the 4th quarter. So it's time to finally move on to the close games!

Iowa 29 - Illinois 20

Last week Iowa head coach Kirk Ferentz was flawless in his 4th down decision making. Can he follow it up with another perfect week?

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with the Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
| --- | --- | --- | --- | --- | --- | --- |
| Illinois | 8 | 1 | 6 | 2 | 0 | 0.56 |
| Iowa | 8 | 1 | 5 | 1 | 2 | 1.27 |

Blast! The streak is over, as Ferentz made one bad decision. But before we talk about that, I want to talk about two very good decisions he made. On the same drive, Iowa had a pair of 4th and 1s. The first came at the Illinois 6 yard line, and the other was at the Illinois 1. Kicking a field goal in either of these situations is by far the worst decision a coach can make. A home team that kicks a field goal on 4th and 1 from the 1 is giving up 1.77 points, the highest difference you'll get with the 4th down calculator. Iowa converted the first 4th and 1, but failed on the 2nd. But the decision to go for it is so strong because failing isn't really failing at all. Even though Illinois got the ball back, Iowa was still more likely to be the next team to score. And that's exactly what happened. Illinois punted and Iowa got the ball close to midfield, where they completed a 5 play touchdown drive.

That brings us to the disagreement. With 15 seconds left in the half, Iowa kicked a field goal on 4th and goal from the 1. Keep in mind it was the end of the half. So if you fail, the terrible field position for your opponent won't matter since they'll run out the clock and get to halftime. But that only changes the decision from "worst one you can make" to "really bad". Big Ten teams score on 4th and goal from the 1 about 59% of the time, so your expected value is 4.1 points (I multiplied 0.59 by 6.96 instead of 7 to account for the 4% of times Big Ten kickers miss the extra point). That's greater than what you'll get from making a field goal, so the choice is still clearly to go for it.
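The arithmetic in the paragraph above can be checked in a couple of lines. The 59% conversion rate and the 4% missed-extra-point rate are the post's figures; treating a made field goal as a flat 3 points (ignoring field-position effects) is a simplification:

```python
# Figures from the post: Big Ten teams convert 4th and goal from the 1
# about 59% of the time, and kickers miss about 4% of extra points
p_convert = 0.59
td_value = 0.96 * 7 + 0.04 * 6  # = 6.96 expected points per touchdown
fg_value = 3                     # a made field goal, ignoring field position

ev_go = p_convert * td_value
print(round(ev_go, 2))  # 4.11, comfortably above the 3 points from a made field goal
```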

Illinois's disagreement was punting on 4th and 1 from their own 38-yard line. Illinois was a double-digit underdog playing on the road. Punting on 4th and 1 is not the strategy you want to take to pull the upset.

And that brings us to the 4th quarter, where there is one decision I want to discuss. With 3:20 left in the game, Iowa was up by 3 points and had a 4th and 5 at the Illinois 16 yard line. The decision seems pretty obvious. Take the points and force Illinois to score a touchdown instead of simply being able to tie the game with a field goal, right? Obviously being up 6 points late in a game is better than being up 3.

Or is it?

I took college football games from 2005-2012, and separated out situations where a team was starting between their own 20 and 40 yard line with 1-5 minutes left in the game, trailing by 3-6 points. I divided the teams into 2 groups. One group was losing by a field goal (Down a FG). The other group was losing by either 4, 5, or 6 points (Down a TD). I performed a 2 proportions test to determine whether the group that was down a TD actually won more often.

[Minitab output: 2 Proportions Test]

Teams that needed a touchdown actually won more often than teams trailing by only a field goal. And the result was significant at the alpha = 0.10 level. So what is going on here? Well, teams that need a touchdown have to be aggressive and play for the win. But teams only down 3? Once the coach gets into field goal position, their play calling gets conservative and they play for the tie. And we see this results in fewer wins than if they were forced to play for the touchdown.
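The post doesn't publish the underlying win counts, so the sketch below uses hypothetical counts, chosen only to illustrate the mechanics of a pooled two-proportion z-test (similar in spirit to Minitab's 2 Proportions test) with a result that is significant at the 0.10 level but not at 0.05:

```python
from math import sqrt
from scipy.stats import norm

# Hypothetical win counts; the post does not give the raw data
wins_fg, n_fg = 90, 600    # trailing by a field goal (3 points)
wins_td, n_td = 110, 600   # trailing by a touchdown (4-6 points)

p_fg, p_td = wins_fg / n_fg, wins_td / n_td
p_pool = (wins_fg + wins_td) / (n_fg + n_td)  # pooled proportion under H0

se = sqrt(p_pool * (1 - p_pool) * (1 / n_fg + 1 / n_td))
z = (p_td - p_fg) / se
p_one_sided = 1 - norm.cdf(z)  # H1: down-a-TD teams win more often

print(f"z = {z:.2f}, one-sided p = {p_one_sided:.3f}")
```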

So Iowa would have actually been better off missing the field goal than making it. And of course, the best decision is to go for every 4th down until you either score a touchdown or turn the ball over on downs. Luckily for Iowa, after they made the field goal Illinois fumbled on their first play. Iowa recovered and ended up kicking another field goal, which was correct since obviously being up 9 points late in a game is better than being up 6.

Or is it?

No, it definitely is.

Wisconsin 23 - Nebraska 21

This game featured strong gusts of wind that caused a 12 yard punt. It's just too bad it didn't happen in the Penn State - Indiana game.

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
|---|---|---|---|---|---|---|
| Wisconsin | 9 | 1 | 6 | 2 | 1 | 0.56 |
| Nebraska | 9 | 1 | 7 | 2 | 0 | 0.56 |

Early in the game, Nebraska had a 4th and 1 at their own 45 yard line. On came the punt team when they should have gone for it. After a few punts were exchanged, Wisconsin had a 4th and 2 at the Nebraska 29. They smartly opted to go for it instead of attempting a field goal. The result was a successful conversion that led to a Wisconsin touchdown. One team did a good job maximizing their expected points early. The other lost their 4th game of the season in the final seconds of the game.

But it wasn't all sunshine and rainbows for the Wisconsin 4th down decision making. Trailing by 7 late in the 3rd quarter, Wisconsin punted on 4th and 1 from their own 46 yard line. This decision is bad enough, but it was even worse when you consider that their previous punt had traveled 12 yards because of the wind. 12 yards! Why are you punting on 4th and 1 after that? But luckily for Wisconsin, the wind must have died down because they booted a 53 yard punt that took a very fortunate bounce and went out of bounds at the 1 yard line.

Then things got very interesting in the 4th quarter.

4th Down Decisions in the 4th Quarter

| Team | Time Left | 4th Down Distance | Yards to End Zone | Calculator Decision | Coach Decision | Win Probability Go For It | Win Probability Kick (Punt or FG) |
|---|---|---|---|---|---|---|---|
| Wisconsin | 6:17 | 2 | 24 | Go for it | FG | 81% | 77% |
| Wisconsin | 1:26 | 4 | 21 | FG | FG | 56% | 64% |
| Nebraska | 1:03 | 5 | 26 | Punt | Punt | 47% | 51% |

Leading by 3, Wisconsin decided to kick a field goal on 4th and 2 with 6 minutes left. They actually should have gone for it. Big Ten kickers make 42 yard field goals about 68% of the time, whereas converting on 4th and 2 succeeds about 60% of the time. The chances of converting are slightly smaller, but that is outweighed by the fact that if you convert, you'll be able to run more time off the clock and you can still score a touchdown to make it a 2 score game. And keep in mind that this doesn't even take into account the "3 point lead is better than a 6 point lead" situation we just saw. So the difference is much larger than displayed above. Wisconsin clearly should have gone for it.

But Wisconsin made the field goal, and Nebraska quickly scored a go-ahead touchdown to go up 1 point. And then Wisconsin gave us a great example of why it's better to be up 3 points than 4, 5, or 6. Starting at their own 9 yard line, Wisconsin was able to move to the Nebraska 27 yard line, calling 4 passes and 2 rushes. But then Wisconsin called 3 straight running plays, including one on 3rd and 7. Wisconsin got into field goal range, became ultra conservative, and decided they were fine with a 40+ yard field goal. It's almost as if coaches don't realize that the closer you are to the end zone, the better chance your kicker has of actually making the field goal. Especially college kickers. This is bad enough when the field goal is for the lead. But it's even worse when you're setting up a game-tying field goal instead of trying to win in regulation with a touchdown. And that's exactly why you'd rather be up 3 points than 4, 5, or 6. And Wisconsin did its part to support my narrative, missing the field goal and giving the ball back to Nebraska.

But the breaks kept coming for Wisconsin (or rather, maybe they were going against Nebraska). With all 3 time outs left, Wisconsin was able to make Nebraska punt, and they drove down the field and made the game-winning field goal as time expired. Oh Nebraska, if only you could get that punt on 4th and 1 in the first quarter back.   

Michigan State 31 - Rutgers 24

For the 2nd straight week, Michigan State flirted with the possibility of a SPARTY NO! But for the 2nd straight week, it was narrowly avoided.

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
|---|---|---|---|---|---|---|
| Michigan St | 5 | 1 | 2 | 2 | 1 | 0.11 |
| Rutgers | 6 | 0 | 5 | 0 | 1 | 0 |

The decision making was solid in this game. Well, until the end, but let's not get ahead of ourselves. The only disagreement came when Michigan State went for it on 4th and 10 from the Rutgers 34 yard line. The calculator suggests kicking a 51-yard field goal. However, in his 3 years at Michigan State, kicker Michael Geiger has never attempted a 50+ yard field goal. So let's assume that is out of his range. The calculator then suggests punting, but you'll see the difference between punting and going for it is only 0.11 points. So there is no issue if the coach wants to go for it. And it paid off for Michigan State, as they converted and scored a touchdown two plays later.

Rutgers had their own 4th down conversion that led to a touchdown. Late in the 3rd quarter they had a 4th and 1 at the Michigan State 29. It doesn't matter where you are on the field, you should always go for it on 4th and 1. Rutgers did, they converted, and on the very next play they scored a touchdown.

4th Down Decisions in the 4th Quarter

| Team | Time Left | 4th Down Distance | Yards to End Zone | Calculator Decision | Coach Decision | Win Probability Go For It | Win Probability Kick (Punt or FG) |
|---|---|---|---|---|---|---|---|
| Michigan State | 12:22 | 6 | 44 | Punt | Punt | 77% | 80% |
| Rutgers | 7:15 | 6 | 44 | Go for it | Go for it | 23.5% | 23.2% |
| Rutgers | 4:21 | 4 | 4 | Go for it | FG | 43% | 35% |

With 12:22 left in the game, Michigan State correctly punted on 4th and 6 from the Rutgers 44 yard line. They downed the ball at the Rutgers 5 yard line, where the Scarlet Knights began an incredible drive. In the middle of the drive, they had a 4th and 6 from the Michigan State 44. The stats say to go for it, but it's so close that either decision is really fine. But considering Rutgers was a 2 touchdown underdog, I think the more aggressive call was the correct one. And that's exactly what Rutgers did, going for and converting the 4th down. But then Rutgers made the worst decision the 4th down calculator has seen yet. With 4th and goal from the Michigan State 4, Rutgers kicked a field goal to tie instead of going for the win. This decision lowered their win probability by 8%! By kicking, Rutgers needed all of the following to happen:

  1. Make the field goal (likely, but not automatic)
  2. Stop Michigan State from scoring in regulation
  3. Win in overtime

The third item on the list is really the biggest. Too often we equate tying the game to taking the lead. But the latter is really so much more valuable. By going for it on 4th down, Rutgers had the opportunity to take the lead. And even if they failed, Michigan State would be starting at their own 4 yard line. There was a good chance Rutgers would get the ball back with another chance to tie or win in regulation. But instead, they decided to tie the game, and we all know how the rest of the game went.

Summary

Each week, I’ll summarize the times coaches disagreed with the 4th down calculator and the difference in expected points between the coach’s decision and the calculator’s decision. I’ll do this only for the 1st 3 quarters since I’m tracking expected points and not win probability. I also want to track decisions made on 4th and 1, and decisions made between midfield and the opponent’s 35 yard line. I’ll call this area the “Gray Zone.” These will be pretty sparse now, but will fill up as the season goes along. Then we can easily compare the actual outcomes of different decisions in similar situations.

Team Summary

| Team | Number of Disagreements | Total Expected Points Lost |
|---|---|---|
| Northwestern | 5 | 3.22 |
| Indiana | 3 | 2.56 |
| Minnesota | 4 | 2.07 |
| Nebraska | 3 | 1.38 |
| Iowa | 1 | 1.27 |
| Illinois | 3 | 1.19 |
| Wisconsin | 3 | 1.18 |
| Michigan | 2 | 1.16 |
| Ohio State | 3 | 0.92 |
| Penn State | 1 | 0.8 |
| Michigan State | 3 | 0.62 |
| Rutgers | 1 | 0.3 |
| Purdue | 1 | 0.24 |
| Maryland | 0 | 0 |

 

4th and 1

| Yards To End Zone | Punts | Average Next Score After Punt | Go for It | Average Next Score After Go for It | Field Goals | Average Next Score After FG |
|---|---|---|---|---|---|---|
| 75-90 | 1 | 7 | 0 | 0 | * | * |
| 50-74 | 8 | 0 | 3 | 4.7 | * | * |
| 25-49 | 0 | 0 | 3 | 1.33 | 1 | -7 |
| 1-24 | * | * | 5 | 2 | 2 | 3 |

 

The Gray Zone (4th downs 25-50 yards to the end zone)

| 4th Down Distance | Punts | Average Next Score After Punt | Go for It | Average Next Score After Go for It | Field Goals | Average Next Score After FG |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 3 | -1 | 0 | 0 |
| 2-5 | 9 | 1.22 | 6 | -1.67 | 2 | -2 |
| 6-9 | 8 | 0.625 | 2 | -2 | 4 | 2.5 |
| 10+ | 9 | 0.67 | 1 | 7 | 5 | 1.2 |

 

Statistical Analyses of the House Freedom Caucus and the Search for a New Speaker

With Speaker John Boehner resigning, Kevin McCarthy quitting before the vote for him to be Speaker, and a possible government shutdown in the works, the Freedom Caucus has certainly been in the news frequently! Depending on your political bent, the Freedom Caucus has caused quite a disruption for either good or bad. 

Who are these politicians? The Freedom Caucus is a group of approximately 40 Republicans in the U.S. House of Representatives. You may also know this group as the "Hell No" caucus, and they are a key part of the fractured Republican House. In all of the articles and blogs I've read, they are described as an extremely conservative, far-right group. This extreme conservatism is generally considered to be their defining characteristic.

However, in the Republican presidential race, we’ve seen that the usual debate over the candidates’ conservative credentials has been overshadowed by the outsiders. In other words, there’s an assessment of each candidate’s conservativeness as well as their establishmentarianism.

Is there evidence that an establishment/anti-establishment split is also a factor among the Republicans in the House of Representatives and their search for a new Speaker of the House? In this blog post, I’ll use data and statistical analyses to test these hypotheses!

Data for these Analyses

I obtained the data for these analyses from voteview.com. This group runs an algorithm that uses roll call votes to estimate each politician’s conservativeness and their support of the party establishment. I added a variable that identifies Freedom Caucus membership using the information in this Wikipedia article.

For these data, higher conservative scores indicate that the politician is more conservative. Higher establishmentarianism scores indicate that the politician is more supportive of the establishment while lower scores indicate an anti-establishment position.

Scatterplot of the House Republicans

Graphing the data is always a good place to start for any analysis. The scatterplot below displays a point for each Republican member of the House by their Establishment and Conservativeness scores. The data points that are further right are more conservative. The points that are closer to the bottom are more anti-establishment. Red points identify members of the Freedom Caucus.

[Figure: Scatterplot of House Republicans]

The graph shows that not all members of the Freedom Caucus are extremely conservative. Some are right in the middle! However, all members of the Freedom Caucus are at least on the right half of the graph. These members are also in the bottom, anti-establishment half, which keeps the door open for the hypothesis that we'll test. 

Binary Logistic Regression

Let’s test this formally with statistics. To do this, I’ll use binary logistic regression in Minitab statistical software because the response variable is binary. The Republican House members can only either belong to the Freedom Caucus (Yes) or not (No).

[Figure: Response information table for binary logistic regression]

The Response Information table displays general information about the analysis. There are 36 members of the Freedom Caucus out of 247 House Republicans in the analysis.

[Figure: Deviance table for binary logistic regression]

The Deviance Table is like the ANOVA table in a linear regression analysis. This table shows us that both the Conservativeness and Establishmentarianism of the politicians are very statistically significant (p = 0.000). We can conclude that changes in the values of these two predictors are associated with changes in the probability that a politician is a member of the Freedom Caucus.

The interaction between the two predictors is insignificant and I did not include it in the final model.
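As a rough illustration of what binary logistic regression does under the hood, here is a from-scratch fit on synthetic data shaped like the voteview scores. The data and the "true" coefficients are invented for this sketch; Minitab's fitted model is what the output above actually reports.

```python
import math
import random

random.seed(1)

# Synthetic stand-in for the voteview scores: membership odds rise with
# conservativeness (c) and fall with establishmentarianism (e). The
# coefficients -2, 3, -3 are invented for the sketch.
def simulate(n=400):
    data = []
    for _ in range(n):
        c = random.uniform(-1, 1)
        e = random.uniform(-1, 1)
        logit = -2.0 + 3.0 * c - 3.0 * e
        member = 1 if random.random() < 1 / (1 + math.exp(-logit)) else 0
        data.append((c, e, member))
    return data

def fit_logistic(data, lr=0.5, iters=1500):
    """Maximum-likelihood logistic regression via plain gradient ascent."""
    b0 = bc = be = 0.0
    n = len(data)
    for _ in range(iters):
        g0 = gc = ge = 0.0
        for c, e, y in data:
            p = 1 / (1 + math.exp(-(b0 + bc * c + be * e)))
            g0 += y - p
            gc += (y - p) * c
            ge += (y - p) * e
        b0 += lr * g0 / n
        bc += lr * gc / n
        be += lr * ge / n
    return b0, bc, be

b0, bc, be = fit_logistic(simulate())
print(f"intercept={b0:.2f}  conservativeness={bc:.2f}  establishment={be:.2f}")
```

The fitted signs match the story in the Deviance Table: a positive coefficient on conservativeness and a negative one on establishmentarianism.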

Graph the Results to Understand the Binary Logistic Regression Model

The easiest way to understand these results is to graph them. When you fit a variety of model types in Minitab 17, the analysis stores that model in the worksheet. You can then use a variety of handy features to quickly and easily explain what your model really means.

The graph below displays the probabilities associated with the values of the two predictors. The highest probabilities for Freedom Caucus membership are in the bottom right for politicians who are both very conservative and very anti-establishment.

[Figure: Contour plot of the probability of belonging to the Freedom Caucus]

In the main effects plot below, Minitab graphs the effect of each variable independently while the other variable is held constant.

[Figure: Main effects plot of conservativeness and establishmentarianism]

On the Conservativeness side, the graph shows that as a politician becomes more conservative (by moving right), their probability of membership in the Freedom Caucus increases. In fact, the probability really starts to shoot up fast around a score of 0.5. On the Establishmentarianism side, as a politician becomes more anti-establishment (by moving left), their probability of Freedom Caucus membership also increases at an increasing rate.

Collectively, the statistical analyses show that membership in the Freedom Caucus is not as simple as being on the far right end of the political spectrum. Instead, this group has a mixture of very conservative and anti-establishment sentiment driving their actions. Understanding this multidimensional fracture in the Republican Party helps explain why it is so difficult to form a more cohesive caucus and to choose a new Speaker of the House.

When Kevin McCarthy refused to run for Speaker, many called on Paul Ryan as the ideal candidate to unify the House Republicans. Although Ryan appears to have declined this call to duty, he provides a notion of what the ideal Speaker looks like in this new environment.

To compare McCarthy to Ryan across both characteristics, I standardized their raw scores to account for any differences in the scaling of the two variables. The table shows their Z-values, which is the number of standard deviations that each politician falls from the House Republican mean for each variable.

|  | Conservatism | Establishmentarianism |
|---|---|---|
| McCarthy | -0.169 | 0.549 |
| Ryan | 0.496 | -1.180 |
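The Z-values in the table are ordinary standardized scores. A minimal sketch (the raw score, mean, and standard deviation here are made-up stand-ins, since the post doesn't report the House Republican averages):

```python
def z_score(x, mean, sd):
    """Standard deviations above (+) or below (-) the group mean."""
    return (x - mean) / sd

# Made-up numbers: a raw score of 0.6 against a group mean of 0.5
# with a standard deviation of 0.2 standardizes to z = 0.5.
print(z_score(0.6, 0.5, 0.2))
```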

Compared to McCarthy, Ryan has a moderately more conservative score, but he is notably more anti-establishment. This larger difference indicates which way the political winds are blowing!

Troy Aikman or Joe Montana Might be the Best Super Bowl Quarterback Ever

Part of the fun of sports statistics is that many of the questions are unanswerable in an absolute sense. After a project to improve patient satisfaction scores at a hospital, you can and should measure patient satisfaction scores to determine whether the increase met your goals. In sports, we don’t have exact comparisons from player to player, especially when their careers don’t even overlap. Thus, it wasn’t surprising that some people disagreed with my case that Tom Brady is the Best Super Bowl Quarterback Ever. To challenge Brady’s anointment, two suggested follow-up analyses are A) better quarterbacks should win Super Bowls by larger margins, and B) better quarterbacks should outperform their competition. Let’s take a look at these statistics and see what happens.

Margin of Victory

In my initial analysis, I used a very aggressive standard for Super Bowl victories to winnow the field of quarterbacks in consideration. Thus, we’re excluding some quarterbacks with extraordinary performances in the Super Bowl like Phil Simms, Jim Plunkett, and Kurt Warner. The group we’re looking at is Terry Bradshaw, Troy Aikman, Tom Brady, and Joe Montana. Here are the margin of victory statistics in an individual value plot:

[Figure: Troy Aikman has the highest median margin of victory.]

By looking at the median margins of victory, Troy Aikman leads this category, but I think it reveals a problem with using margin of victory by itself as a measure of a quarterback’s success in the Super Bowl. Dallas forced 9 turnovers in Super Bowl XXVII against an overmatched Buffalo team that lost Jim Kelly in the second quarter, and would have scored enough points to win even if Aikman had thrown 0 touchdown passes. I originally made the argument that margin of victory should count against a quarterback, but let’s see if we can get a more detailed view by including more statistics.

The 3-D scatterplot shows three variables:

  • Spread: The expected margin of victory before the Super Bowl. Positive point spreads indicate that the underdog won.
  • QBR difference: The difference between the winning quarterback's rating and the average rating of opposing quarterbacks for the losing team during the playoffs.
  • Margin of victory: The difference between the scores of the winning and losing teams.

The legend shows that I used different symbols for these events:

  • Favorite: The quarterback’s team was expected to win the Super Bowl.
  • Push: The winning and losing teams were considered equally likely to win the Super Bowl.
  • Underdog: The quarterback’s team was expected to lose the Super Bowl.

[Figure: Of these 4, only Tom Brady has won when his team was not the favorite.]

[Figure: No points are high in all three variables.]

In this analysis, I would say that high margins of victory are good, higher differences are good, and higher spreads are good. Interestingly, there are no performances where all 3 variables are high. The victories by Aikman and Montana have the highest margins of victories, but the spreads indicate that those games were not supposed to be competitive. Brady has won the only Super Bowl where any of these quarterbacks was an underdog, but the difference between his passer rating and what the defense allowed is not as high as many other Super Bowls. Terry Bradshaw was the quarterback in the only 2 games where the difference between his rating and the opponent’s allowed rating was greater than 60, but in those games, the margin of victory was only 4.

Outplaying the Losing Quarterback

Let’s also consider the relative performance of the winning quarterback to the losing quarterback and see whether it makes a difference. The raw data look like this. I’ve added in labels to show the losing quarterback who threw the most passes in the game. Asterisks indicate players who do not have 1,500 career passing attempts.

[Figure: Joe Montana exceeds the quarterbacks he beat in the Super Bowl by the most, on average.]

The median difference between Super Bowl-winning quarterbacks is highest for Joe Montana, who also has an impressive list of opponents. Because of the different quality of opponents, it’s probably not fair to look only at the difference between the winning and losing quarterbacks. After all, beating Ken Anderson by 4.8 points might be more impressive than beating Frank Reich by 80.3. As a measure of quality of the opponent, I want to include career quarterback rating in the analysis. League average quarterback ratings increased by about 0.5 every year between 1940 and 2007, so I’m making an adjustment to the career quarterback ratings of the losers.

Adjusted QBR = Losing quarterback’s career QBR – 0.5*(Year of Super Bowl – 1970)
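The adjustment above is easy to sanity-check in code; the example rating and year below are made up for illustration:

```python
def adjusted_career_qbr(career_qbr, super_bowl_year):
    """Era adjustment from the post: league-average ratings rose roughly
    0.5 points per year, so later careers are discounted accordingly."""
    return career_qbr - 0.5 * (super_bowl_year - 1970)

# Made-up example: a career rating of 80.3 in a 1990 Super Bowl
# adjusts to 80.3 - 0.5 * 20 = 70.3.
print(adjusted_career_qbr(80.3, 1990))
```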

It’s also worth considering how well the losing quarterback played. While Elway and Tarkenton had careers that got them to the hall of fame, their Super Bowl efforts from the chart above both resulted in quarterback ratings below 20. Outplaying a quarterback with a rating below 20 is something Jameis Winston has done every game of his career so far.

This 3-D scatterplot shows these variables:

  • Win – Loss QBR: The difference between the winning and losing quarterback ratings for that Super Bowl
  • Passer Rating Loser: The passer rating of the quarterback on the losing team
  • Adjusted Career QBR Loser: The career quarterback rating of the losing quarterback

Where there are labels, they give the name of the losing quarterback and the number of the Super Bowl.

[Figure: Each symbol represents a different quarterback.]

[Figure: No points are high in all 3 variables.]

The ideal performance would be high in every variable and, again, there are no points clearly in that position. Montana and Bradshaw both outplayed quarterbacks by huge margins, but on days when those quarterbacks posted incredibly low ratings. Tom Brady played against quarterbacks who posted the highest passer ratings in their Super Bowls, but both those quarterbacks had higher passer ratings than he did. Terry Bradshaw’s victories over Roger Staubach turn out to be the most impressive victories in terms of the other quarterback’s career, but he didn’t outplay Staubach by as much as Montana outplayed Marino.

Conclusion?

The highest median margin of victory belongs to Troy Aikman. The highest median difference between the winning and losing quarterbacks belongs to Joe Montana. I still contend that looking at these statistics by themselves wouldn’t let us make an easy decision about who the best Super Bowl quarterback is. Posting a better passer rating than John Elway sounds great, but it’s not as impressive when Elway’s passer rating is below 20. Winning the Super Bowl by 35 is impressive, but less so when it’s against a Buffalo team that wasn’t expected to be competitive before they lost their starting quarterback. Fortunately, Minitab gives us the power to do a multivariate analysis that considers all of these variables simultaneously. We’ll look at that type of analysis next time.

We did a lot with the 3D scatterplots today. If you’re ready for more, check out What is a 3D scatterplot? in the Minitab Support Center.

Big Ten 4th Down Calculator: Week 4

Week 4 in the Big Ten featured a couple blowouts, an insane comeback, and the worst punt in the history of recorded time. But before we get to all that, here's my weekly blurb on what exactly the 4th down calculator is. 

I've used Minitab Statistical Software to create a model to determine the correct 4th down decision. And for the rest of the college football season, I'll use that model to track every 4th down decision in Big Ten Conference games. However, the decision the calculator gives isn’t meant to be written in stone. In hypothesis testing, it’s important to understand the difference between statistical and practical significance. A test that concludes there is a statistically significant result doesn’t imply that your result has practical consequences. You should use your specialized knowledge to determine whether the difference is practically significant.

Apply the same line of thought to the 4th down calculator. Coaches should also consider other factors, but the 4th down calculator still provides a starting point for the decision making!

I'll break the analysis for each game into two sections: 4th down decisions in the first 3 quarters, and 4th down decisions in the 4th quarter. In the first 3 quarters, coaches should try to maximize the points they score. But in the 4th quarter, they should maximize their win probability. To calculate win probability, I’m using this formula from Pro Football Reference.
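For the curious, win probability models of this kind are typically built on a normal approximation of the final scoring margin. The sketch below shows the general shape of such a formula; the standard deviation constant and the exact functional form are my assumptions, not Pro Football Reference's published model.

```python
import math

def win_probability(lead, pregame_margin, frac_remaining, sd_full_game=13.45):
    """Sketch of a normal-approximation win probability. The constants and
    functional form are assumptions for illustration only."""
    if frac_remaining <= 0:
        return 1.0 if lead > 0 else (0.5 if lead == 0 else 0.0)
    # Expected final margin: current lead plus the pregame expectation,
    # scaled by how much of the game is left to play.
    mu = lead + pregame_margin * frac_remaining
    # Uncertainty about the final margin shrinks as the clock runs down.
    sigma = sd_full_game * math.sqrt(frac_remaining)
    # P(final margin > 0) under a normal distribution.
    return 0.5 * math.erfc(-mu / (sigma * math.sqrt(2)))

# A 3-point lead is worth more with 10% of the game left than with 90% left.
print(win_probability(3, 0, 0.10), win_probability(3, 0, 0.90))
```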

Wisconsin 24 - Purdue 7

While this was not the most entertaining of games, it did give us something the Big Ten 4th down calculator has yet to see.

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
|---|---|---|---|---|---|---|
| Purdue | 6 | 0 | 5 | 0 | 1 | 0 |
| Wisconsin | 3 | 0 | 2 | 1 | 0 | 0 |


For the first time ever, the calculator didn't disagree with a single 4th down decision made by either team. Of course, that doesn't give me much to talk about here, so I'll point out the one spot where Purdue decided to go for it. They had a 4th and 1 on the Wisconsin 20 yard line. Announcers say all the time that you have to "take the points" in this situation (even though Big Ten kickers make a 37 yard field goal only 75% of the time, so there is no guarantee of points either way). But kicking a field goal here is a really bad decision. On average, you'll score about one additional point by going for it over kicking.

But it's actually even more than that.

When calculating the expected value on 4th down, the calculator assumes you only gain the yardage needed to get the 1st down. But when a team converts they'll usually gain more yards than they need, improving their field position and increasing their expected points even more than the calculator accounts for. This makes the case for going for it even stronger. Luckily for Purdue, they made the correct decision, gaining 4 yards on 4th and 1. And five plays later, they were in the end zone for the touchdown.

"Take the points," indeed.

4th Down Decisions in the 4th Quarter

| Team | Time Left | 4th Down Distance | Yards to End Zone | Calculator Decision | Coach Decision | Win Probability Go For It | Win Probability FG |
|---|---|---|---|---|---|---|---|
| Wisconsin | 7:16 | 1 | 1 | Go for it | Go for it | 99.7% | 99.6% |

There is only one 4th down decision in the 4th quarter worth talking about. Wisconsin was up by 10 and had a 4th and goal at the Purdue 1 yard line. The win probability was so high for Wisconsin that the values for kicking a field goal and going for it were almost identical. But when we apply our "specialized knowledge" to the situation, going for it was clearly the correct decision.

First, scoring a touchdown forces Purdue to score 3 times. Kicking a field goal keeps the number of times they have to score at 2. And second, consider the situation where Wisconsin fails on 4th and 1, and Purdue ends up driving 99 yards and scoring a touchdown. It's now a 3-point game. Compare that to the situation where Wisconsin kicks a field goal, then Purdue scores a touchdown, making it a 6 point game. Last week we saw that late in games, having a 3 point lead is actually better than a 4, 5, or 6 point lead. So Wisconsin was correct to go for it, and they ended up scoring a touchdown, effectively ending the game.

Iowa 40 - Northwestern 10

Don't look now, but Iowa is undefeated and doesn't have a team with fewer than 3 losses remaining on their schedule.

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
|---|---|---|---|---|---|---|
| Iowa | 5 | 2 | 4 | 1 | 0 | 0.53 |
| Northwestern | 8 | 2 | 7 | 1 | 0 | 1.01 |


After last week, Northwestern was leading the Big Ten in expected points lost due to their 4th down decision making. And things didn't change this week, as for the 3rd week in a row the Wildcats left over a point on the table. Their worst decision was kicking a field goal on 4th and goal from the 3 yard line. Anytime you're close to the end zone, your goal should be touchdown or bust. Because even if you end up turning the ball over on downs, you're still more likely to be the next team to score due to your opponent's terrible field position. Northwestern left 0.83 points on the field by deciding to kick a field goal. And at the time they were only down 16-7, so the game was still in doubt!

4th Down Decisions in the 4th Quarter

| Team | Time Left | 4th Down Distance | Yards to End Zone | Calculator Decision | Coach Decision | Win Probability Go For It | Win Probability Punt |
|---|---|---|---|---|---|---|---|
| Northwestern | 13:06 | 7 | 44 | Punt | Punt | 20.4% | 21.1% |

Northwestern was still down 16-10 when they found themselves in a 4th and 7 at the Iowa 44 yard line. The calculator agreed with Northwestern's decision to punt, but it was pretty close. Unfortunately for Northwestern, the punt sailed into the end zone for a touchback. Had we been able to predict the future and known the punt would be a touchback, the win probability for a punt would have dropped to 18%, and we would have suggested going for it. But we can't predict the future, so punting was still the correct decision. It was after this punt that things unraveled for Northwestern. Iowa went on an 80 yard touchdown drive, Northwestern lost a fumble on their next offensive play, Iowa scored another touchdown, and the rout was on.

Nebraska 48 - Minnesota 25

Hey, look at that! Nebraska didn't lose a close game!

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
|---|---|---|---|---|---|---|
| Nebraska | 3 | 1 | 1 | 2 | 0 | 0.21 |
| Minnesota | 5 | 2 | 5 | 0 | 0 | 0.75 |

Want to avoid playing in a close game? Just have three 4th downs in the entire first three quarters. Nebraska was too busy scoring touchdowns to be bothered with 4th down decisions. The one decision they did miss was kicking a field goal on 4th and 3 from the Minnesota 14 yard line. It's close, but in the long run you'll score more points by going for it. And to help support my argument, Nebraska missed the field goal. Luckily for them, it did not matter.

Early in the game, Minnesota punted on 4th and 2 from their own 14 yard line. The calculator says to go for it, but the difference in expected points is only 0.19, and the consequences are disastrous if you fail. So punting isn't terrible in that situation. But later, Minnesota punted on 4th and 1 from their own 45 yard line. That decision cost them over half a point, and as the final score indicates, they desperately needed points.

Nebraska led comfortably the entire 4th quarter, so we'll move on to the next game.

Ohio State 38 - Penn State 10

After a number of underwhelming performances, Ohio State finally had a convincing victory.

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
|---|---|---|---|---|---|---|
| Penn State | 8 | 2 | 6 | 1 | 1 | 1.21 |
| Ohio State | 5 | 0 | 5 | 0 | 0 | 0 |

Ohio State actually had one "disagreement" with the calculator. On 4th and 9 from the Penn State 34 yard line, they punted when the calculator would have kicked a 51 yard field goal. However, Buckeye kicker Jack Willoughby has never attempted a 50+ yard field goal, and is 0 for 3 from 40-49 yards. Assuming a 51-yard field goal is out of his range, the correct decision was to punt, so I did not count that decision against Ohio State.

Penn State found itself in a similar situation. On a 4th and 8 from the Ohio State 31 yard line, they went for it when the calculator suggests kicking a 48 yard field goal. Penn State kicker Joey Julius is 1 for 2 on 40-49 yard field goals this year, and James Franklin has also had him attempt a 50 yard field goal (the snap was fumbled, so he never kicked it). So it seems as if Julius has the range. As a big underdog against the #1 team in the country, Penn State should have implemented an aggressive strategy, so if the difference in expected points had been close, there would have been nothing wrong with going for it. But the difference in expected points was 0.65. With such a large gap, Penn State should have kicked the field goal.

And in the game, the result ended up being worse than the decision. Nittany Lion quarterback Christian Hackenberg was sacked on the play, and could clearly be seen limping the rest of the game. A hobbled quarterback probably hurt Penn State's chances of winning more than any 4th down decision could have.

But all was not lost for Penn State. Late in the 3rd quarter they were only down 11 points, but had a 4th and 1 from their own 45 yard line. They decided to punt when they should have gone for it. And this decision is even worse when you factor in the specialized knowledge Franklin had that the calculator doesn't. On all punts, the calculator assumes a net of 40 yards (the Big Ten average). But Penn State may in fact have the worst punters in the Big Ten. Earlier in the game, Penn State had punts of 28 yards, 32 yards, and 29 yards. And these weren't because of returns by Ohio State. They were going out of bounds. Punting on 4th and 1 is bad enough with an average punter, but it gets even worse when you're only gaining 30 yards of field position. And sure enough, their punt on 4th and 1 traveled 30 yards and went out of bounds.

In the 4th quarter, Penn State did correctly go for it on a 4th and 2 instead of kicking a field goal when they were down by two touchdowns. But they failed to convert, then Ohio State scored a touchdown, and the game got out of reach.

Rutgers 55 - Indiana 52

Wait, this game had 107 points...in regulation? Are we sure this was a Big Ten game?

4th Down Decisions in the First 3 Quarters

Team | 4th Downs | Disagreements with the 4th Down Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost
Rutgers | 7 | 1 | 5 | 0 | 2 | 0.79
Indiana | 5 | 1 | 1 | 1 | 3 | 0.18


Before the season, I wrote about how Indiana should never punt. "Never" was a little tongue in cheek, but they have good reason to be the most aggressive Big Ten team when it comes to going for it on 4th down. And it looks like they listened this game, as they went for it 3 times and punted only once. Indiana went for it on:

  • 4th and 3 from the Rutgers 35 yard line (gained 33 yards and scored a touchdown the very next play)
  • 4th and 1 from the Rutgers 40 yard line (did not convert, but the next score was still an Indiana touchdown)
  • 4th and 3 from the Rutgers 37 yard line (did not convert, and the next score was a Rutgers touchdown)

The 4th down calculator agreed with all 3 of those decisions. The only decision it did not agree with was kicking a field goal on 4th and 6 from the Rutgers 9 yard line. But you'll see the difference in expected points is only 0.18. With a difference that small, kicking the field goal wasn't too bad of a decision.  

Rutgers, on the other hand, made a puzzling decision. On 4th and 8 from the Indiana 27 yard line, they went for it instead of kicking a field goal. Scarlet Knight kicker Kyle Federico is 9 for 16 (56%) from 40-49 yards, which isn't too different from the 63% rate at which Big Ten kickers make a 44 yard field goal. And 4th and 8 is only converted 32% of the time. I get that Indiana has a bad defense, but the difference in expected points was 0.79, and I don't think the Hoosier defense is bad enough to make going for it the correct decision. The gamble paid off for Rutgers, though, as they gained 15 yards on the play and scored a touchdown on the drive.
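The expected-points arithmetic behind a call like this can be sketched in a few lines of Python. The 32% conversion rate and 63% field goal rate come from the numbers above; the expected-points values after each outcome are invented placeholders, so this illustrates the method rather than the calculator's actual model.

```python
# Hedged sketch of an expected-points comparison for 4th and 8 from the
# opponent's 27. The two probabilities are from the post; the post-outcome
# expected-points values are invented for illustration.
p_convert = 0.32   # 4th-and-8 conversion rate (from the post)
p_fg = 0.63        # Big Ten make rate on a 44-yard field goal (from the post)

ep_after_success = 3.9   # illustrative: first down deep in opponent territory
ep_after_failure = -0.6  # illustrative: opponent takes over on downs
ep_after_miss = -0.9     # illustrative: opponent gets favorable field position

ep_go = p_convert * ep_after_success + (1 - p_convert) * ep_after_failure
ep_fg = p_fg * 3 + (1 - p_fg) * ep_after_miss

print(f"go for it: {ep_go:.2f}, field goal: {ep_fg:.2f}")
```

Even with these made-up downstream values, the field goal comes out well ahead, which is consistent with the calculator's conclusion.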

So through 3 quarters, Indiana only had 1 punt and was up 52-33. The strategy to never punt seemed to be doing well. Would it continue in the 4th quarter?

4th Down Decisions in the 4th Quarter

Team | Time Left | 4th Down Distance | Yards to End Zone | Calculator Decision | Coach Decision | Win Probability Go For It | Win Probability Punt
Indiana | 15:00 | 3 | 46 | Punt | Punt | 96.5% | 97.1%
Indiana | 6:29 | 4 | 69 | Punt | Punt | 45% | 47.9%

Remember when I said saying Indiana should "never" punt was tongue in cheek? Can I take that back? There were only two 4th downs the entire 4th quarter. Both resulted in Indiana punts, and both went poorly for the Hoosiers. The first occurred at the start of the 4th quarter. Indiana was up 52-33 and punted on 4th and 3 from the Rutgers 46. The snap went over the punter's head, and Rutgers scooped it up and scored a touchdown.

Now the win probability has punting as the correct decision, but it's very close. And remember that these numbers are for your "average" Big Ten team. The Indiana defense is last in the Big Ten in both points allowed per game and yards allowed per game. And their offense is 3rd in points scored per game and 1st in yards gained per game. It's easy to say after we know that they blew a 19 point lead, but Indiana should have continued their aggressive strategy and gone for it here.

The second punt is a much harder decision. With the score tied at 52, Indiana had a 4th and 4 from their own 31 yard line. The model says punting increased Indiana's win probability by about 3 percentage points. But this was a game that already had 100 points scored in it, and it really seemed like whoever had the ball last would win. And that's exactly what happened, as Rutgers drove down the field and kicked the game-winning field goal as time expired.

So does that mean Indiana made the wrong decision to punt? Not necessarily. It's hard to quantify how much the bad defense/good offense would impact the numbers. But they definitely impact them enough to warrant at least thinking about going for it here, and Indiana is probably the only Big Ten team you would say that for. It's too late to take this decision back, but Indiana's next three games are against undefeated Michigan State, undefeated Iowa, and 5-2 Michigan. I wasn't completely serious before, but I am now. Try an entire game without punting, Indiana. You have nothing to lose, and it just may result in pulling a huge upset. Go for it!

Michigan St 27 - Michigan 23

Maybe Michigan should try a game without punting too. 

4th Down Decisions in the First 3 Quarters

Team | 4th Downs | Disagreements with the 4th Down Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost
Michigan St | 5 | 2 | 3 | 0 | 2 | 1.42
Michigan | 6 | 3 | 4 | 2 | 0 | 1.43


Through the first 3 quarters, the 4th down calculator disagreed with half of Michigan's 4th down decisions. The first one was very minor, as they kicked a field goal on 4th and 5 from the Michigan State 20 when the calculator would have gone for it. However, the difference in expected points is only 0.04. So really, either decision is fine. But Michigan left some points on the table in the 3rd quarter. They punted on a 4th and 1 from their own 33, giving up 0.56 points. Then they kicked a field goal on 4th and goal from the 3, giving up another 0.83 points. In total, it's almost a point and a half that Michigan left on the field.

A close football game in the 4th quarter can often be decided by the bounce of a ball rather than the talent of either team. A fumble might bounce right back to the ball carrier or to the defense. A tipped pass might go right to a defender or fall safely to the ground. That's why the goal of every coach should be to make sure the score isn't close enough for those random bounces to matter. Michigan had chances in the 3rd quarter to put the game away. But instead, they decided to play it safe. And we all know how that ended up.

Meanwhile, the 4th down calculator is starting to notice a trend with Michigan State. They are consistently passing on 40+ yard field goals and going for it on 4th down instead. If the distance is short, the 4th down calculator fully supports this decision. But Michigan State keeps doing it on 4th and long. Against Michigan they passed on a 45 and a 49 yard field goal and instead went for it on 4th and 8 both times. 4th and 8 is such a hard distance to convert that the calculator prefers to kick a field goal. And Michigan State's field goal kicker is 12 for 17 (71%) for his career from 40-49 yards, which is actually much higher than the values the calculator uses (61% and 56%, respectively). Assuming there isn't anything wrong with their kicker, these decisions were very costly for the Spartans, giving up almost a point and a half. And the result didn't work out for them either: Michigan State failed to convert both 4th downs, and Michigan scored a touchdown on the very next drive both times.

4th Down Decisions in the 4th Quarter

Team | Time Left | 4th Down Distance | Yards to End Zone | Calculator Decision | Coach Decision | Win Probability Go For It | Win Probability Kick
Michigan St | 14:42 | 10 | 32 | Go for it | Go for it | 16.9% | 16.8% (FG)
Michigan | 12:05 | 9 | 44 | Punt | Punt | 84.3% | 88.4% (Punt)
Michigan St | 10:51 | 3 | 89 | Go for it | Punt | 7.2% | 6% (Punt)
Michigan | 9:25 | 2 | 20 | Go for it | FG | 93.8% | 92.7% (FG)
Michigan St | 6:41 | 10 | 40 | Punt | Punt | 28% | 29.6% (Punt)


This was a pretty wild 4th quarter. It started with Michigan State continuing their strategy of passing on 40+ yard field goals. Except this time because they were losing by 6 points, the win probability actually suggested going for it. But with the difference being one tenth of a percent, either decision really would have been fine. The Spartans went for it, but were unsuccessful, giving Michigan the ball back.

The Wolverines then had a 4th and 9 at the Michigan State 44. On the surface you might think that being aggressive and making it a two score game might be the correct decision. But 9 yards is just too far of a distance, and Michigan correctly punted.

This led to Michigan State having a 4th and 3 on their own 11 yard line. Yes, the field position is horrible, but there is more to consider here. Losing by 6 in the 4th quarter, odds are you're going to have to go for it on 4th down at some point. Too often coaches defer going for it until they're desperate, and end up facing a much longer distance than the one they passed on earlier. And that's exactly what happened here. Michigan State punted on 4th and 3, and ended up going for it on 4th and 19 at the end of the game. And yes, if they fail on 4th and 3, Michigan gets great field position. But Michigan was going to get great field position anyway! In fact, Michigan returned the punt to the Michigan State 28 yard line, so the punt gained only 17 yards of field position.

Michigan used that great field position to kick a field goal on 4th and 2. The stats say to go for it, but I doubt you'll ever find a coach (or announcer) who won't "take the points" and make it a 2 score game. Of course, kickers miss 37 yard field goals 25% of the time, and 9 and a half minutes is plenty of time for an opponent to score twice. That is exactly why the stats prefer going for it: it keeps the possibility of a touchdown open and burns more time off the clock.

Remember when I said 9 and a half minutes is plenty of time for a team to score twice? Well, it took all of 29 seconds for Michigan State to respond, scoring a touchdown in 2 plays and cutting the deficit to 2 points. After a correct Michigan punt, Michigan State found themselves in familiar territory: 4th and long in the "gray zone" (what I like to call the area between the opponent's 25 and 49 yard lines). But this time they punted, a decision the 4th down calculator supports.

The rest of the game was pretty boring, at least from a 4th down perspective. Michigan correctly punted on a 4th and 6, and Michigan State correctly went for it on a 4th and 19 (I bet a 4th and 3 on your own 11 yard line doesn't seem so bad now, does it?). Then, yeah, this happened.

Summary

Each week, I’ll summarize the times coaches disagreed with the 4th down calculator and the difference in expected points between the coach’s decision and the calculator’s decision. I’ll do this only for the first 3 quarters, since I’m tracking expected points and not win probability. I also want to track decisions made on 4th and 1, and decisions made between midfield and the opponent’s 25 yard line, an area I’ll call the “Gray Zone.” These tables will be pretty sparse now, but will fill up as the season goes along. Then we can easily compare the actual outcomes of different decisions in similar situations.

Team Summary

Team | Number of Disagreements | Total Expected Points Lost
Northwestern | 7 | 4.23
Minnesota | 6 | 2.82
Indiana | 4 | 2.74
Michigan | 5 | 2.59
Michigan St | 5 | 2.04
Penn St | 3 | 2.01
Iowa | 3 | 1.80
Nebraska | 4 | 1.59
Illinois | 3 | 1.19
Wisconsin | 3 | 1.18
Rutgers | 2 | 1.09
Ohio St | 3 | 0.92
Purdue | 1 | 0.24
Maryland | 0 | 0

 

4th and 1

Yards to End Zone | Punts | Average Next Score After Punt | Go for It | Average Next Score After Go for It | Field Goals | Average Next Score After FG
75-90 | 1 | 7 | 0 | 0 | * | *
50-74 | 11 | 0.64 | 3 | 4.67 | * | *
25-49 | 0 | 0 | 4 | 2.75 | 1 | -7
1-24 | * | * | 6 | 2.83 | 2 | 3

 

The Gray Zone (4th downs 25-49 yards to the end zone)

4th Down Distance | Punts | Average Next Score After Punt | Go for It | Average Next Score After Go for It | Field Goals | Average Next Score After FG
1 | 0 | 0 | 4 | 2.75 | 1 | -7
2-5 | 10 | 1.7 | 9 | -0.89 | 2 | -2
6-9 | 13 | 1.15 | 6 | -3 | 4 | 2.5
10+ | 14 | 0.14 | 1 | 7 | 6 | 1.5

 

The Longest Drive: Golf and Design of Experiments, Part 5

In Part 3 of our series, we decided to test our 4 experimental factors, Club Face Tilt, Ball Characteristics, Club Shaft Flexibility, and Tee Height in a full factorial design because of the many advantages of that data collection plan. In Part 4 we concluded that each golfer should replicate their half fraction of the full factorial 5 times in order to have a high enough power to detect improvement in the drive distance.

Now we're ready for Step 4 in our DOE problem-solving methodology: analyzing the data. Analysis and interpretation of the covariates and blocking variables will be an important part of our experiment to maximize drive distance.  

For each drive, we recorded the experimental factor settings, club speed, club / ball contact efficiency, and the golfer. Our analysis will focus first on removing the effect of the last three noise variables from the data so that we can get a clearer look at our 4 research variables. This post will focus on Analysis of Covariance (ANCOVA) and blocking variables.

Analysis of Covariance

A covariate is a continuous noise variable that has an impact on your response, but is not of research interest. Two quick examples:  

  1. When comparing the weight loss from two diet plans, the % body fat of each subject at the start of the experiment is a covariate because the pounds lost will be higher for those with a high initial % body fat for both diet plans.  

  2. When studying the effects of process settings of a water treatment for removing solids from waste water, the turbidity (cloudiness) of the treated water will be a function of the incoming water turbidity regardless of the treatment process.

Analysis of covariance removes the impact of covariates from the data so you can determine the effects of the experimental factors. Consider the scatterplot below, which shows two golfers’ Carry distance vs the club speed for that drive. There is a great deal of variability in the response around the regression line because of the variety of experimental factor settings used for each drive. Still, the impact of the club speed can be seen within this variability. The regression line for the impact of the club speed is:

[Fitted line plot of Carry distance vs. club speed]

The average club speed is 89.6 miles per hour (mph). Essentially, ANCOVA adjusts each data point down about 1.055 yards for each mph the club speed is above 89.6 mph and adjusts up for data points below the average of 89.6 mph. The final experiment analysis will be on the adjusted data points, not the original Carry values. By doing this, we can measure the effects of our 4 input variables while controlling for club speed and club / ball contact efficiency, our two covariates.
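Here is a rough sketch of that adjustment in Python. The slope of about 1.055 yards per mph and the 89.6 mph average club speed come from the study above; the individual drives are simulated for illustration.

```python
import numpy as np

# Hedged sketch of the ANCOVA adjustment described above. The slope
# (1.055 yd per mph) and the 89.6 mph average club speed come from the
# post; the simulated drives below are invented for illustration.
rng = np.random.default_rng(7)
club_speed = rng.normal(89.6, 3.0, size=40)
carry = 150.0 + 1.055 * club_speed + rng.normal(0.0, 8.0, size=40)

# Estimate the covariate slope by least squares
slope, intercept = np.polyfit(club_speed, carry, 1)

# Adjust every drive to the mean club speed: drives hit with above-average
# speed are adjusted down, below-average drives are adjusted up
adjusted = carry - slope * (club_speed - club_speed.mean())

# After adjustment, carry is uncorrelated with club speed, so the
# experimental factors can be analyzed without that noise
print(np.corrcoef(club_speed, adjusted)[0, 1])
```

The adjusted values have the same mean as the raw data, but the club-speed effect has been removed, which is exactly what the final analysis of the four factors needs.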


I have used ANCOVA to study the effects of process parameter settings on the final flatness of silicon disks from a machining process, while controlling for the flatness of the blanks coming into the process. I have also used ANCOVA to learn the effects of curriculum characteristics on the test scores of high school students, while controlling for the students’ past academic performance and years of schooling of their parents. It is a powerful technique that allows us to make apples-to-apples comparisons with data that would otherwise bias the results.

This analysis can be done in Minitab by simply double-clicking the appropriate column(s) into the Covariate dialog box. It’s just that easy.

Blocking Variables  

A blocking variable is a categorical noise variable that affects your response but is not of research interest. The dotplot below compares the Carry distance of each of the four golfers in our study. It shows a great deal of golfer-to-golfer variability even though they each tested equivalent settings of all four experimental factors. How are we going to combine this data and analyze it as one experiment? The answer is simple—blocking.

[Dotplot of Carry distance for each golfer]

We designed our data collection to block on golfer to ensure that each golfer would be testing equivalent combinations of the 4 factors. This is called balance. Based on this design, Minitab will also know to include golfer in the analysis.

There are four key benefits to blocking in your experiment design:  

  1. The effect of the blocking variable at each level (in this case, our four golfers) will be estimated by the average response at that level, and then all the data will be standardized according to the golfer. Standardization essentially removes the effect of the blocking variable so that all the data can be analyzed as one experiment. This is similar to analysis of covariance, the major difference being that the data collection can be designed to be balanced with respect to the blocking variable, whereas this isn’t always possible with covariate(s).
  2. Blocking allows you to complete the data collection more efficiently by allowing the use of all resources (3 machines, 4 batches, 2 gages, etc.) without concern over adding variability to your results. The variability will be accounted for by attributing it to the blocking variable, thus preventing it from negatively affecting your results.
  3. Your results will be applicable over a wider range of conditions. Our results will apply to the range of golfers’ styles and abilities. We will also learn the relative importance of the blocking variable compared to the research variables. I saw this benefit of blocking when I was working with a manufacturer of ultrasonic transducers who was having difficulty meeting a flatness specification, a key quality characteristic. We decided to run a 3-factor full factorial on each of the three production machines instead of just one. In the end, we found the three process variables had no effect on flatness: only one of the machines was causing the poor flatness results because of a fixture issue. Good thing we decided to block!
  4. Blocking accounts for the variability caused by the blocking variable, therefore reducing the error term used in all hypothesis tests. This increases the power of the test to detect a significant effect. Consider the ANOVA table below comparing the cure time for three polymer formulations tested over several batches of raw material:

[One-way ANOVA table: formulations only, batch variation left in the error term]

The p-value for the F-test comparing the three formulations is 0.164, which means this test failed to detect that one of the formulations is different. This is based on an average error estimate that includes the batch-to-batch variation. If the batch-to-batch variation is separated from the error estimate by including the blocking variable (Batch) in the analysis, the ANOVA table is:

[ANOVA table with Batch included as a blocking variable]

The p-value for the F-test comparing the three formulations is now 0.031, so we reject the null hypothesis that all three formulations have the same mean cure time and conclude that at least one formulation is different. We also learn that the batch-to-batch variation has roughly the same impact on cure time as the formulations we were studying.
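The mechanism behind that improvement is easy to sketch in Python. The cure times below are invented (the post's actual data isn't shown); the point is how moving the batch sum of squares out of the error term inflates the F statistic for the formulations.

```python
import numpy as np

# Hedged sketch of why blocking sharpened the F-test above. The cure
# times are invented (rows = 4 raw-material batches, columns = 3
# formulations); only the mechanism matches the post.
data = np.array([
    [10.2, 10.9, 11.4],
    [ 8.1,  9.0,  9.2],
    [11.5, 12.4, 12.6],
    [ 9.4,  9.8, 10.7],
])
b, t = data.shape
grand = data.mean()

ss_treat = b * ((data.mean(axis=0) - grand) ** 2).sum()
ss_block = t * ((data.mean(axis=1) - grand) ** 2).sum()
ss_total = ((data - grand) ** 2).sum()

# Without blocking, batch-to-batch variation is lumped into the error term
ms_err_noblock = (ss_total - ss_treat) / (b * t - t)
f_noblock = (ss_treat / (t - 1)) / ms_err_noblock

# With blocking, that variation is pulled out of the error term
ms_err_block = (ss_total - ss_treat - ss_block) / ((b - 1) * (t - 1))
f_block = (ss_treat / (t - 1)) / ms_err_block

# The blocked F statistic is far larger, which is what drives the
# p-value down, as in the second ANOVA table
print(f"F without block: {f_noblock:.2f}, F with block: {f_block:.2f}")
```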

I hope I've made a strong case for the importance of including a blocking variable in the design and analysis of your DOE. Many a designed experiment has been made or broken based on the researcher’s planning for categorical noise variable(s) that would impact their response.

Summary and Forecast

The two covariates, club speed and club / ball contact efficiency, as well as the blocking variable (Golfer), will be included in our final analysis, giving us a clearer assessment of the effects of our four experimental factors on our response, Carry distance. Next week we will review the results!

I am forecasting that our results will be full of unexpected interactions. How do I know this? If the final answer was just main effects, somebody would have already solved this problem. Main effects are easy. The really tough problems almost always involve interactions. That’s what makes them so tough!

Previous Golf DOE Posts

The Longest Drive: Golf and Design of Experiments, Part 1

The Longest Drive: Golf and Design of Experiments, Part 2

The Longest Drive: Golf and Design of Experiments, Part 3

The Longest Drive: Golf and Design of Experiments, Part 4

Many thanks to Toftrees Golf Resort and Tussey Mountain for use of their facilities to conduct our golf experiment. 


Resources to Celebrate Healthcare Quality Week

This week is National Healthcare Quality Week, started by the National Association for Healthcare Quality to increase awareness of healthcare quality programs and to highlight the work of healthcare quality professionals and their influence on improved patient care outcomes.

In honor of the celebration, I wanted to point you to a few case studies featuring Minitab customers in the healthcare field who have been involved in quality improvement projects that have achieved great results. Kudos to not only those below, but all who work in healthcare to improve the experience of patients everywhere (thank you!):

Akron Children’s Hospital

The largest pediatric provider in northeast Ohio used Minitab to analyze their Lean Six Sigma project data, which helped them to decrease their rate of unplanned breathing tube removals for infants in the neonatal intensive care unit—preventing patient setbacks and reducing length of stay.

Cathay General Hospital

During an assessment of its angioplasty process for patients suffering from heart attacks, Cathay General Hospital in Taipei, Taiwan used Minitab to analyze data to help them introduce new treatment options that led to a decrease in the patients’ hospital stay and an increased savings in medical resources.

Riverview Hospital Association

With Minitab, the Riverview Hospital Association Lean Six Sigma team was able to perform data analysis to identify patient groups who were scoring lower on patient satisfaction survey questions. This allowed the team to target process improvement efforts to specific patient populations.

Franciscan Children’s Hospital

With the help of Lean Six Sigma and Minitab software, Franciscan Hospital for Children was able to analyze information about its processes and make data-driven decisions that increased dental operating room efficiency and enabled doctors to see more kids.

Monitoring Rare Events with G and T Charts

And don’t forget about the quality tools in Minitab Statistical Software for use in the healthcare industry—specifically the G and T Charts. These charts make it easy to assess the stability of processes that involve rare events and have low defect rates. You can read more about them in this post: http://blog.minitab.com/blog/michelle-paret/monitoring-rare-events-with-g-charts

Can Data Analysis and Statistics Help Reduce Medication Errors?

People who are ill frequently need medication. But if they miss a dose, or receive the wrong medication—or even get the wrong dose of the right medication—the results can be disastrous. 


So medical professionals have a lot of stake in making sure patients get the right medicine, in the right amount, at the right time. But hospitals and other medical facilities are complex systems, and mistakes do occur. 

The application of data analysis and statistics can help detect where errors are occurring so that effective improvements can be made, whether it's on the factory floor or the ICU. 

Is Every Patient Getting the Right Medication?

Suppose you work for a small hospital, whose staff administers medications to hundreds of patients every week. You want to make sure every patient gets the right amount of the right medication at the right time, but over the past 32 weeks, your hospital has seen 156 medication errors—obviously far too many. 

To make sense of your data, you turn to the Assistant in Minitab Statistical Software. (If you're not already using our software and you want to play along, please download the free 30-day trial.) 

Your data includes counts of the number of patients treated each week, and the number of medication errors that occur.  

Identify the Important Factors

To get better insight into the situation, you gather more comprehensive data on a random sample of 100 medication errors, including the type of error and the time it occurred.

Since it’s always a good idea to visualize your data, you select Assistant > Graphical Analysis.


You want to know what types of medication errors are the most frequent. The Assistant's decision tree will guide you to the right option. Since you want to see the defect types for the count data you’ve collected, the Assistant directs you to the Pareto Chart. 


Click the button and fill out the dialog box as follows:

[Pareto Chart dialog box]

The resulting chart shows that in nearly 75 percent of the incidents, patients either received too little medication or got their medication at the wrong time. 

[Pareto chart of medication error types]
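The tallying behind a Pareto chart is simple enough to sketch in Python. These error-type counts are invented, chosen only so the top two categories sum to roughly 75 percent, as in the chart.

```python
# Hedged sketch of Pareto chart arithmetic: sort categories by count,
# then accumulate percentages. The counts are invented for illustration.
counts = {
    "Dose too low": 41,
    "Wrong time": 33,
    "Wrong medication": 12,
    "Dose too high": 9,
    "Missed dose": 5,
}
total = sum(counts.values())
cumulative = 0
for error_type, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    cumulative += n
    print(f"{error_type:18s} {n:3d}  cumulative {100 * cumulative / total:5.1f}%")
```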

Based on this knowledge, you and your team devise and implement process changes designed to help hospital staff give patients the proper dosage of medications and adhere strictly to the treatment times specified by their physicians.

Create a Before / After Control Chart

After the changes have been implemented, you gather additional data over several weeks to see whether errors have been reduced. Your data set includes counts of the number of patients treated each week, and the number of medication errors that occurred. Now your administrator wants to know what effect the changes had. 

You select Assistant > Before/After Control Charts… to create a chart to easily see if the changes made the desired impact.


Because you have attribute data, and since each patient could be associated with more than one medication error, the Assistant's decision tree guides you to the U Chart. Complete the dialog box as shown:

[U Chart dialog box]

This yields the following chart:

[Before/after U chart of medication errors]

Your before/after control chart shows that the changes have had a significant impact on the number of medication errors. The chart also shows that the new process is stable and in statistical control.
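For readers curious about what a U chart computes behind the scenes, here is a sketch of the standard limits in Python, using invented weekly counts. Minitab's Assistant layers additional tests and the before/after stage comparison on top of this basic calculation.

```python
import math

# Hedged sketch of standard U chart limits, with invented weekly counts
# of patients treated (subgroup sizes) and medication errors.
patients = [120, 135, 110, 142, 128, 131]
errors = [9, 11, 6, 12, 8, 7]

u_bar = sum(errors) / sum(patients)  # center line: errors per patient

for n, c in zip(patients, errors):
    sigma = math.sqrt(u_bar / n)       # limits vary with subgroup size
    ucl = u_bar + 3 * sigma
    lcl = max(0.0, u_bar - 3 * sigma)  # a rate can't fall below zero
    u = c / n
    status = "out of control" if not (lcl <= u <= ucl) else "in control"
    print(f"n={n:3d}  u={u:.4f}  LCL={lcl:.4f}  UCL={ucl:.4f}  {status}")
```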

Just What the Doctor Ordered

Even if you’re not a statistician, you can benefit from using statistical tools to look at your data.

With the Assistant, it's very easy to create a Pareto Chart to identify and focus your efforts on the most frequent medication errors. After you've implemented changes, the Assistant’s Before/After Control Charts make it easy to demonstrate that your improvements have significantly reduced the number of medication errors.

What improvements can you make next?

 

Using a Catapult as a Minitab Capability Sixpack Training Aid

By Matthew Barsalou, guest blogger

Teaching process performance and capability studies is easier when actual process data is available for the student or trainee to practice with. As I have previously discussed at the Minitab Blog, a catapult can be used to generate data for a capability study. My last blog on using a catapult for this purpose was several years ago, so I would like to revisit this topic with an emphasis on interpreting the catapult study results in Minitab Statistical Software's Capability Sixpack™. The catapult can be used in various configurations, but here the settings stay constant to simulate a manufacturing process. The plans and assembly instructions are available here.

The Catapult Study

The catapult used a 120 mm diameter heavy-duty rubber band originally intended for use in model airplanes. The rubber band guide was set at 4 cm and the arm stopper was set at 1 cm. The starting point was set at 8 cm, and these settings were held constant for the duration of the study. Three operators each performed 2 runs of 20 shots each to simulate two days of production with three shifts per day. Each run was used as a separate subgroup in the capability and performance study.

The capability indices Cp and Cpk use short-term data to tell us what the process is capable of doing, and the performance indices Pp and Ppk use long-term data to tell us what the process is actually doing. The capability indices use "within" variation in the formula, and the performance indices use "overall" variation; within variation is based on the pooled standard deviation of the subgroups, and overall variation is based on the standard deviation of the entire data set.
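Those indices are straightforward to compute by hand. Here is a sketch in Python with invented catapult distances and invented specification limits; note that Minitab's default within-subgroup estimate also applies an unbiasing constant that this simple pooled estimate omits.

```python
import math

# Hedged sketch of Cp/Cpk (within variation) vs Pp/Ppk (overall
# variation), plus Cpm. The distances (cm) and spec limits are invented.
subgroups = [
    [99.1, 101.2, 100.4, 98.7, 100.9],
    [100.3, 99.8, 101.5, 100.1, 99.4],
    [98.9, 100.6, 99.7, 101.0, 100.2],
]
lsl, usl, target = 94.0, 106.0, 100.0

values = [x for sg in subgroups for x in sg]
mean = sum(values) / len(values)

# Overall (long-term): ordinary sample standard deviation of all values
s_overall = math.sqrt(sum((x - mean) ** 2 for x in values) / (len(values) - 1))

# Within (short-term): pooled standard deviation of the subgroups
ss, df = 0.0, 0
for sg in subgroups:
    m = sum(sg) / len(sg)
    ss += sum((x - m) ** 2 for x in sg)
    df += len(sg) - 1
s_within = math.sqrt(ss / df)

cp = (usl - lsl) / (6 * s_within)
cpk = min(usl - mean, mean - lsl) / (3 * s_within)
pp = (usl - lsl) / (6 * s_overall)
ppk = min(usl - mean, mean - lsl) / (3 * s_overall)
cpm = (usl - lsl) / (6 * math.sqrt(s_overall ** 2 + (mean - target) ** 2))

print(f"Cp={cp:.2f} Cpk={cpk:.2f} Pp={pp:.2f} Ppk={ppk:.2f} Cpm={cpm:.2f}")
```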

There are requirements that must be met to perform a capability or performance study. The data must be normally distributed and the process needs to be in a state of statistical control. The data must also be randomly selected, and it needs to represent the population. There should be at least 100 values in the data set; otherwise, there will be a very wide confidence interval for the resulting capability and performance values. The person planning the study must ensure there is sufficient data and the data represents the values in the population; however, the Capability Sixpack in Minitab Statistical Software can be used to ensure the other requirements are fulfilled.

The figure below shows a Capability Sixpack for the catapult study.

[Capability Sixpack output for the catapult study]

The Capability Sixpack

The Capability Sixpack provides an I chart when the data consists of individual values; i.e. the subgroup size is 1. An Xbar chart is provided when the data is entered as subgroups. Either control chart can be used to assess the stability of the process. The process will need improvement to achieve stability if out of control values are seen in a control chart. The source of the variability in the process should be sought out and removed and then the study should be repeated.

A moving range chart is given when the subgroup size is 1, and an S chart is given when the subgroup size is greater than 1. The values in the moving range chart should be compared to the values in the I chart to ensure no patterns are present. The same should be done for the Xbar and S charts if they are used. This is done to help ensure the data are truly random. Either the last 25 observations or the last 5 subgroups will be shown: the last 25 observations if the data is entered as 1 subgroup, and the last 5 subgroups if the data are entered as subgroups. The values should appear random, without trends or shifts, if the process is stable.
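The limits on the I and moving range charts follow the standard I-MR formulas. Here is a sketch in Python with invented individual values; the constants d2 = 1.128 and D4 = 3.267 are the standard ones for a moving range of span 2.

```python
# Hedged sketch of I chart and moving range chart limits, as shown in
# the Capability Sixpack for individual values; the data are invented.
values = [100.2, 99.5, 101.1, 100.7, 98.9, 100.4, 99.8, 101.3, 100.0, 99.6]

moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
mr_bar = sum(moving_ranges) / len(moving_ranges)
center = sum(values) / len(values)

# Standard control chart constants for a moving range of span 2:
# d2 = 1.128 (estimates sigma from MR-bar), D4 = 3.267 (MR upper limit)
sigma_hat = mr_bar / 1.128
i_ucl = center + 3 * sigma_hat
i_lcl = center - 3 * sigma_hat
mr_ucl = 3.267 * mr_bar

print(f"I chart:  CL={center:.2f}  LCL={i_lcl:.2f}  UCL={i_ucl:.2f}")
print(f"MR chart: CL={mr_bar:.2f}  UCL={mr_ucl:.2f}")
```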

A capability histogram is shown to compare the histogram of the data to the specification limits. The data should approximate the standard normal distribution. The line for overall shows the shape of a histogram using the overall standard deviation. The within line shows the shape of the histogram using the pooled standard deviation of the subgroups.

A normal probability plot is provided to assess the normality of the data. A p-value of less than 0.05 indicates the data is not normally distributed. Data that is not normally distributed can't be used in a capability study. You may transform non-normal data, or identify and remove the cause of the lack of normality; the better option is to improve the process so that the data is normally distributed. The Capability Sixpack can’t be used if the data hits a boundary such as 0 or an upper or lower limit; however, the regular capability study option can still be used if a checkmark is placed next to the boundary indicator beside the specification limit.

The capability plot displays the capability and performance of the process. The capability of a process is measured using Cp and Cpk, which are intended for use with short-term data and use the pooled standard deviation of rational subgroups. Rational subgroups use homogenous data so that only common cause variation is present. For example, parts may have all been produced on the same machine, using the same batch of raw material, by the same operator. The Cp compares the spread of the process to the specification limits; a process with a high Cp value may still produce parts out of specification if the process is off-center. The Cpk considers the position of the process mean relative to the specification limits, and there are actually two values for Cpk: the Cpk of the upper specification limit and the Cpk of the lower specification limit. The Capability Sixpack lists the worse performing of the two Cpk values.

The performance of a process is measured using Pp and Ppk with long-term data. Generally, more than 30 days' worth of production data should be used for Pp and Ppk. Unlike the capability indices Cp and Cpk, the Pp and Ppk calculations use the total standard deviation, which is calculated with the same formula as a sample standard deviation. The Pp compares the spread of the process to the upper and lower specification limits. The Ppk considers the position of the process mean relative to the specification limits; as with Cpk, there are two values and only the worse performing one is reported.

The Cpm is the process capability index of the mean; it uses a target value to account for the position of the process mean relative to the target. However, it is only reported if a target value is entered in Minitab.
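For readers who want to see the mechanics behind these indices, here is a minimal sketch in Python. It assumes equal-size rational subgroups and uses the textbook formulas; it is an illustration of the arithmetic only, not a reproduction of Minitab's implementation (Minitab offers several estimation methods for the within-subgroup standard deviation), and the example data is made up.

```python
import statistics

def capability_indices(subgroups, lsl, usl, target=None):
    """Compute Cp/Cpk (within, short-term) and Pp/Ppk (overall, long-term).
    subgroups: list of equal-size rational subgroups (lists of floats)."""
    data = [x for sg in subgroups for x in sg]
    mean = statistics.fmean(data)
    # Overall (long-term) standard deviation: ordinary sample std dev.
    s_overall = statistics.stdev(data)
    # Within (short-term) std dev: pooled across equal-size subgroups.
    s_within = (sum(statistics.variance(sg) for sg in subgroups)
                / len(subgroups)) ** 0.5
    cp  = (usl - lsl) / (6 * s_within)
    cpk = min(usl - mean, mean - lsl) / (3 * s_within)   # worse of the two
    pp  = (usl - lsl) / (6 * s_overall)
    ppk = min(usl - mean, mean - lsl) / (3 * s_overall)  # worse of the two
    out = {"Cp": cp, "Cpk": cpk, "Pp": pp, "Ppk": ppk}
    if target is not None:  # Cpm is only computed when a target is supplied
        tau = (s_overall**2 + (mean - target)**2) ** 0.5
        out["Cpm"] = (usl - lsl) / (6 * tau)
    return out

groups = [[9.9, 10.1, 10.0], [10.2, 10.0, 10.1], [9.8, 10.0, 9.9]]
print(capability_indices(groups, lsl=9.0, usl=11.0, target=10.0))
```

In the example the subgroup means drift, so the overall standard deviation exceeds the pooled within-subgroup estimate and Pp falls below Cp, which is exactly the capability-versus-performance distinction described above.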

Conclusion

The Minitab Capability Sixpack will quickly and easily provide a capability study; however, it will not directly tell you whether the data is unsuitable for a capability study. It does provide methods for assessing the suitability of the data, and these should be used every time a capability study is performed.

 

About the Guest Blogger

Matthew Barsalou is a statistical problem resolution Master Black Belt at BorgWarner Turbo Systems Engineering GmbH. He is a Smarter Solutions certified Lean Six Sigma Master Black Belt, ASQ-certified Six Sigma Black Belt, quality engineer, and quality technician, and a TÜV-certified quality manager, quality management representative, and auditor. He has a bachelor of science in industrial sciences, a master of liberal studies with emphasis in international business, and a master of science in business administration and engineering from the Wilhelm Büchner Hochschule in Darmstadt, Germany. He is the author of the books Root Cause Analysis: A Step-By-Step Guide to Using the Right Tool at the Right Time, Statistics for Six Sigma Black Belts, and The ASQ Pocket Guide to Statistics for Six Sigma Black Belts.

The Longest Drive: Golf and Design of Experiments, Part 6

In Part 5 of our series, we began the analysis of the experiment data by reviewing analysis of covariance and blocking variables, two key concepts in the design and interpretation of your results.

The 250-yard marker at the Tussey Mountain Driving Range, one of the locations where we conducted our golf experiment. Some of the golfers drove their balls well beyond this 250-yard marker during a few of the experiment runs.

I realize that many of you are probably thinking, “Five blog posts and he hasn’t even told us what happened?” When you are solving a manufacturing or development problem you might hear the same thing from your leadership. "When will we get the results?" Never fear, we are right on schedule. According to Doug Montgomery, who happens to be a pretty experienced experimenter, “If you have 10 weeks, 8 of them should be spent planning and designing the data collection, 1 week to execute the experiment, and 1 week to analyze the results.” That’s the schedule our experiment followed, only it’s not going to take us a week to analyze. With Minitab, I rarely spend more than a day!

So it is time to review and present our results. In my experience, the researcher has three typical goals—quantify, understand, and optimize. Different projects will prioritize these three goals differently, emphasizing one or another, but we are typically interested in all three. Today’s blog will look at the analysis and interpretation of our golf results with respect to all three.

Quantify

Analysis of variance was used to develop a model for Carry distance as function of our 4 experimental variables and their 6 potential two-way interactions. Let’s focus first on our two fastest-swinging golfers because their swing characteristics were very similar. After removing insignificant terms, the ANOVA table and resulting equation can be seen below for the reduced model:  

[Image: ANOVA table and regression equation for the reduced model]

The ANOVA table above highlights a few concepts from our last blog. We see that the two covariates are statistically significant, and to account for this, each drive has been adjusted for club speed and club/ball contact efficiency. On the other hand, between our two highest swing-speed golfers, Sean and Andy, there is no real block effect, as indicated by the high p-value (0.478). We also see that by including each of those variables in the analysis, the sum of squares variation associated with those three noise variables is accounted for in the ANOVA table, so that variability does not end up in the unexplained variation at the bottom of the table called Error. This reduces the error term in our F-tests for the significance of each effect, which in turn increases the power of the tests.

The ANOVA table also points to some of the stronger effects within the list of statistically significant effects. This is indicated by the size of the F statistic, which is the ratio of the effect size to the error variation. The larger F statistics for the main effect of Ball (18.01) and the Tilt by Shaft interaction (21.94) indicate that these are two of the dominant effects on Carry distance.
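To see why pulling noise variation out of the error row strengthens the F-tests, consider this small numerical sketch. All sums of squares and degrees of freedom below are invented for illustration; they are not values from our golf ANOVA table.

```python
def f_statistic(ss_effect, df_effect, ss_error, df_error):
    """F = mean square of the effect divided by the mean square error."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error

# Same effect, but with noise-variable variation left in the error row...
f_noisy = f_statistic(ss_effect=180.0, df_effect=1,
                      ss_error=820.0, df_error=41)
# ...versus the same effect after covariates and blocks absorb part of
# the error sum of squares (at the cost of a few degrees of freedom).
f_adjusted = f_statistic(ss_effect=180.0, df_effect=1,
                         ss_error=410.0, df_error=38)
print(f_noisy, f_adjusted)
```

Halving the error sum of squares nearly doubles the F statistic for the same effect, which is why accounting for covariates and blocks increases the power of each test.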

The regression equation quantifies each effect so that the golfer can make a good decision on which drive settings to incorporate into their game. For example, the coefficient for Ball Characteristics is 3.75, which means that the difference in yardage between the economy ($20 / dozen) and expensive ($50 / dozen) levels of Ball is 3.75 x 2 = 7.5 yards. In short, our golfers have a business decision to make: whether to spend $30 more per dozen for an average improvement of 7.5 yards in drive distance. This will make good sense for some golfers and won't for others, but once you quantify the effects, you have the information you need to make the right business decision for your situation.
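Because the factors are analyzed in coded units (-1 for the low level, +1 for the high level), a coefficient represents half the difference between the two settings, which is where the "x 2" comes from:

```python
coef_ball = 3.75        # regression coefficient for Ball, in coded (-1/+1) units
effect = coef_ball * 2  # yardage difference between economy and expensive balls
extra_cost = 50 - 20    # extra dollars per dozen for the expensive ball
print(f"Expected gain: {effect} yards for ${extra_cost} more per dozen")
```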

In an industrial setting, you might find that a more expensive welding rod and slower weld travel produces a stronger metal bond. Based on your regression equation, you can make a good business decision about the cost for the welding rods and how many fewer parts per hour you will produce in exchange for the improvement in weld strength needed to meet your customer’s specifications. Everyone’s process is different, but quantifying the effects of process variables with your regression equation(s) allows you to determine the settings needed to improve your process in order to meet your customer’s specifications.

One of the experiment’s golfers, Ben Farrell, at the tee driving his ball as part of an experimental run. Our golf experiment used the Ernest Sports launch monitor to see club speed and club/ball contact efficiency for each of the golfers’ drives.
Understand

From a first principles perspective, it is important to understand why the regression coefficients and the effects turned out the way they did. Our expanded understanding of the process allows us to move to new variable settings, which improve process results. It also informs us about other variables we need to control to consistently maintain that improved performance. Consider the Club Face Tilt * Shaft Stiffness interaction, a key contributor to Carry distance, shown in the interaction plot below:

[Image: Club Face Tilt * Shaft Stiffness interaction plot]

The plot shows that at a high club face tilt (10.5 degrees), Carry improves if the club shaft is stiff (306 cycles per minute), but the higher tilt causes a decrease in Carry if the less stiff shaft is used. This makes sense when we remember that Andy and Sean were our two highest club-speed golfers. As club velocity increases, the shaft bends away from the ball, and this changes the angle the club face presents to the ball at the bottom of the swing, independent of the tilt built into the club face. So we should expect these two factors to interact.

Our experiment demonstrates the impact of this phenomenon on Carry distance and helps us better understand the first-principles science behind getting the ball to travel farther. Based on this understanding, perhaps we would benefit from an even stiffer shaft and higher club face tilt, or an even less stiff shaft and lower club face tilt.

Andy Fouse, another golfer who took part in our experiment, driving his ball as part of an experimental run. Andy’s club speed was among the highest of the four golfers in our experiment.

Understanding your process is key. I recall experimenting with a glass grinding operation in an effort to increase the glass removal rate while maintaining a smooth glass finish. We studied grit size, part rpm, grinding wheel velocity, grinding pressure, and grinding media density. Everyone was surprised to learn that the surface finish quality was only a function of grit size, which left us with complete freedom to maximize glass removal by focusing on the other four process parameters. Our improved understanding of the process led the way to us meeting our goals.

Optimization

The response optimizer and contour plot are two of my favorite tools for finding the optimal process settings. However, we only have one response and four important factors to consider, so in this case I think the cube plot fits the job. A cube plot of the average Carry at all 16 combinations of our 4 experimental factors is shown below.

[Image: Cube plot of average Carry distance at all 16 factor combinations]

Based on this straightforward representation of the data, we can determine the conditions that produced the lowest and the highest response:

                Ball = Economy, Tee Height = 1, Tilt = 10.5, Shaft = 291 ……………. Carry = 220 yards

                Ball = Expensive, Tee Height = 1.75, Tilt = 8.5, Shaft = 291 ………… Carry = 265 yards

                Ball = Expensive, Tee Height = 1.75, Tilt = 10.5, Shaft = 306 ……….. Carry = 262 yards

Based on these results, we conclude that we can improve our performance up to a maximum of 45 yards (20%) with an average improvement of 20 yards (8.2%). This is achieved by just using the process settings that have been optimized based on our experiment results, as opposed to the old way of driving off the tee.

Now I have a quick question for you: What would be the dollar value to your company if the process you are working on started to produce 8% more product each day, just by changing the process parameter settings? (If the number doesn’t have at least six figures, you just committed a Type III error—you have been working on the wrong process!)

Wrapping Up and Future Direction

In wrapping up, we need to remember the impact of noise variables in our experiment. The scatterplot of Carry distance vs. Club Speed, Club/Ball Contact Efficiency, and Golfer shown below reminds us of what we already knew. The noise variables dominate the response.

The data is so strongly influenced by the noise variables that we might decide we just need to learn to swing harder. That is true, but in real life it is not always possible. After all, they are noise variables!

[Image: Scatterplot of Carry distance vs. Club Speed, Club/Ball Contact Efficiency, and Golfer]

Luckily, the results of our experiment are independent of the noise variables. The results are the conditions that maximize drive distance given that club speed may be limited to an average of about 90 mph. Our results often don't guarantee a certain response level, but they allow us to do the best we can with a poor raw material supply, a humid day, a dull cutting tool, etc., in addition to running optimally when the noise variables are more favorable.

So what’s next? Even though we have reached a very positive endpoint, there are still some unanswered questions. What is the effect of ball spin? Can we move to even better factor settings? What about club weight? Most good experiments lead to another experiment. I hope to be back on the blog again soon to answer these questions!

Many thanks to Toftrees Golf Resort and Tussey Mountain for use of their facilities to conduct our golf experiment. 

Previous Golf DOE Posts

The Longest Drive: Golf and Design of Experiments, Part 1

The Longest Drive: Golf and Design of Experiments, Part 2

The Longest Drive: Golf and Design of Experiments, Part 3

The Longest Drive: Golf and Design of Experiments, Part 4

The Longest Drive: Golf and Design of Experiments, Part 5

Big Ten 4th Down Calculator: Week 5

Nebraska lost another close game, teams continue to incorrectly punt to Ohio State on 4th and short, and Northwestern keeps making terrible 4th down decisions. Another regular week for the Big Ten 4th down calculator.

In case you haven't read the earlier entries in this series, I've used Minitab Statistical Software to create a model to determine the correct 4th down decision in Big Ten Conference games. And for the rest of the college football season, I'll use that model to track every decision. However, the calculator isn’t meant to provide decisions written in stone. In hypothesis testing, it’s important to understand the difference between statistical and practical significance. A test that finds a statistically significant result doesn’t imply that your result has practical consequences. You should use your specialized knowledge to determine whether the difference is practically significant.

Apply the same line of thought to the 4th down calculator. If you're the coach, you should also consider other factors. But the 4th down calculator still provides a starting point for making the decision. 

I break the analysis for each game into two sections: 4th down decisions in the first 3 quarters, and decisions in the 4th quarter. In the first 3 quarters, coaches should try to maximize the points they score. But in the 4th quarter, they should maximize their win probability. To calculate win probability, I’m using this formula from Pro Football Reference.
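The first-3-quarters logic boils down to a straightforward expected-value comparison. Here is a minimal sketch of that comparison in Python; every number in the example (the conversion rate and the expected-point values) is an assumed placeholder for illustration, not a value from my model:

```python
def fourth_down_ev(p_convert, ep_success, ep_fail, ep_alternative):
    """Compare expected points of going for it vs the kicking alternative.
    p_convert: chance of converting the 4th down.
    ep_success/ep_fail: expected points after converting / turning it over.
    ep_alternative: expected points after the punt or field goal try."""
    ev_go = p_convert * ep_success + (1 - p_convert) * ep_fail
    decision = "Go for it" if ev_go > ep_alternative else "Kick"
    return ev_go, ep_alternative, decision

# 4th and 1 near midfield, with assumed numbers for illustration only.
ev_go, ev_kick, call = fourth_down_ev(
    p_convert=0.68,      # assumed 4th-and-1 conversion rate
    ep_success=2.2,      # assumed expected points with a fresh set of downs
    ep_fail=-1.5,        # assumed expected points after a turnover on downs
    ep_alternative=0.3)  # assumed expected points after punting
print(call, round(ev_go - ev_kick, 3))
```

The "Expected Points Lost" column in the tables below is exactly this kind of gap: the difference between the expected points of the calculator's choice and the coach's choice.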

Now on to the games!

Ohio State 49 - Rutgers 7

Ohio State had 6 consecutive drives that ended in a touchdown. The Buckeyes appear to have hit their stride with J.T. Barrett as quarterback.

4th Down Decisions in the First 3 Quarters

Team         4th Downs   Disagreements with Calculator   Punts   Field Goals   Conversion Attempts   Expected Points Lost
Ohio State   2           0                               1       0             1                     0
Rutgers      8           3                               6       1             1                     1.6

Ohio State's offense was so good that they only had two 4th downs. And yet again, they correctly went for it on 4th and 1, converted, and scored a touchdown on the same drive. Ohio State is tough enough as it is, but with Urban Meyer making optimal 4th down decisions, they're going to be next to impossible to beat. Especially if their opponents keep making decisions that are...how should I put this nicely, Rutgers...not optimal.

To their credit, Rutgers started their 4th down decision making very well. On their opening drive, they had a 4th and 1 on the Ohio State 18 yard line. They correctly went for it and picked up the first down. But 3 plays later, they had a 4th and 6 on the Ohio State 11 yard line. The model says to go for it, but it's very close, as the difference is less than a tenth of a point. So Rutgers' decision to kick a field goal wasn't that bad. The outcome was, though: their kicker missed a 29-yard field goal, a kick Big Ten kickers make about 86% of the time.

Things went downhill for Rutgers midway through the 2nd quarter. Trailing only 7-0, they had a 4th and 2 at the Ohio State 40 yard line and punted. This is a terrible decision even against an average Big Ten team, and we know Ohio State isn't your average Big Ten team. Rutgers was a 21-point underdog and Ohio State has one of the best offenses in the Big Ten. The statistics say Rutgers lost 0.94 points by punting, but it was even worse when you consider the other factors that the calculator doesn't know. And sure enough, it took Ohio State all of 4 plays to drive 80 yards and score a touchdown.

Then, on the very next possession, the Scarlet Knights punted on 4th and 1 at their own 30. Hey Rutgers, did you not just see what punting on 4th and short did for you the previous possession? Sure, if you don't get the first down Ohio State has great field position. But their offense is so good, it doesn't matter where they start their drive. If you're going to beat this team, you need to score points. And just like before, after the punt it took Ohio State all of 4 plays to drive 65 yards and score a touchdown.

If any big underdog is going to upset Ohio State, they're going to have to do it by being aggressive on 4th down. Sure, if you fail on most of them you're going to get blown out. But look at the final score of this game: Rutgers "played it safe" and got blown out anyway. In fact, their two bad 4th down decisions directly led to Ohio State touchdowns that broke the game open. Please, Big Ten teams—stop willingly giving Ohio State possession of the football on 4th and short. If you're going to pull the upset, you're going to have to do so by scoring points, not punting the ball.

Wisconsin 24 - Illinois 13

Illinois lost a relatively close game. I hope they didn't leave any points on the field with poor 4th down decision making.

4th Down Decisions in the First 3 Quarters

Team        4th Downs   Disagreements with Calculator   Punts   Field Goals   Conversion Attempts   Expected Points Lost
Wisconsin   5           0                               4       1             0                     0
Illinois    7           4                               5       2             0                     2.03

Wow, through the first 3 quarters Illinois had seven 4th downs, and made the incorrect decision on 4 of them! Are we sure Tim Beckman was fired?

The first decision we can let slide. On a 4th and 2 from their own 25 yard line they punted. The calculator will always say to go for it on 4th and 2 in your own territory, but the difference in expected points is only 0.19, so punting isn't a terrible decision. But then things got bad for the Illini. On a 4th and 4 from the Wisconsin 7, they kicked a field goal instead of going for it. Because it was 4th and 4, this might not seem like a terrible decision, but it actually is. They lost 0.71 points by kicking the field goal. The reason going for it is such a strong decision is because even if you fail, the other team starts their drive inside their own 10 yard line. So the defense is actually more likely to be the next team to score. Inside the 10 yard line, being aggressive is really a win/win.

In the 3rd quarter Illinois punted on 4th and 1 two different times. They were in their own territory both times, but teams convert on 4th and 1 so often that the benefits outweigh the risks. And honestly, punting the ball to the other team is a risk too! Illinois's second 4th and 1 punt happened with 2 minutes left in the 3rd quarter. At the time, they were trailing by 4 points. But after they gave Wisconsin the ball, the Badgers went on a soul-crushing 7-minute, 39-second touchdown drive, making it a two-possession game. The next time Illinois had a 4th down, it was 4th and 12 with 3:28 left in the game and they were trailing by 11 points.

Wonder if they wish they could change their decision on that previous 4th and 1?

Michigan State 52 - Indiana 26

All season I've been saying Indiana should never punt. It's a little tongue in cheek, as obviously there are some times they should punt. But the combination of a great offense and a horrible defense means Indiana should by far be the most aggressive team on 4th down in the Big Ten. So how did they do this week?

4th Down Decisions in the First 3 Quarters

Team          4th Downs   Disagreements with Calculator   Punts   Field Goals   Conversion Attempts   Expected Points Lost
Indiana       4           1                               3       1             0                     0.56
Michigan St   6           1                               4       1             1                     0.14

Indiana only had four 4th downs in the first 3 quarters of the game. The calculator disagreed with only 1 decision, but when you consider that it's Indiana, I think that number becomes 3.

Indiana's first punt was on 4th and 10 at their own 25 yard line. No problems there. But their next drive, they punted on 4th and 5 from the Michigan State 44. The calculator says to punt, but the difference between punting and going for it is only 0.05 points. The calculator assumes you only gain 5 yards if you convert the 4th down. But in reality, you'll usually gain more yards, which strengthens the decision to go for it. And on top of that, remember that Indiana has a great offense and a terrible defense. They absolutely should have gone for it here.

The final score was lopsided, but this was actually a much closer game than the final score indicates. Midway through the 3rd quarter, Indiana found itself only trailing by 2. They had a 4th and 1 at their own 27, and they decided to punt. The calculator's feelings on punting on 4th and 1 have been well documented. But Indiana got the ball back still only down 2, and drove to the Michigan State 24 yard line, where they had a 4th and 4. The calculator says to kick the field goal, but the difference is only 0.15 points. And again, this is Indiana. Even if you make it, do you really think a 1 point lead is going to hold up with your defense? They needed to be aggressive and play for the touchdown. But instead they kicked, and ended up missing the field goal.

Before we get to the 4th quarter, I want to quickly mention the Michigan State 4th down decision that the calculator disagreed with. They went for it on 4th and 6 from the Indiana 27 when the calculator says to kick a field goal. But you'll see in the table above that the difference is only 0.14 points. And have I mentioned that the Indiana defense is bad? So there is really no issue with the decision here. Plus, Spartan coach Mark Dantonio has been consistently making this type of decision all season. This is the fourth time Michigan State has gone for it when the calculator suggests kicking a field goal. And it worked out well for him here, as they converted and scored a touchdown on the drive.

4th Down Decisions in the 4th Quarter

Team (Score)            Time Left   Distance   Yards to End Zone   Calculator   Coach       Win Prob. Go For It   Win Prob. Kick
Michigan St (Up by 2)   12:44       3          3                   Go for it    FG          88.4%                 86.6%
Indiana (Down by 5)     9:19        11         58                  Punt         Punt        8.8%                  10.2%
Indiana (Down by 12)    3:52        5          56                  Go for it    Go for it   0.1%                  0.04%

Between their opponents' 25 and 35 yard lines, Michigan State has been aggressive with their 4th down decisions all season. But that aggressiveness didn't translate to the goal line, as the Spartans kicked a field goal on their first possession of the 4th quarter instead of going for it. The field goal pushed their lead from 2 points to 5 points, so Indiana still had a chance to take the lead with a touchdown. And that's why Michigan State should have gone for it. With the Hoosier offense being so good, the Spartans really should have tried to make this a 2-possession game. And even if they failed, Indiana would have had to start their drive at their own 3 yard line. It's a win/win.

Luckily for Michigan State, the Hoosiers found themselves in a 4th and 11 on their next drive. The calculator agreed with the coach's decision to punt, but the statistics don't know how bad Indiana's defense is. It's hard to quantify how much that would affect the numbers, but it would definitely bring the win probabilities for kicking and going for it closer. Even so, it's hard to fault Indiana for punting on 4th and 11, as teams convert that distance only 27% of the time. So really, either decision would have been okay here.

Unfortunately for Indiana, the decision to punt backfired. Michigan State went on a 4-and-a-half-minute touchdown drive, extending their lead to 12 points. Indiana then correctly went for it on a 4th and 5, although at that point their chances of winning were almost 0. And when the 4th down pass fell incomplete, any chance of an Indiana upset vanished completely.

Penn State 31 - Maryland 30

This was a pretty crazy game. Early on, it looked like neither team would be able to score. Then we had 4 straight possessions that ended in touchdowns. Then the next 8 possessions after that resulted in a total of 3 points. Crazy.  

4th Down Decisions in the First 3 Quarters

Team         4th Downs   Disagreements with Calculator   Punts   Field Goals   Conversion Attempts   Expected Points Lost
Penn State   6           1                               5       1             0                     0.19
Maryland     5           1                               2       3             0                     0.04


From a 4th down decision making perspective, this game was actually pretty boring through 3 quarters. Most of the decisions were pretty cut and dry. And the two disagreements with the calculator weren't really bad decisions at all. Maryland kicked a field goal on 4th and 5 from the Penn State 26 when the calculator would have gone for it. But the difference in expected points is only 0.04, so really either decision was fine.

Then on their first possession of the 2nd half, Penn State punted on 4th and 2 from their own 35. The stats say to go for it, but again the difference between kicking and going for it is pretty small. So the decision to punt really isn't too bad. And this punt led to 4 straight possessions that ended in a touchdown, bringing us to a wild 4th quarter.

4th Down Decisions in the 4th Quarter

Team (Score)           Time Left   Distance   Yards to End Zone   Calculator   Coach       Win Prob. Go For It   Win Prob. Kick
Maryland (Down by 4)   11:49       6          18                  Go for it    FG          33.9%                 30.8% (FG)
Maryland (Down by 4)   10:15       11         11                  FG           FG          31.3%                 32.2% (FG)
Penn State (Up by 1)   9:42        10         28                  FG           FG          63.8%                 66.7% (FG)
Maryland (Down by 1)   7:51        2          64                  Go for it    Go for it   37.7%                 34.9% (Punt)
Maryland (Down by 1)   3:05        10         45                  Punt         Go for it   35%                   43.1% (Punt)
Penn State (Up by 1)   1:21        6          58                  Punt         Punt        47.5%                 60.7% (Punt)

This game featured the most 4th quarter 4th down decisions we've seen this year. It started with Maryland kicking a field goal down by 4 on a 4th and 6. Going by expected value, this is the correct decision. But going by win probability, the Terrapins should have gone for it. The reason, of course, is that they were down 4, so a field goal didn't change the fact that they were still losing. Luckily for Maryland, the Nittany Lions roughed the kicker, and they got a first down anyway. But they weren't able to do anything with it, as 3 plays later they had a 4th and goal from the 11 yard line. This time kicking was the correct decision, since teams convert on 4th and goal from the 11 only 20% of the time. The chance of a touchdown was so small that taking the points was correct, even though a field goal still left them trailing.

After the teams exchanged fumbles on back-to-back plays (I told you, this game was crazy), Penn State correctly attempted a 45-yard field goal on 4th and 10. But they missed it, leaving the door open for Maryland to win with a field goal of their own. On their next possession, Maryland faced a 4th and 2 from their own 36 yard line. I feel like most coaches would punt in this situation, but to Maryland's credit they correctly went for it. They didn't convert, but after another Penn State fumble, they got the ball back at midfield. Maryland quickly found itself in 4th and long with 3 minutes left. And it's here that the 4th down calculator really shocked me.

It suggested to punt.

Lots of things to consider here. First, teams only convert on 4th and 10 about 28.4% of the time, so a big part of this decision stems from the fact that Maryland wasn't likely to convert. Second, they were only down by 1 point. If they punted and stopped Penn State, they would likely get the ball back around midfield, and they wouldn't need many yards to get into field goal range. If they failed to convert and then stopped Penn State, they would likely get the ball back deep in their own territory, leaving many more yards to gain to get into field goal range.

So if I were Maryland's coach, I would have thought about it like this: I have a 28.4% chance of getting this first down. Do I think my chances of getting a 3 and out from Penn State are greater? Because one Penn State first down would end the game. So if I don't think I can stop the Penn State offense, there is no reason to punt since it ends the game. But if I like my chances of getting a stop, then I'd much rather have the ball close to midfield than deep in my own territory. Personally, without looking at the numbers my first instinct was that you have to go for it here. But the difference in win probability is pretty large. And considering teams usually get very conservative and just run the ball late in games with the lead, on second thought I would have taken my chances punting.

Maryland decided to go for it and they failed to convert. However, they did get the 3 and out from Penn State that they needed. Maryland had one final shot to win the game, starting at their own 25 yard line with 1:15 left in the game. But, their first play was an interception that sealed the game. Could things have worked out differently if that drive had started from midfield? Possibly. But of course, hindsight is 20/20.

Northwestern 30 - Nebraska 28

Stop me if you've heard this one before, but Nebraska lost a close football game on Saturday.

4th Down Decisions in the First 3 Quarters

Team           4th Downs   Disagreements with Calculator   Punts   Field Goals   Conversion Attempts   Expected Points Lost
Northwestern   5           1                               4       1             0                     1.55
Nebraska       6           1                               4       2             0                     0.19

Northwestern continues to lead the Big Ten in puzzling 4th down decision-making. On Saturday, they made the worst 4th down decision a coach can make (once you rule out things like going for it on 4th and 20 from your own 10). On 4th and goal from the Nebraska 1, they kicked a field goal. That single decision cost them over a point and a half. On average, teams convert on 4th and goal from the 1 about 59% of the time. Multiply that by 6.96 (to account for the 4% of times teams miss the extra point), and you get an expected value of 4.1 points. Last time I checked, that's greater than the 3 points you get for a field goal. But it gets even worse once you consider that if you fail, the other team starts with the ball on the 1 yard line. So even then, you're more likely to be the next team to score. Failing isn't really failing at all! It amazes me how Northwestern is the "smart" school, yet each week they continue to make terrible 4th down decisions.
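You can check that arithmetic directly. The 59% conversion rate and the 6.96-point touchdown value (7 points discounted for the 4% of missed extra points) are the figures quoted above:

```python
p_convert = 0.59   # 4th-and-goal-from-the-1 conversion rate (from the post)
td_value = 6.96    # touchdown value, discounted for ~4% missed extra points
fg_value = 3.0     # points from a made field goal

ev_go = p_convert * td_value  # expected points from going for it
print(round(ev_go, 1), ">", fg_value)
```

And this comparison doesn't even credit the go-for-it option with the field-position benefit of a failed attempt, which only widens the gap.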

Nebraska's lone folly in the 4th down decision making department was punting on 4th and 3 from their own 45 yard line. The stats say to go for it, but the difference between going and punting is pretty small. So no problem with the decision. Now let's move on to the 4th quarter.

4th Down Decisions in the 4th Quarter

Team (Score)               Time Left   Distance   Yards to End Zone   Calculator   Coach       Win Prob. Go For It   Win Prob. Kick
Northwestern (Down by 2)   13:53       10         75                  Punt         Punt        20%                   23.6% (Punt)
Nebraska (Up by 2)         12:41       3          74                  Punt         Punt        62.8%                 63.1% (Punt)
Nebraska (Down by 5)       8:42        20         85                  Punt         Punt        9.6%                  16.2% (Punt)
Northwestern (Up by 5)     7:27        6          9                   FG           FG          85%                   85.5% (FG)
Nebraska (Down by 8)       5:29        6          40                  Go for it    Go for it   11.1%                 8% (Punt)


Five 4th down decisions in the 4th quarter, and 5 correct calls. Impressive! It started with an obvious punt from Northwestern on 4th and 10. Then it continued with a not-so-obvious punt by Nebraska on 4th and 3. The stats say to punt, but the difference in win probability is only 0.3%. When the difference is that close, you should consider the game situation. If you're losing, a heavy underdog, or have a really bad defense, you might want to consider an aggressive strategy and go for it. But none of those things applied to Nebraska, so kicking was the correct decision.

And what a kick it was! The punt traveled 63 yards, and after a Northwestern penalty the ball was placed all the way back at the Northwestern 8 yard line. Then, of course, Northwestern proceeded to drive 92 yards for the touchdown. That's just Nebraska's luck. Then, after Nebraska punted on a 4th and 20 (obviously the correct call), Northwestern made a very interesting decision. On a 4th and 6 from the Nebraska 9 yard line, they kicked a field goal. But again, you'll see the win probabilities are very close. So is there any other information that should have tipped this decision one way or the other?

Yes there is.

Teams convert on 4th and 6 about 37% of the time. That's the number the calculator uses. However, if it's 4th and goal from the 6, then that probability drops to 29%. This wasn't 4th and goal, but because the ball was at the 9 yard line it was probably more similar to 4th and goal from the 6 than a typical 4th and 6 from further back on the field. If we use a 4th down conversion rate closer to 29%, then the decision to kick becomes an obvious one. 

Our last 4th down decision came on a situation similar to that we saw in the Maryland/Penn State game. Losing late in the game with the ball around midfield, both Maryland and Nebraska faced 4th downs. The calculator told Maryland to punt, but advised Nebraska to go for it. What gives?

Well, one of the big differences is the distance. Nebraska's 4th and 6 has a higher conversion rate than Maryland's 4th and 10. The other is the score. Maryland was only down by a single point, needing just one possession to win the game. Nebraska was down by 8, needing...well, we don't know how many possessions they needed. It's like Schrödinger's cat—Nebraska is simultaneously losing by both 1 possession and 2 possessions, and only when Nebraska attempts a 2-point conversion will we know which one it is. Too often coaches act like an 8-point deficit is a one-possession game. But you really need to plan for the fact that you might need two possessions, and that's why Nebraska was correct to go for it here. They converted and ended up scoring a touchdown on the drive. However, they missed the two-point conversion. And after 3 Northwestern first downs, the clock expired, and Nebraska found itself yet again on the losing side of a one-score game.

Summary

Each week, I’ll summarize the times coaches disagreed with the 4th down calculator and the difference in expected points between the coach’s decision and the calculator’s decision. I’ll do this only for the first 3 quarters, since I’m tracking expected points and not win probability. I also want to track decisions made on 4th and 1, and decisions made between midfield and the opponent’s 25 yard line. I’ll call this area the “Gray Zone.” These tables will be pretty sparse now, but will fill up as the season goes along. Then we can easily compare the actual outcomes of different decisions in similar situations.

Team Summary

Team             Number of Disagreements   Total Expected Points Lost
Northwestern     8                         5.78
Indiana          5                         3.3
Illinois         7                         3.21
Minnesota        6                         2.82
Rutgers          5                         2.69
Michigan         5                         2.59
Penn State       4                         2.2
Michigan State   5                         2.18
Iowa             3                         1.8
Nebraska         5                         1.78
Wisconsin        3                         1.18
Ohio State       3                         0.92
Purdue           1                         0.24
Maryland         1                         0.04

Northwestern coach Pat Fitzgerald has opened up a sizable lead as the worst 4th down decision maker in the Big Ten. At this point, Northwestern has almost left an entire touchdown on the field. But how has the real-life outcome compared to what we would expect? In the 8 decisions where Fitzgerald disagreed with the calculator, the total expected points for the decisions Fitzgerald made was 11.6 points. Northwestern has actually scored a net of 13 points after those decisions. So the real outcome is pretty close to what we would expect. As for the decisions the calculator said Fitzgerald should have made, those would have given him a total of 17.4 expected points. Obviously we can't know how those decisions would have played out in real life, but the expected number is greater than the actual number of points Northwestern has scored. So we can say that both in theory and in reality, Northwestern is leaving points on the table with their 4th down decision making.

4th and 1

Yards To End Zone   Punts   Avg Next Score After Punt   Go for It   Avg Next Score After Go for It   Field Goals   Avg Next Score After FG
75-90               2       7                           0           0                                *             *
50-74               14      -0.71                       3           4.67                             *             *
25-49               0       0                           5           3.6                              1             -7
1-24                *       *                           7           1.429                            3             3

Surprisingly, we've had only two 4th and 1s inside a team's 25 yard line all year. Both times, the offense punted and was the next team to score a touchdown. Things haven't gone as well for the teams that punt on 4th and 1 between their own 25 yard line and midfield. Their opponent has, on average, been the next team to score, with an average next score of -0.71 points. Sadly, only 3 teams have gone for it on 4th and 1 behind midfield even though that's the optimal decision. Two of those cases were by Ohio State, who converted and scored a touchdown on the drive. The third instance was Indiana, who converted but ended up punting later in the drive. Nobody scored in that game before halftime, giving us an average of 4.67 points for teams that went for it.

We see that teams become much more aggressive once they cross midfield. Out of the sixteen 4th and 1s in opponent territory, teams have made the optimal decision on 12 of them. Hooray! And part of me is secretly happy that the team that kicked a 42-yard field goal on 4th and 1 ended up missing it and having their opponent score a touchdown next. Wondering which team that was? You guessed it. Northwestern! In fact, Northwestern is one of the teams that kicked a field goal on 4th and 1 inside the 25 yard line too! There's a reason they're leading the Big Ten in bad 4th down decision making.

The Gray Zone (4th downs 25-49 yards to the end zone)

4th Down Distance   Punts   Avg Next Score After Punt   Go for It   Avg Next Score After Go for It   Field Goals   Avg Next Score After FG
1                   0       0                           5           3.6                              1             -7
2-5                 13      0.69                        9           -0.89                            3             -0.33
6-9                 14      1.29                        7           -1.57                            5             2.6
10+                 18      -1.39                       1           7                                8             1.86

Thankfully, teams are correctly going for it on 4th and 1 in the Gray Zone. But once the distance becomes 2 yards or more, punting reigns supreme. And the teams that are being aggressive aren't being rewarded. Going for it has the lowest average next score for both the 2-5 and the 6-9 groups. Usually, going for it is the optimal decision, especially when the distance is only 2-5 yards. So we'll have to see if things even out as the season goes along.

Beware of Phantom Degrees of Freedom that Haunt Your Regression Models!

[Image: Demon]
As Halloween approaches, you are probably taking the necessary steps to protect yourself from the various ghosts, goblins, and witches that are prowling around. Monsters of all sorts are out to get you, unless they’re sufficiently bribed with candy offerings!

I’m here to warn you about a ghoul that all statisticians and data scientists need to be aware of: phantom degrees of freedom. These phantoms are really sneaky. You can be out, fitting a regression model, looking at your output, and thinking everything is fine. Then, whammo, these phantoms get you! They suck the explanatory and predictive power right out of your regression model but, deviously, leave all of the output looking just fine. Now that’s truly spooky!

In this blog post, I’ll show you how these phantoms work and how to avoid their dastardly deeds!

What Are Normal Degrees of Freedom in Regression Models?

I’ve written previously about the dangers of overfitting your regression model. An overfit model is one that is too complicated for your data set.

You can learn only so much from a data set of a given size. A degree of freedom is a measure of how much you’ve learned. Your model uses these degrees of freedom with every parameter that it estimates. If you use too many, you’re overfitting the model. The end result is that the regression coefficients, p-values, and R-squared can all be misleading.

You can detect overfit models by looking at the number of observations per parameter estimate and assessing the predicted R-squared. However, these methods won’t necessarily detect the misbegotten effects of summoning an excessive number of phantom degrees of freedom!
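
For readers who want to see where a predicted R-squared comes from, it can be computed from the PRESS (leave-one-out) statistic. Here is a minimal numpy sketch—a hand-rolled illustration, not Minitab's implementation, and the example data are made up:

```python
import numpy as np

def predicted_r2(X, y):
    """Predicted R-squared via the PRESS statistic: leave-one-out residuals
    are ordinary residuals scaled by the leverages from the hat matrix."""
    X1 = np.column_stack([np.ones(len(y)), X])   # design matrix with intercept
    H = X1 @ np.linalg.pinv(X1.T @ X1) @ X1.T    # hat matrix
    resid = y - H @ y                            # ordinary residuals
    press = np.sum((resid / (1 - np.diag(H))) ** 2)
    return 1 - press / np.sum((y - y.mean()) ** 2)

# Example: a genuine linear signal plus noise. Predicted R-squared comes out
# a bit below the ordinary R-squared, as expected for an honest model.
rng = np.random.default_rng(0)
X = rng.normal(size=(29, 3))
y = X @ np.array([1.0, 0.5, -0.7]) + rng.normal(scale=0.5, size=29)
print(round(predicted_r2(X, y), 2))
```

A badly overfit model will often show a predicted R-squared far below the ordinary R-squared, or even a negative one.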

In the degrees of freedom (DF) column in the ANOVA table below, you can see that this regression model uses 3 degrees of freedom out of a total of 28. It appears that this model is fine. Or is it? <Cue evil laugh!>

[Image: Analysis of variance table for a regression model]

What Are Phantom Degrees of Freedom?

Phantom degrees of freedom are devilish because they latch onto you through the manner in which you settle on the final model. They are not detectable in the output for the final model even as they haunt your regression models.

[Image: Guy surrounded by demons]

The dangers of invoking too many phantom degrees of freedom!

Every time your incantation adds or removes predictors from a model based on a statistical test, you invoke a phantom degree of freedom because you’re learning something from your data set. However, even when you summon many phantom degrees of freedom during the model selection process, they are not evident in Minitab’s output for the final model. That is what makes them phantoms.

When you invoke too many phantoms, your regression model becomes haunted. This occurs because you’re performing many statistical tests, and every statistical test has a false positive rate. When you try many different models, you're bound to find variables that appear to be significant but are correlated only by chance. These relationships are nothing more than ghostly apparitions!

To protect yourself from this type of bewitching, you need to understand the environment that these phantoms inhabit. Phantom degrees of freedom have the strongest powers when you have a small-to-moderate sample size, many potential predictors, correlated predictors, and when the light of knowledge does not illuminate your conception of the true model.

In this scenario, you are likely to fit many possible models, adding and removing different predictors, and testing curvature and interaction terms in an attempt to conjure an answer out of the darkness. Perhaps you use an automatic incantation procedure like stepwise or best subsets regression. If you have multicollinearity, the parameter estimates are particularly unhinged.

The ANOVA table we saw above appears to be perfectly normal, but it could be haunted. To divine the truth, you must understand the entire ritual that incited the final model to materialize. If you start out with 20 variables, a sample size of 29, and fit many models to see what works, you could conjure a possessed model beguiling you to accept false conclusions.

In fact, this method of dredging through data to see what sticks casts such a diabolical spell that it can manifest a statistically significant regression model with a high R-squared from completely random data! Beware—this is the environment that the phantoms inhabit!
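
Here is a small simulation of that diabolical spell—a sketch with an arbitrary seed and an arbitrary selection rule, chosen purely for illustration. We generate pure noise, keep the five candidate predictors that happen to correlate best with the response, and fit a model to the survivors:

```python
import numpy as np

rng = np.random.default_rng(31)
n, k = 29, 20
X = rng.normal(size=(n, k))   # 20 candidate predictors of pure noise
y = rng.normal(size=n)        # a response with no real relationship to X

# "Data dredging": keep the 5 predictors most correlated with y by chance.
r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(k)])
best = np.argsort(np.abs(r))[-5:]

# Fit ordinary least squares on the survivors and compute R-squared.
X1 = np.column_stack([np.ones(n), X[:, best]])
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
r2 = 1 - np.sum((y - X1 @ beta) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(r2, 2))  # a deceptively "good" fit conjured from pure noise
```

The final model's output shows none of the screening that produced it—those are the phantom degrees of freedom.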

How to Protect Yourself from the Phantom Degrees of Freedom

To protect yourself from phantom degrees of freedom, information and advance planning are your best talismans. Use the following rites to shine the light of truth on your research and to guide yourself out of the darkness:

  • Conduct prior research about the important variables and their relationships to help you specify the best regression model without the need for data mining.
  • Collect a large enough sample size to support the level of model complexity that you will need.
  • Avoid data mining and keep track of how many phantom degrees of freedom that you raise before arriving at your final model.

For more information about avoiding haunted models, read my post about How to Choose the Best Regression Model.

Happy Halloween!

 

"Buer." Licensed under Public Domain via Commons.

"The Thing" and Your Data: Meet the Shapeshifter Distribution

Since it's the Halloween season, I want to share how a classic horror film helped me get a handle on an extremely useful statistical distribution. 

The film is based on John W. Campbell's classic novella "Who Goes There?", but I first became familiar with it from John Carpenter's 1982 film The Thing.

In the film, researchers in the Antarctic encounter a predatory alien with a truly frightening ability: it can assume the form of any living thing it touches. It's a shapeshifter. A mimic with the uncanny ability to take on the characteristics of other beings. Soon, the researchers realize that they can no longer be sure who among them is really human, and who is not. 

So what does that have to do with statistics? Meet the Weibull distribution, or, as I like to think of it, "The Thing" of statistical distributions. The Weibull distribution can take on the characteristics of many other distributions. The good news is, unlike The Thing, the Weibull distribution's ability to shapeshift is very helpful. 

The Weibull Distribution Can't Be Nailed Down 

Because the Weibull distribution can assume the form of many different distributions, it's a favorite among quality practitioners and engineers, and it's by far the most commonly used distribution for modeling reliability data. Just like "The Thing," the Weibull distribution is adaptable enough to be able to pass for other things—in this case, a variety of other distributions. 

Got right-skewed, left-skewed, or symmetric data? You can model it with Weibull, no problem. That flexibility lets engineers use the Weibull distribution to evaluate the reliability of everything from ball bearings to vacuum tubes.

The Weibull distribution can also model hazard functions that are decreasing, increasing, or constant, so it can be used to model any phase of an item’s lifetime, from right after launch to the end of its usefulness.
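
A quick numeric sketch of that hazard behavior, using the standard Weibull hazard h(t) = (shape/scale)·(t/scale)^(shape−1) with threshold 0. The scale of 10 and the two sample times are arbitrary assumptions for illustration:

```python
def weibull_hazard(t, shape, scale=10.0):
    """Instantaneous failure rate of a Weibull distribution (threshold = 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

# Whether the hazard falls, stays flat, or rises depends only on the shape.
for shape, trend in [(0.5, "decreasing"), (1.0, "constant"), (2.0, "increasing")]:
    early, late = weibull_hazard(5, shape), weibull_hazard(15, shape)
    print(f"shape={shape}: h(5)={early:.3f}, h(15)={late:.3f} ({trend})")
```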

How "The Thing," the Weibull Curve, Changes Shape

To illustrate how flexible the Weibull distribution is, let's look at some examples in Minitab Statistical Software. (Care to follow along, but don't have Minitab? Just download the free 30-day trial.) 

Start by choosing Graph > Probability Distribution Plot, which brings up this dialog box: 

[Image: probability distribution plots dialog]

Select "View Single," and then choose "Weibull" in the Distribution drop-down menu. The subsequent dialog box will let you specify three parameters: shape, scale, and threshold.  


The threshold parameter indicates the distribution's shift away from 0. A negative threshold will shift the distribution to the left of 0, while a positive threshold shifts it to the right. (All data must be greater than the threshold.)

The scale parameter is the 63.2 percentile of the data, and this value defines the Weibull curve's relation to the threshold, in the same way that the mean defines a normal curve's position. For our purposes, let's say we're testing reliability, and that 63.2 percent of the items we test fail within the first 10 hours following the threshold time. So our scale would be 10.
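
The 63.2% figure falls straight out of the Weibull CDF, 1 − exp(−((x − threshold)/scale)^shape). A minimal check in plain Python (threshold 0, scale 10, with the shape values chosen arbitrarily):

```python
import math

def weibull_cdf(x, shape, scale, threshold=0.0):
    """CDF of the three-parameter Weibull distribution, for x >= threshold."""
    return 1.0 - math.exp(-(((x - threshold) / scale) ** shape))

# At x = scale (10 hours past the threshold), the exponent is exactly 1 for
# any shape, so the CDF is always 1 - exp(-1), i.e. the 63.2 percentile.
for shape in (0.4, 1.0, 2.0, 3.5):
    print(round(weibull_cdf(10, shape, scale=10), 3))  # → 0.632 every time
```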

The shape parameter, unsurprisingly enough, describes the Weibull curve's shape. Changing the  shape value enables you to use Weibull to model the characteristics of many different life distributions.

Entire books have been written about how these three parameters affect the characteristics of the Weibull distribution, but for this discussion we'll focus on how the value of shape can influence the curve. I'll show these examples one-by-one, but you can have Minitab display them together on a single plot if you select "Vary Parameters" instead of "View Single" in the first dialog box shown above. 

Weibull with Shape Less Than 1

Let's start with a shape between 0 and 1. You may choose any value you like in that range. I'm going to enter 0.4, and when I press "OK", Minitab gives me the graph below:

The graph shows that probability decreases exponentially from infinity. If you're thinking about reliability, or the rate of failures, the Weibull distribution with these parameters would fit data that have a high number of initial failures. Then the failures decrease over time as the defective items are eliminated from the sample. These early failures are frequently referred to as "infant mortality," because they occur in the early stage of a product's life.  

[Image: Weibull distribution with shape between 0 and 1]

Weibull with Shape = 1

When the shape is equal to 1, the Weibull distribution decreases exponentially from 1/alpha, where alpha = the scale parameter. In other words, the failure rate remains fairly consistent over time. This Weibull distribution's shape is applicable to data about random failures and multiple-cause failures, and can be used to model the useful life of products. 

[Image: Weibull distribution with shape = 1]

Weibull with Shape Between 1 and 2

When the shape parameter is between 1 and 2, Weibull crests quickly, then decreases more gradually. The most rapid failure rate occurs initially. This shape indicates failures due to early wear-out.

[Image: Weibull distribution with shape value between 1 and 2]

Weibull with Shape = 2

When the shape parameter is equal to 2, Weibull approximates a linearly increasing failure rate, where the risk of wear-out failure increases steadily over the product's lifetime. (This variant of the Weibull distribution is also referred to as the Rayleigh distribution.)

[Image: Weibull distribution with shape = 2, AKA the Rayleigh distribution]

Weibull with Shape Between 3 and 4

When the shape parameter falls between 3 and 4, Weibull becomes symmetric and bell-shaped, like the normal curve. For reliability, this form of the distribution suggests rapid wear-out failures during the final period of product life, when most failures happen.

[Image: Weibull distribution, symmetric, with shape value = 3.5]

Weibull with Shape > 10

When the shape is more than 10, the Weibull distribution is similar to an extreme value distribution. This form of the distribution can approximate the final stage of a product's life. 

[Image: Weibull distribution with shape value = 20, skewed]

Will "The Thing," the Weibull, Always Win?

When it comes to analyzing reliability, Weibull is the de facto default distribution, but other distribution families also can model a variety of distributional shapes. You want to find the distribution that gives you the very best fit for your data, and that may not be a Weibull. For instance, the lognormal distribution is typically used to model failures caused by chemical reactions or corrosion.

To assess the fit of your data using Minitab’s Distribution ID plot, you can use Stat > Reliability/Survival > Distribution Analysis (Right-Censoring or Arbitrary Censoring). If you want more details about that, check out this post on identifying your data's distribution.  


The Top 10 Halloween Content Creators on YouTube

I generally consider myself old-fashioned, and Halloween is no exception: I dress up, pass out candy, sit down in front of the television to watch "It’s the Great Pumpkin, Charlie Brown" on ABC, and read Minitab blog posts from Halloweens past.

But some younger folks have told me that I’m missing out, so I’m trying to broaden my horizons to include YouTube. I want to use data to decide what to look at, of course, so I’ve turned to tubularlabs.com. They keep a top 10 list of the YouTube creators who have the most Halloween views. As of 10/30/2015 at 9:00 a.m., here’s what that data looks like on a Minitab bar chart:

[Image: Minitab bar chart. Nursery rhymes are extremely popular on YouTube. Check out Booya.]

Not having much experience with YouTube, the only name I recognized was Jimmy Kimmel. After some investigation, here are some things I learned:

OlafVids is a parody series featuring the princesses from Disney’s "Frozen." If you’ve always wanted to see Jack Frost court Elsa, this is a good option. The Halloween appeal is not clear to me, but you could spend 40 seconds of your life watching Evil Elsa and Maleficent dance to exhaustion, if you want.

Booya, oh my genius, Haunted House, Kids Channel, and Kids Tv offer nursery rhyme content with Halloween cartoons. While this might be okay for some, I’m going to pass. Even when the singers change, there’s only so much “Wheels on the Bus” I can take.

Mejores Juguetes offers content for youngsters in Spanish, but not nursery rhymes. This whole week features Halloween content where you can watch someone else play with a toy that fits a Halloween theme. I don’t immediately get the appeal of watching a 10-20 minute video of someone else playing with stuff you want to have, but the Halloween content is easy to find.

[Image: Phantasm 5 tall.jpg]
Keyblade is also in Spanish, but features content of interest to older individuals. Keyblade seems to produce a lot of rap battles, either about video games or between video game characters. Several of them are blocked if you’re in restricted mode, including the epic rap battle between King Kong and Godzilla, which, as an old-fashioned guy, I’m guessing is responsible for Keyblade’s Halloween popularity. Younger folks tell me it might be the Slenderman content, but I can’t really believe anyone finds that guy frightening. Give me the Tall Man any day.

SA Wardega has over 152 million views of a video called “Mutant Giant Spider Dog.” Someone’s dog is dressed up in a cute spider costume for Halloween and featured in a wordless, 3-minute horror movie. This I get, although I’m not sure whether it rises to the level of a new Halloween tradition. Time will tell.

 

You know exactly what to expect from Jimmy Kimmel. Here’s a video of people telling their kids they ate all of their Halloween candy.

One of the nice features in Minitab is that when you apply a value order to a column, that order becomes the default order for the categories on a bar chart. Here’s the above chart in order by the amount of time I might spend on that YouTube channel for Halloween.

[Image: For Halloween, I'm most likely to watch people tell their kids they ate all of their Halloween candy.]

Want to apply a value order to your own chart? Check out Change the value order in Minitab output in the Support Center. Want to see more kids upset about their Halloween candy? That’s here.

Happy Halloween!

Happy World Quality Month!

Did you know that November is World Quality Month? The American Society for Quality is once again heading up this year’s festivities.

Throughout the month of November, ASQ will be promoting the use of quality tools in businesses, communities, and institutions all over the world. You can check it out at http://asq.org/world-quality-month/.

Here at Minitab, we’re also pretty excited about World Quality Month. Below are a couple of ways we’re hoping that you’ll celebrate with us.

Share your quality story

We want to hear from you!

How are you using quality tools and functions in Minitab software to achieve quality improvement goals at work or at home? Tell us your story here. Selected stories may be featured right here on the Minitab Blog, on Minitab.com, or in a future issue of Minitab News.

Attend a webinar or watch a recorded webcast

It’s always a good idea to brush up on your data analysis skills, especially during World Quality Month! Check out our upcoming free webinar offerings, including preview trainings for Design of Experiments and Process Capability. Register Now!

Can’t make a live webinar? View our recorded webcasts instead.

Read about the success of others

We’re very proud to recognize the accomplishments of our customers. The companies that use our products and services come in all sizes, represent all industries, and are located all over the world. Get ideas and read about how others are improving quality. Read Case Studies.

Are you doing anything special to commemorate World Quality Month? Tell us in the comments section!

A Swiss Army Knife for Analyzing Data

Easy access to the right tools makes any task easier. That simple idea has made the Swiss Army knife essential for adventurers: just one item in your pocket gives you instant access to dozens of tools when you need them.  

If your current adventures include analyzing data, the Editor menu in Minitab 17 is just as essential.

Whether you're organizing a data set, sifting through Session window output, or perfecting a graph, the multifaceted Editor menu adapts so you never need to search for the perfect tool.

Minitab’s Dynamic Editor Menu

Any job goes more smoothly when you have easy access to the right tools. A surgeon can better concentrate on an operation when a nurse is there with all of the necessary implements. A golfer can better focus on the perfect swing when a caddy waits nearby with the perfect club.

You may not have an assistant who drops the right tool into your hand at the moment you need it, but Minitab’s dynamic Editor menu comes close. Whether you’re organizing a data set, sifting through Session window output, or perfecting a graph, the Editor menu adapts so that you never have to search for the perfect tool.

Only the tools that you need

The Editor menu contains only the tools that apply to the current task. When you’re working with a data set, the menu contains only items for use in the worksheet; when a graph is active, the menu contains only graph-related tools; and so on.

Graphing

When a graph window is active, the Editor menu contains over a dozen graph tools. Here are a few of them.


ADD

Use Editor > Add to add reference lines, labels, subtitles, and much more. The contents of the Add submenu change depending on the type of graph.


MAKE SIMILAR GRAPH


The editing features in Minitab graphs make it easy to create a graph that looks just right. But it may not be easy to reproduce that look a few hours (or a few months) later.

With most graphs, you can use Editor > Make Similar Graph to produce another graph with the same edits, but with new variables.

Entering data and organizing your worksheet

When a worksheet is active, the Editor menu contains tools to manipulate both the layout and contents of your worksheet. You can add column descriptions; insert cells, columns or rows; and much more, including the items below.

VALUE ORDER


By default, Minitab displays text data alphabetically in output. But sometimes a different order is more appropriate (for example, “Before” then “After”, instead of alphabetical order). Use Editor > Column > Value Order to ensure that your graphs and other output appear the way that you intend.

ASSIGN FORMULA TO COLUMN


You can assign a formula to a worksheet column that updates when you add or change data.


Session window

As the repository for output, the Session window is already an important component of any Minitab project, but the Editor menu makes it even more powerful.

ENABLE COMMANDS


Most users rely on menus to run their analyses, but you can extend the functionality of Minitab and save time on routine tasks with Minitab macros. Enable the command language to familiarize yourself with the commands that are generated with each analysis, which opens the door to macro writing.

NEXT COMMAND / PREVIOUS COMMAND / FIND


After you run several analyses, you may have a great deal of output in your Session window. This group of items makes it easy to find the results that you want, regardless of project size.

Graph brushing

Graph exploration sometimes calls for graph brushing, which is a powerful way to learn more about the points on a graph that interest you. Here are two of the specialized tools in the Editor menu when you are in “brushing mode”.

SET ID VARIABLES


It’s easy to spot an outlier on a graph, but do you know why it’s an outlier? Setting ID variables allows you to see all of the information that your dataset contains for an individual observation, so that you can uncover the factors that are associated with its abnormality.

CREATE INDICATOR VARIABLE


As you brush points on a graph, an indicator variable “tags” the observations in the worksheet. This enables you to identify these points of interest when you return to the worksheet.

Putting the Dynamic Menu Editor to use

Working on a Minitab project can feel like many jobs rolled into one—data wrestler, graph creator, statistical output producer. Each task has its own challenges, but in every case the Editor menu should be your toolbox of choice.

Surgeons have nurses. Golfers have caddies. You’ve got Minitab. The next time you’re looking for a tool to make your job a little easier, let the Editor menu be your personal assistant.

Big Ten 4th Down Calculator: Week 6

4th and 1. It's a situation where the Big Ten 4th down calculator will never say to kick (unless, of course, it's the end of the game and a field goal will tie or take the lead). But what would it take to have the statistics suggest a punt? The key here is how far the punt travels. Last year the average Big Ten punt traveled about 40 yards. Using this value, in your own territory you'll score about 0.56 more points by going for it over kicking. So how far would the punt have to travel for the calculator to suggest punting?

48 yards.

The best punter in the NFL this year nets only 46 yards per punt. So good luck ever finding a college punter who can average 48 yards a punt. And keep in mind this all assumes that if you successfully convert the 4th and 1, you gain only one yard. But quite often, you'll gain more yards, putting yourself in an even better scoring position than the calculator takes into account. In fact, Ohio State had a 4th and 1 at their own 35 yard line earlier this year, and they scored a 65 yard touchdown on the play.

If all that wasn't enough, there is more bad news for the punting enthusiasts. It appears that my value of 40 yards for a punt didn't include any return yardage. Luckily, this year ESPN has statistics for both how far the punt traveled and the net yardage of the punt. The latter value is what we want to use for the 4th down calculator, since it best represents where the other team will start their drive. And this year, only two Big Ten teams (Ohio State and Iowa) have a net punt yardage that is 40 yards or more. The average value in the Big Ten is 37 yards. So moving forward, I'm going to use this value for net punt yardage instead of 40. 

This change suggests you're giving up even more points by punting on 4th and 1 (0.79 points) and 4th and 2 (0.41 points). It also changes the calculator's decision on 4th and 3 from punting to going for it! However, the difference in expected points is 0.068, which is so close to 0 that, really, either decision on 4th and 3 is fine. And I bet you'll never guess which decision coaches will make 99% of the time!

If you're new to this and want a quick recap of what exactly the 4th down calculator is, please see the beginning of last week's post for details. Now let's get on to the games!

Penn State 39 - Illinois 0

Penn State completely dominated Illinois in this game.

4th Down Decisions in the First 3 Quarters

Team         4th Downs   Disagreements with Calculator   Punts   Field Goals   Conversion Attempts   Expected Points Lost
Illinois     10          0                               10      0             0                     0
Penn State   6           3                               5       1             0                     1


Illinois had 10 punts, including 4 in Penn State territory. And yet the distance was so long that the calculator didn't disagree with a single one of them. The best 4th down distance Illinois had was 4th and 9, and four of their 4th downs required them to get more than 15 yards! As I said, complete dominance by Penn State in this game.

Penn State had three incorrect 4th down decisions. None of them was terrible, but nonetheless they added up to a whole point that Penn State left on the board. Although it's not like they needed it. Early on, Penn State punted on a 4th and 2 from their own 40 when the model says you should go for it. They later punted on 4th and 4 from their own 43 yard line. Again, the model says they should have gone for it, but the difference in expected points is only 0.07 points, and at the time Penn State was up 22-0 in the 3rd quarter. With the expected points that close and your team already holding a large lead, there really isn't any issue with a punt there.

Penn State's worst 4th down decision came on their next drive. With a 4th and 4 at the Illinois 11 yard line, they kicked a field goal instead of going for it. Sure, teams convert on 4th and 4 less than half of the time (46%), but even if they failed Illinois would start the ball at their own 11 yard line. With that kind of field position, Penn State would still likely be the next team to score. The decision cost Penn State just over half a point. It didn't matter this game, but with their remaining opponents being Northwestern, Michigan, and Michigan State, the Nittany Lions will most likely need all the points they can get.

Wisconsin 48 - Rutgers 10

More domination—by the Badgers over the Scarlet Knights.

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with the 4th Down Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
| --- | --- | --- | --- | --- | --- | --- |
| Rutgers | 8 | 1 | 6 | 1 | 1 | 0.52 |
| Wisconsin | 5 | 1 | 3 | 1 | 1 | 0.19 |

With the game only 10-0 late in the 1st quarter, Rutgers had a 4th and goal at the Wisconsin 3 yard line. They kicked a field goal when the calculator says on average you'll score half a point more by going for it. The decision was bad to begin with, but it's even worse when you consider the game situation. Rutgers was a 21-point underdog, so they should have been making aggressive 4th down decisions in an attempt to pull an upset. Even if the statistics slightly favored a field goal, Rutgers should have considered going for it. But kicking a field goal when going for it was the optimal choice? Well, that's not how you pull an upset. Opportunity missed.  

However, Rutgers did play aggressively later in the game. On a 4th and 5 at the Wisconsin 29 yard line, they decided to go for it. The calculator suggests kicking a field goal, but the difference in expected points between kicking and going for it is only 0.27 points, and at the time Rutgers was down 21 points. Considering what I said previously about Rutgers having to make aggressive 4th down decisions, I think going for it was absolutely the correct decision. So although technically it's a disagreement with the 4th down calculator, I'm not going to count it against Rutgers.

Wisconsin's disagreement with the calculator was an interesting one. Earlier in the game, they had a 4th and 14 at the Rutgers 31 yard line. They correctly decided to kick a 49 yard field goal and they made it. Later in the game, they had a 4th and 7 at the Rutgers 31 yard line. Instead of kicking another field goal from the same distance, they decided to go for it. The calculator suggests kicking the field goal (especially since we already know their kicker has that kind of range), but you'll see that the difference in expected points is only 0.19. So the decision to go for it isn't a bad one. Keep in mind the calculator assumes you'll pick up exactly 7 yards if you convert, but in reality you'll often gain even more yards, which makes the decision to go for it even stronger. That's exactly what happened here, as Wisconsin threw a 31-yard touchdown pass on the play, taking a 17-3 lead and never looking back. 

Iowa 31 - Maryland 15

Iowa stays undefeated, and their last 4 games come against teams with a combined 3 wins in the Big Ten. A trip to the Big Ten championship game looks likely for the Hawkeyes.

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with the 4th Down Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
| --- | --- | --- | --- | --- | --- | --- |
| Maryland | 6 | 0 | 5 | 0 | 1 | 0 |
| Iowa | 6 | 0 | 4 | 1 | 1 | 0 |

Lots of good 4th down decision-making in this game. Iowa correctly went for it on a 4th and 5 at the Maryland 30 yard line. Maryland went for it and converted on a 4th and 1. And later on that same drive Maryland had 4th and 6 at the Iowa 35. The calculator suggests kicking a field goal, but the expected points for going for it are just slightly less than kicking a field goal. And at the time, Maryland was down 21-0, so they needed points, making going for it the correct decision. And that's exactly what the Terrapins intended to do.

Key word: intended.

Maryland wasn't able to get the play off in time, and got a 5 yard delay-of-game penalty. With a 4th and 11 at the Iowa 40, they correctly punted. But their intended aggressiveness didn't go unrewarded, as the next score in the game was a Maryland touchdown, which they followed up with a surprise onside kick! The onside kick was not successful, but as heavy underdogs Maryland was making the correct decision to play aggressively and take some chances. Unfortunately, it didn't work out for them, as Iowa took control in the 4th quarter and went on to a double-digit victory.

Purdue 55 - Nebraska 45

Are we sure this isn't a basketball score?   

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with the 4th Down Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
| --- | --- | --- | --- | --- | --- | --- |
| Nebraska | 4 | 1 | 3 | 1 | 0 | 0.79 |
| Purdue | 5 | 1 | 4 | 0 | 1 | 0.07 |

In the second quarter, Nebraska punted on a 4th and 1. Sure, it was at their own 29 yard line, but teams successfully convert on 4th and 1 so often that it's well worth the risk of turning the ball over in your own territory. The model suggests Nebraska lost 0.79 points by punting. Luckily for the Cornhuskers, the punt traveled 59 yards, Purdue went 3 and out on their next possession, and neither team scored again until after halftime. Unluckily for the Cornhuskers, their defense gave up 55 points and they badly could have used points on that possession.

Purdue punted on a 4th and 3 on their own 19 yard line. Because the calculator now uses a net yardage of 37 yards per punt instead of 40, it's always going to suggest going for it—even when you're at your own 19 yard line. However, you'll see that the difference is almost 0, so really, either decision is fine. And in this particular case, just under 2 minutes remained until halftime. Even if Purdue converted, they still had a long way to go to score in less than 2 minutes. And if they turned the ball over, Nebraska would have great field position where the time remaining wouldn't matter. So punting was definitely the correct call, and I won't include the decision in the team summary. 

Speaking of correct calls, Purdue went for a 4th and 1 at the Nebraska 10 yard line. This should always be an easy decision: you should go for it every time. But you'll often hear announcers say that coaches should "take the points" and kick the field goal. Well, making that decision at your opponent's 10 yard line would cost you 1.56 points. Purdue made the correct call and went for it over kicking. They converted, and scored a touchdown on the very next play. Take the points, indeed!

4th Down Decisions in the 4th Quarter

| Team | Time Left | 4th Down Distance | Yards to End Zone | Calculator Decision | Coach Decision | Win Probability Go For It | Win Probability Punt |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Purdue (Up by 11) | 6:19 | 1 | 37 | Go for it | Go for it | 98.4% | 98.3% |

There's just one 4th down decision in the 4th quarter worth mentioning. Purdue went for it on a 4th and 1 to try to put the game away. The calculator suggested going for it, but Purdue's win probability was so high that either decision would have been fine. But what if things were a little different? Let's say that everything about this situation were the same, except the score was tied. In that case, Purdue would have a win probability of 58.7% by going for it, and 47.1% by punting. That just shows how much the margin can affect the win probabilities. 

In the actual game, Purdue threw an incomplete pass on 4th and 1, but their aggressiveness wouldn't go unrewarded. Nebraska's very next play was an interception that Purdue returned to the 6 yard line. And the next play put Purdue in the end zone, effectively ending the game.

Michigan 29 - Minnesota 26

Oh man, do we have some things to talk about here.

4th Down Decisions in the First 3 Quarters

| Team | 4th Downs | Disagreements with the 4th Down Calculator | Punts | Field Goals | Conversion Attempts | Expected Points Lost |
| --- | --- | --- | --- | --- | --- | --- |
| Michigan | 5 | 1 | 4 | 0 | 1 | 0.41 |
| Minnesota | 6 | 2 | 4 | 2 | 0 | 0.58 |

With the game still scoreless in the first quarter, Minnesota found themselves with a 4th and goal at the 5 yard line. This is a place where you'll almost always see coaches kick a field goal, but the statistics actually slightly favor going for it. However, the difference is only 0.17 points, so coaches should consider other factors in their decision making. In this game Minnesota was a 2 touchdown underdog, so they should have been aggressive in their 4th down decision making and gone for it. But instead they followed the norm and kicked the field goal.

Michigan didn't have a single 4th down until there was less than 6 minutes left in the 2nd quarter. But that 4th down was their one disagreement, as they punted on 4th and 2 near midfield when the calculator suggests going for it. And the result didn't work out well for the Wolverines, as after the punt it took Minnesota all of 3 plays to score a touchdown.

With a minute left in the first half, Michigan had a 4th and 6 at the Minnesota 37 yard line. The calculator suggests a field goal, but the Michigan kicker has never attempted a field goal of 50 yards or more, so it's safe to assume a 54 yard field goal is not in his range. With the field goal ruled out, the correct decision is to go for it, and that's exactly what Michigan did. But the correct decision still ended in a bad result for Michigan, as they lost a fumble on the play and Minnesota was able to kick a field goal right before halftime.

Minnesota's other disagreement came on a 4th and 2 near midfield. Minnesota decided to punt when they should have gone for it, a decision that cost the Gophers 0.41 points. Again, being a two touchdown underdog, they should have been more aggressive.

4th Down Decisions in the 4th Quarter

| Team | Time Left | 4th Down Distance | Yards to End Zone | Calculator Decision | Coach Decision | Win Probability Go For It | Win Probability Kick/Punt |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Minnesota (Up by 2) | 11:43 | 5 | 29 | Go for it | FG | 55% | 53.5% (FG) |
| Michigan (Down by 5) | 10:04 | 4 | 69 | Go for it | Punt | 25.8% | 24.9% (Punt) |
| Minnesota (Up by 5) | 8:36 | 19 | 89 | Punt | Punt | 39.8% | 52% (Punt) |
| Minnesota (Down by 3) | 0:02 | 1 | 1 | Go for it | Go for it | 58.6% | 52.5% (FG) |

Minnesota missed another chance to be aggressive when they kicked a 47 yard field goal on 4th and 5. A field goal is far from a sure thing here, as Big Ten kickers make a 47 yard field goal about 60% of the time. And a field goal still leaves Michigan the opportunity to take the lead with a touchdown. Going for it leaves the possibility of Minnesota scoring a touchdown, which would force Michigan to score twice. And at the very least a successful conversion would take more time off the clock and set up an easier field goal attempt.

After Minnesota made their field goal attempt, Michigan had a 4th and 4 at their own 31 with 10 minutes left. The calculator actually suggests Michigan should have gone for it here. If you're losing in the 4th quarter, odds are you're going to have to go for it on 4th down at some point. You so often see coaches punt on 4th and short, only to have to go for it on 4th and long later in the game. Fourth and 4 is pretty manageable, so under normal circumstances Michigan should have gone for it. But these weren't normal circumstances.

Remember, the calculator's recommendations aren't meant to be written in stone. You should also consider the game situation before making a decision. At the time, Michigan had put in their backup quarterback because of an injury to their starter, and the backup quarterback had yet to complete a pass. Putting the entire game on his shoulders right here was probably a little too much to ask. Plus, the strength of this Michigan team is their defense. It's hard to quantify how much those factors affect the probabilities, but I think it's enough to make punting the correct decision here. The Michigan defense then rose to the occasion, forcing Minnesota into a 4th and 19, where a punt was absolutely the correct decision.

Remember when I said that often you'll see coaches punt on 4th and short only to have to go for it on 4th and long later? Well, this was not one of those times. Michigan was able to score the go-ahead touchdown without having to go for it on a single 4th down. Minnesota got the ball back and drove down the field. They had a 4th and 5 with 1:35 left where going for it was such an easy decision I didn't even include it in the table. Then, after a big pass play and some terrible clock management, Minnesota had a 2nd and goal at the 1 yard line with two seconds left in the game. This sets up quite the epic decision, as Minnesota decided to go for the win instead of kicking the game tying field goal. 

With sports analysis, too often we use the result to decide whether a decision was correct or not. Minnesota was stopped short of the end zone, and immediately everybody questioned whether they should have kicked the field goal instead. But the coach couldn't tell the future. So to decide whether his decision was correct or not, we should only use the information that was available to him at the time. And that means we have to ignore the result. 

With two seconds left, you're only getting off one play, so we can treat this like a 4th and goal at the 1. Big Ten teams convert 4th and goal at the 1 58.6% of the time, so that is where the win probability for going for it comes from. To calculate the win probability for kicking the field goal and going to overtime, I multiplied 0.937 by 0.56. The 0.937 is the probability that Minnesota makes the field goal. It's pretty high, but remember that kicks are never automatic. Even the most routine kick can end in disaster, as the Michigan/Michigan State game showed us. The 0.56 is Minnesota's probability of winning in overtime, as I have previously found that home field advantage still exists in college football overtimes.
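The arithmetic behind those two win probabilities is simple enough to check by hand, using only the figures quoted above:

```python
# Win probability for each choice on the final play, using the numbers
# quoted above (Big Ten conversion and kicking rates).
p_convert = 0.586   # Big Ten teams convert 4th and goal at the 1 this often
p_make_fg = 0.937   # probability the short field goal is good
p_win_ot  = 0.56    # home team's win probability in college OT

wp_go   = p_convert             # score the touchdown and win outright
wp_kick = p_make_fg * p_win_ot  # make the kick, then win in overtime

print(f"Go for it: {wp_go:.1%}")   # 58.6%
print(f"Kick FG:   {wp_kick:.1%}") # 52.5%
```

Going for it wins about 6 percentage points more often, and that gap would only grow if you believe the favorite (Michigan) wins more than its share of overtimes.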

One thing that this doesn't consider is the fact that Michigan was a heavy favorite going into this game. Does the favorite win more often in college football overtimes, or is it closer to 50/50 (after accounting for home field advantage)? I don't know the answer, and couldn't find anything about it online. But if favorites are more likely to win, then all this would do is lower the win probability for kicking a field goal. And since it's already lower than the probability of going for it, we can safely say that Minnesota made the correct decision to go for the win at the end of the game.

It's just too bad the result didn't work out for them.

Summary

Each week, I’ll summarize the times coaches disagreed with the 4th down calculator and the difference in expected points between the coach’s decision and the calculator’s decision. I’ll do this only for the 1st 3 quarters since I’m tracking expected points and not win probability. I also want to track decisions made on 4th and 1, and decisions made between midfield and the opponent’s 25 yard line. I call this area the “Gray Zone.” These will be pretty sparse now, but will fill up as the season goes along. Then we can easily compare the actual outcomes of different decisions in similar situations.

Team Summary

| Team | Number of Disagreements | Total Expected Points Lost |
| --- | --- | --- |
| Northwestern | 8 | 5.78 |
| Minnesota | 8 | 3.4 |
| Indiana | 5 | 3.3 |
| Illinois | 7 | 3.21 |
| Rutgers | 6 | 3.21 |
| Penn State | 7 | 3.2 |
| Michigan | 6 | 3 |
| Nebraska | 6 | 2.57 |
| Michigan State | 5 | 2.18 |
| Iowa | 3 | 1.8 |
| Wisconsin | 4 | 1.37 |
| Ohio State | 3 | 0.92 |
| Purdue | 1 | 0.24 |
| Maryland | 1 | 0.04 |

Since Northwestern was on a bye last week, we'll take a break from being disappointed in Pat Fitzgerald and instead talk about the teams at the bottom of this list. At first glance, you might think Maryland has been making great 4th down decisions. But the real reason they're so low is that they're always in 4th and long. Of the 25 4th downs the calculator has tracked for Maryland, 13 of them have been 4th and 10 yards or longer. And 23 of them have been 4th and 5 yards or longer. Correctly deciding to punt on 4th and long really isn't that impressive. So let's focus on the team that should be getting credit.

Purdue.

The Boilermakers have gone for it on 4th down 26 times this season—the most in college football. So it's no surprise that their decisions have agreed with the calculator for the most part. In fact, the one disagreement Purdue had was going for it when the calculator suggested kicking a field goal. But even in that case, they were heavy underdogs against Michigan State, so being aggressive was probably the correct decision.

Purdue has had four different 4th and 1's this season (in the first 3 quarters), and has correctly gone for it every time. And in their upset win over Nebraska this last week, they went for it on 4th and 1 three different times (with two of them coming in the 4th quarter). Will the aggressive play calling result in another upset win this season? Possibly on November 21st, against what could be an undefeated Iowa team? The 4th down calculator is anxious to find out!

In the meantime, great 4th down decision making Boilermakers. Keep it up! 

4th and 1

| Yards to End Zone | Punts | Average Next Score After Punt | Go for It | Average Next Score After Go for It | Field Goals | Average Next Score After FG |
| --- | --- | --- | --- | --- | --- | --- |
| 75-90 | 2 | 7 | 0 | 0 | * | * |
| 50-74 | 15 | -0.67 | 3 | 4.67 | * | * |
| 25-49 | 0 | 0 | 6 | 1.83 | 1 | -7 |
| 1-24 | * | * | 8 | 2.13 | 3 | 3 |


Not much change here from last week. Teams are still waiting until they cross midfield to be aggressive on 4th and 1. Next week we have a full slate of 7 Big Ten conference games, so I hope we can get some more 4th and 1's and start increasing these sample sizes!

The Gray Zone (4th downs 25-49 yards to the end zone)

| 4th Down Distance | Punts | Average Next Score After Punt | Go for It | Average Next Score After Go for It | Field Goals | Average Next Score After FG |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 0 | 0 | 6 | 1.83 | 1 | -7 |
| 2-5 | 16 | 0.63 | 11 | -0.36 | 3 | -0.33 |
| 6-9 | 15 | 1 | 9 | -0.78 | 6 | 3.33 |
| 10+ | 25 | -0.96 | 1 | 7 | 10 | 2.1 |

 

How Many Episodes Does It Take to Get Hooked on a TV Show?

I have two young children, and I work full-time, so my adult TV time is about as rare as finding a Kardashian-free tabloid.  So I can’t commit to just any TV show. It better be a good one. I was therefore extremely excited when Netflix analyzed viewer data to find out at what point watchers get hooked on the first season of various shows.

Specifically, they identified the episode at which 70% of viewers who watched that episode went on to complete the entire first season. Translation for me: if I can tell early on if I’m going to like a show, I’m game. If the vast majority of viewers get hooked on The Walking Dead and all its zombie apocalypse gore-galore after just 2 episodes and I’m not feeling it by then, I’ll call it a day and move onto the next “it” show.

Which shows get you hooked the fastest?

There weren’t any shows where episode 1 was all it took—though I’m pretty sure Downton Abbey had me at the opening scene—but many shows came close.

Bates Motel, Breaking Bad, Scandal, Sons of Anarchy, Suits, The Killing, and The Walking Dead tied for the No. 1 spot, each taking only 2 episodes until you’re hooked. Dexter, Gossip Girl, House of Cards, Marco Polo, Orange is the New Black, and Sense8 came in next at 3 episodes each.

As for which shows take the longest, Arrow and How I Met Your Mother required the biggest level of commitment to get hooked, at 8 episodes. However, I noticed that both of these shows were similar in that they had some of the highest numbers of total episodes at 23 and 22, respectively. This got me thinking – does the total number of episodes have an impact on the number of episodes it takes someone to get hooked?


Afraid of a Big Commitment, Anyone?

To see if the total number of episodes has an impact on the number of episodes until you’re hooked, I used Minitab Statistical Software to graph the total number of episodes in season 1 versus the episode hook number.


I also ran the corresponding correlation test, which yielded a p-value of 0.000. Therefore, we can conclude that there is a statistically significant correlation between the two variables. The more episodes in a season, the more episodes it takes until 70% of viewers complete the entire season.
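Minitab runs this as a point-and-click analysis, but the same Pearson correlation test can be sketched in a few lines of Python. The episode counts below are a small hypothetical subset for illustration, not the actual Netflix dataset:

```python
# Hedged sketch: Pearson correlation between total episodes in season 1
# and the episode at which viewers get "hooked". The data here are a
# hypothetical subset for illustration, NOT the Netflix study's data.
from scipy import stats

total_episodes = [13, 13, 23, 22, 10, 13, 16, 8, 12, 10]
hooked_episode = [2, 3, 8, 8, 3, 4, 5, 2, 3, 2]

r, p_value = stats.pearsonr(total_episodes, hooked_episode)
print(f"r = {r:.3f}, p = {p_value:.3f}")
```

A small p-value (Minitab displays anything below 0.0005 as 0.000) lets you reject the null hypothesis of no correlation, which is the conclusion drawn here.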

However, every good data analyst knows that correlation does not equal causation. So rather than presuming folks are less likely to commit to viewing a long season, it could simply be that the more episodes there are, the greater the opportunity there is for a person to bail on the series—thus taking longer to hit the 70% threshold that Netflix used for their study.

Now to decide which show to watch next…when I find the time…
