
Regression Analysis: How Do I Interpret R-squared and Assess the Goodness-of-Fit?


After you have fit a linear model using regression analysis, ANOVA, or design of experiments (DOE), you need to determine how well the model fits the data. To help you out, Minitab statistical software presents a variety of goodness-of-fit statistics. In this post, we’ll explore the R-squared (R2 ) statistic, some of its limitations, and uncover some surprises along the way. For instance, low R-squared values are not always bad and high R-squared values are not always good!

What Is Goodness-of-Fit for a Linear Model?
Illustration of regression residuals
Definition: Residual = Observed value - Fitted value

Linear regression calculates an equation that minimizes the distance between the fitted line and all of the data points. Technically, ordinary least squares (OLS) regression minimizes the sum of the squared residuals.

In general, a model fits the data well if the differences between the observed values and the model's predicted values are small and unbiased.

Before you look at the statistical measures for goodness-of-fit, you should check the residual plots. Residual plots can reveal unwanted residual patterns that indicate biased results more effectively than numbers. When your residual plots pass muster, you can trust your numerical results and check the goodness-of-fit statistics.

What Is R-squared?

R-squared is a statistical measure of how close the data are to the fitted regression line. It is also known as the coefficient of determination, or the coefficient of multiple determination for multiple regression.

The definition of R-squared is fairly straightforward; it is the percentage of the response variable variation that is explained by a linear model. Or:

R-squared = Explained variation / Total variation

R-squared is always between 0 and 100%:

  • 0% indicates that the model explains none of the variability of the response data around its mean.
  • 100% indicates that the model explains all the variability of the response data around its mean.

In general, the higher the R-squared, the better the model fits your data. However, there are important conditions for this guideline that I’ll talk about both in this post and my next post.
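If you like to see the arithmetic behind that ratio, here is a minimal sketch in Python (not Minitab; the observed and fitted values are made up purely for illustration) that computes R-squared from the residuals:

```python
import numpy as np

# Hypothetical observed responses and fitted values from some linear model
observed = np.array([3.1, 4.9, 7.2, 8.8, 11.1])
fitted = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

residuals = observed - fitted                          # observed value - fitted value
ss_residual = np.sum(residuals ** 2)                   # unexplained variation
ss_total = np.sum((observed - observed.mean()) ** 2)   # total variation around the mean

r_squared = 1 - ss_residual / ss_total                 # explained variation / total variation
print(f"R-squared = {r_squared:.1%}")
```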

Graphical Representation of R-squared

Plotting fitted values by observed values graphically illustrates different R-squared values for regression models.

Regression plots of fitted by observed responses to illustrate R-squared

The regression model on the left accounts for 38.0% of the variance while the one on the right accounts for 87.4%. The more variance that is accounted for by the regression model, the closer the data points will fall to the fitted regression line. Theoretically, if a model could explain 100% of the variance, the fitted values would always equal the observed values and, therefore, all the data points would fall on the fitted regression line.

A Key Limitation of R-squared

R-squared cannot determine whether the coefficient estimates and predictions are biased, which is why you must assess the residual plots.

R-squared does not indicate whether a regression model is adequate. You can have a low R-squared value for a good model, or a high R-squared value for a model that does not fit the data!

Are Low R-squared Values Inherently Bad?

No! There are two major reasons why it can be just fine to have low R-squared values.

In some fields, it is entirely expected that your R-squared values will be low. For example, any field that attempts to predict human behavior, such as psychology, typically has R-squared values lower than 50%. Humans are simply harder to predict than, say, physical processes.

Furthermore, if your R-squared value is low but you have statistically significant predictors, you can still draw important conclusions about how changes in the predictor values are associated with changes in the response value. Regardless of the R-squared, the significant coefficients still represent the mean change in the response for one unit of change in the predictor while holding other predictors in the model constant. Obviously, this type of information can be extremely valuable.

A low R-squared is most problematic when you want to produce predictions that are reasonably precise (have a small enough prediction interval). How high should the R-squared be for prediction? Well, that depends on your requirements for the width of a prediction interval and how much variability is present in your data. While a high R-squared is required for precise predictions, it’s not sufficient by itself, as we shall see.

Are High R-squared Values Inherently Good?

No! A high R-squared does not necessarily indicate that the model has a good fit. That might be a surprise, but look at the fitted line plot and residual plot below. The fitted line plot displays the relationship between semiconductor electron mobility and the natural log of the density for real experimental data.

Regression model that does not fit even though it has a high R-squared value

Residual plot for a regression model with a bad fit

The fitted line plot shows that these data follow a nice tight function and the R-squared is 98.5%, which sounds great. However, look closer to see how the regression line systematically over- and under-predicts the data (bias) at different points along the curve. You can also see patterns in the Residuals versus Fits plot, rather than the randomness that you want to see. This indicates a bad fit, and serves as a reminder as to why you should always check the residual plots.

This example comes from my post about choosing between linear and nonlinear regression. In this case, the answer is to use nonlinear regression because linear models are unable to fit the specific curve that these data follow.

However, similar biases can occur when your linear model is missing important predictors, polynomial terms, and interaction terms. Statisticians call this specification bias, and it is caused by an underspecified model. For this type of bias, you can fix the residuals by adding the proper terms to the model.

Closing Thoughts on R-squared

R-squared is a handy, seemingly intuitive measure of how well your linear model fits a set of observations. However, as we saw, R-squared doesn’t tell us the entire story. You should evaluate R-squared values in conjunction with residual plots, other model statistics, and subject area knowledge in order to round out the picture (pardon the pun).

In my next blog, we’ll continue with the theme that R-squared by itself is incomplete and look at two other types of R-squared: adjusted R-squared and predicted R-squared. These two measures overcome specific problems in order to provide additional information by which you can evaluate your regression model’s explanatory power.


6 Simple Everyday Efficiency Tips You Can Learn From Six Sigma


by Alex Orlov, guest blogger

While it has been called the "million-dollar methodology" for the significant investment sometimes required to deliver results, Six Sigma has a wealth of practices that can be adapted to small and medium industries, home businesses and even personal finances. 

Six Sigma Tips for the Road Ahead
Organizations have used Six Sigma as a reliable part of the quality improvement process since 1986. And while a large Six Sigma project could cost anything from $1,000 to $1 million in work-hours and other resources, the results of such projects often far outweigh the investment. Beyond the direct benefits of the project itself, indirect benefits such as process optimization deliver improved efficiency for years to come.

The question we come to is this: Does Six Sigma have anything to offer a layperson to improve their daily efficiency? As an instructor and practitioner, I see several important lessons from the Six Sigma methodology that can benefit anyone:

Tip 1: A Goal worth having is worth Documenting

Common sense dictates that the first step to achievement is to have goals. From the short span of a day to an entire lifetime, goals help focus our efforts. The first step to personal and business efficiency is to clearly state and document goals for future reference. Six Sigma calls this documentation of goals, plans and potential problems a "project charter."

Project Charter for Six Sigma or Lean Project
A project roadmap or charter template like this one in Quality Companion process improvement software can help you plan and complete any project.

If there is a first step to increased efficiency, stating and writing down clearly defined, well thought-out, specific goals is certainly one of the top contenders.

Tip 2: Planning: Don’t leave home without it

While this is one of the most preached and widely accepted clichés, planning receives more lip service than it does actual effort. In my corporate experience, I have found that few people actually invest time to plan their activities. The first element of efficiency that may be gleaned from Six Sigma is the practice of good planning. So what is a plan? In two words: actions and deadlines. Distilled down to its bare essence, a plan details what you are going to do towards achieving a goal and by when. Adhering to an action plan is the heart of Six Sigma. 

Tip 3: Understand WHAT is wrong before asking WHY

One of the biggest problems with efficiency is that hours of effort may be directed at finding a solution without understanding the true nature of a problem. Six Sigma dedicates an entire phase of its methodology to identifying, understanding, and documenting the nature of the problem. As problem-solving individuals and businesses, you need to honestly and candidly identify what your problems are before you proceed to the discussion of why they exist. For example, one organization I worked with spent years trying to "fix" attrition without identifying that poor profitability and quality were the real problem. They had become so inured to the profitless, demotivating environment that they did not realize people quit simply because they did not want to work in such an environment.

On the road to efficiency, in life as in business, it is vital to recognize a problem exists before you can ask why.

Tip 4: Be specific – Ask “How Much?”

Over the last few years, I have had the opportunity to spend time with business owners who like a 'take-it-as-it-comes' approach to doing business. While they have an idea of what they want to do with the business, I have found most of their "goals" were very generic and gave no sense of progress. To quote a young director of a startup I happened to speak with, "My goal is to break even in a year's time and be consistently profitable in the next three years." To me, this represents an incomplete goal. You have no doubt already asked the obvious question: "How profitable do you want to be?"

An important part of having goals is to be able to quantify them. Six Sigma teaches us that if we can’t measure it, we can’t control it. Efficiency is impossible if you don’t know how efficient you are.

Tip 5: Without people, a process will fail

If anything, good Six Sigma is an intensively collaborative effort. From defining a problem to identifying what is important to a customer, from brainstorming for potential solutions to the actual work of implementing solutions, people form the core of a good Six Sigma project. An important lesson here is to collaborate and associate with people who can offer ideas, give constructive criticism, and empower the attainment of your goals.

Six Sigma, like daily efficiency, relies on people working together to achieve a common measurable goal and the effective use of collective intelligence.

Tip 6: Whenever you can, seek out and use data to aid a decision

The human ability to approximate is a mixed blessing. While our powers of estimation are excellent, our precision is not as evolved. Another important lesson from Six Sigma is to ask for, seek out, and use data to aid a decision. Six Sigma cannot make you or your business more efficient all by itself. It requires you to make decisions based on what your customers and your stakeholders are telling you.

More often than not, this information is made available in the form of numerical data. To simplify Six Sigma for the purpose of this discussion, the methodology is about working with data in order to give you a true picture of the road to your goals.

Whether it involves a personal budget or transforming a home business or even managing personal efficiency at work, gathering data and analyzing it helps you to know where you are before you begin to improve efficiency.

__________________________________________

About the Guest Blogger:

Alex Orlov learned Lean and Six Sigma from his work as a Quality Assurance Analyst for a BPO company in Eastern Europe, but he sees process optimization everywhere and wants to show that these methods can be applied in some form to all aspects of life. He runs WhatIsSixSigma.net, a site dedicated to simplifying Six Sigma and showing that its tools and methods can be applied with great effectiveness not only to the corporate environment but also to our personal lives.

Would you like to publish a guest post on the Minitab Blog? Contact publicrelations@minitab.com

 

Normal: The Kevin Bacon of Distributions


When you learned statistics, most of what you learned was centered around the Normal distribution.  Maybe you became close friends and you later found out his birth name was Gaussian, but either way you probably just call him Normal.

Normal

You might know Normal’s a pretty popular guy with plenty of relationships with other distributions.  There are some obvious connections, like how e^Normal is Lognormal, but I thought I’d share some less obvious ones.

You probably already know that by subtracting his mean and dividing by his standard deviation you get Standard Normal.

(Normal − mean) / standard deviation = Standard Normal

What if you squared Standard Normal?  In that case he’d turn right into Chi-Square(1)!  And if you have 2 Standard Normals?  Square them all and add them up and you get Chi-Square(2).  In fact you can do this with any number of Standard Normals!

Standard Normal₁² + Standard Normal₂² + … + Standard Normalₙ² = Chi-Square(n)

From there it gets a little more complicated.  Suppose you divide that Chi-Square(2) by 2 and take the square root.  Then you take Standard Normal and divide it by that, and you’re left with T(2) - yep, the same T you use to do t-tests!

Standard Normal / √(Chi-Square(2) / 2) = T(2)

Of course we could square that T(2) and suddenly we find ourselves with F(1,2) -- the same F you see at all of those ANOVA parties!  But if you’re not a big fan of T and wanted to skip those last few steps, you could have just divided Chi-Square(1) by half of Chi-Square(2) and you still get F(1,2).  So it doesn’t really matter if you were introduced by the Chi-Squares or by the T.

T(2)² = (Chi-Square(1) / 1) / (Chi-Square(2) / 2) = F(1,2)
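If you’d rather watch these introductions happen than take my word for it, here’s a quick simulation sketch in Python with NumPy and SciPy (not Minitab) that builds the chain described so far and checks a few quantiles against the theoretical distributions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 100_000

z = rng.standard_normal(n)                                          # Standard Normal
chi2_1 = z ** 2                                                     # squared -> Chi-Square(1)
chi2_2 = rng.standard_normal(n) ** 2 + rng.standard_normal(n) ** 2  # Chi-Square(2)
t_2 = rng.standard_normal(n) / np.sqrt(chi2_2 / 2)                  # T(2)
f_1_2 = t_2 ** 2                                                    # T(2) squared -> F(1,2)

# Compare a simulated 90th percentile to the theoretical one for each distribution
for name, sample, dist in [("Chi-Square(1)", chi2_1, stats.chi2(1)),
                           ("T(2)", t_2, stats.t(2)),
                           ("F(1,2)", f_1_2, stats.f(1, 2))]:
    print(name, round(float(np.quantile(sample, 0.9)), 2), round(float(dist.ppf(0.9)), 2))
```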

Normal, Standard Normal, Chi-Square, T, and F…these guys are the life of the party and it seems like just about everyone knows them.  But F(1,2) has some relatives you probably don’t know.  Divide him by 2, and divide that by 1 minus him divided by 2, and you’ll find yourself standing face-to-face with none other than Beta(1/2, 2)!

Equation 6

Beta’s a strange guy.  He is so important to so many unique and exciting distributions but usually just stands off by himself trying not to get noticed.  Despite all those fun friends he has, his closest relationship might be with boring old Uniform, because Beta(1/2,2) is just a square root away.

Equation 7

Maybe it’s not fair to call Uniform boring just because of his flat shape and simple distribution function.  After all, if you really want to get Uniform excited just take the negative natural log of 1 minus Uniform, raise it to a power, and multiply everything by another number.  Maybe you didn’t know he was capable of such complicated things, but if you’ll humor him you’ll suddenly find yourself face-to-face with none other than Weibull!

Equation 8

Now Weibull is a really exciting distribution who can take on so many forms and accommodate such a wide variety of processes that you may find him overwhelming.  I highly recommend using 1 for that power when changing from Uniform, or if you can’t do that, then raise Weibull to 1 divided by the power you used.  Either way you simplify him down to old-fashioned Exponential! 

Equation 9

(Quick aside: I never see them together, but you can easily meet Logistic if you know just the right way to work with Exponential or Uniform, and Loglogistic is of course just e^Logistic.)

So anyway, Exponential isn’t just a special case of Weibull – he’s a special case of Gamma as well since Exponential(λ) is just Gamma(1,1/λ).  Gamma is just like Beta – I bet they’re even related somehow – as he is really important to other distributions but never likes to get up in front of people.  If you can get him to talk though, he’ll probably tell you that Gamma(1,1/λ) is just Chi-Square(2) divided by 2λ.  Let’s just multiply that by 2λ so we’re left with Chi-Square(2) only.

2λ · Gamma(1, 1/λ) = 2λ · Exponential(λ) = Chi-Square(2)

Wait a minute – didn’t we meet Chi-Square(2) earlier?  Oh yeah, that’s right…we were introduced right after we squared two Standard Normals and added them together!  Well, either Standard Normal can be turned right back into Normal if we just multiply by a standard deviation and add a mean, but if we did that we’d just be left with…

…the Normal distribution.

Normal

So did we just meet 13 different distributions, or 13 variations on Normal?

 

Author's note: A huge thanks to our graphics guru Trevor Calabro, who turned a ridiculous blog post into a ridiculous blog post with awesome graphics!

Lean Six Sigma in the Classroom: Preparing Students for Careers in Quality Improvement


I recently had the opportunity to talk with Ken Jones, professor of operations and supply chain management at Indiana State University, about a business process improvement course he teaches at the university. The course covers a variety of Lean Six Sigma tools and techniques and gives students the opportunity to team with local businesses to complete real quality improvement projects. Upon successful completion of the class, students even become certified green belts.

One item we talked about was how valuable the experiential component of the projects can be for students, especially in teaching them additional real-world lessons not found in a textbook. I’m a relative newbie myself to quality improvement and Lean Six Sigma techniques, and I couldn’t agree more with Ken! Learning by doing has been especially helpful for me thus far.

I also got the chance to talk with two of Ken’s students about the projects they completed for the class, and thought you might be able to learn a few things from them—I know I did!

Serving Those in Need with Better Inventory Management

Chaleise Everly is an accounting major at ISU who worked with the Lighthouse Mission, a non-profit group that serves those in need of food, shelter, and clothing in the Terre Haute, IN area, to establish an inventory management system for donations and sales at the group’s headquarters and various retail locations. “The Lighthouse Mission had a loosely based inventory system in place, but I was really starting from scratch,” says Everly. “There was no clear filing system for donations and sales, and most of the records were handwritten.”

Since few valid baseline measurements were in place to determine the current volume of donations and sales, Everly started by compiling and analyzing cash register records to get an idea of the average difference between sales recorded and donations made.

After forming a baseline and collecting current inventory data, she used Quality Companion to begin and structure her project. With Companion’s Project Roadmap, she was able to define and follow the DMAIC project phases, and organize and manage all of the tools she needed in one project file:

Project Roadmap in Quality Companion

With fishbone diagrams, she outlined the weaknesses in the current system for logging, sorting, and selling donations and was able to identify initial improvement opportunities. Everly then defined the current process by developing a process map, which helped her to further identify ways to streamline and eliminate waste from the process. Her analysis of the current process revealed that two retail locations were forgetting to record donations, and that many locations were improperly coding items and incorrectly recording costs.

Quality Companion Process Map

Everly used the information she gathered to develop a new process, which used a central computerized spreadsheet system for recording end-of-day inventory and sales for all of the Lighthouse Mission’s retail locations. The system was easy to use and made the process for submitting inventory information much more standardized, which improved the accuracy of their data. She then used Companion to develop a future-state process map for the updated processes.

From there, she used the Cause & Effect Matrix to analyze her future-state process map to identify the critical aspects of the new process, and tracked the progress and activities of the project using a Gantt Chart, which helped her to stay on task.

Although the Lighthouse Mission is still in the early stages of implementation, the group is already seeing the benefits of having established processes with uniformity across all locations. The new system is allowing the Lighthouse Mission to see a more complete picture of their inventory and sales data, which is helping them to make better business decisions and giving them the ability to assist more people in the Terre Haute area.

Improving Outpatient Therapy Services

Lisa Hammill, management information systems major at ISU, worked with a local hospital to conduct a Six Sigma project to improve the patient re-registration process for outpatient therapy services. The hospital had one on-site and three off-site locations for outpatient therapy, and wanted to streamline the complex process of re-registering patients at the beginning of each calendar year at each location. The hospital’s accounting procedures for outpatient therapy services required that all patient accounts be closed annually, then current patients had to be re-registered at the beginning of the year. This process involved manually inputting redundant patient information to create new patient account numbers.

Working alongside employees from outpatient therapy services, Hammill and her project team used the DMAIC approach to frame their project. Like Everly, Hammill used Quality Companion to manage her project. To help define the current process, she developed a current-state process map, and recorded the time staff spent at each process step. At the same time, Hammill and the team brainstormed improvements and opportunities for automation of the manual re-registration step, which was identified as a process bottleneck and source of wasted time.

“We used automation to our benefit, and worked with the scheduling department to pre-submit a list of the patients who would be coming in on subsequent days,” says Hammill. “This list of people could be re-registered in advance, completely eliminating the re-registration step from their outpatient therapy check-in process.”

The team also focused on eliminating other process steps that could free up registration staff to be readily available for other time-intensive check-in procedures, such as verifying patient insurance. With improvements to the process in place, Hammill and the team evaluated the updated re-registration process and noticed an improvement in the overall process time, as well as time saved at many of the individual process steps.

Using Minitab Statistical Software, Hammill created control charts to monitor variation in the new process, and performed additional analysis on the process times recorded before and after improvements to prove there was a statistically significant difference.

Minitab Control Chart

Not only was the project well-received by the hospital, but Hammill says the outpatient therapy department moved forward with the process changes she and the quality team recommended. “I think it opened their minds to what could be achieved with a Six Sigma project.”

Want to learn more? Check out the full case study: Indiana State University: Preparing Students for Lean Six Sigma on the Job

How did you learn Lean Six Sigma? Did you learn best by actually completing projects or did a more formal classroom approach help you the most?

Studying Old Dogs with New Statistical Tricks: Bone-Cracking Hypercarnivores and 3D Surface Plots


A while back my colleague Jim Frost wrote about applying statistics to decisions typically left to expert judgment; I was reminded of his post this week when I came across a new research study that takes a statistical technique commonly used in one discipline, and applies it in a new way. 

Hyena skulls: optimized for cracking bones!
The study, by paleontologist Zhijie Jack Tseng, looked at how the skulls of bone-cracking carnivores--modern-day hyenas--evolved. They may look like dogs, but hyenas in fact are more closely related to cats. However, some extinct dog species had skulls much like a hyena's.

Tseng analyzed data from 3D computer models of theoretical skulls, along with those of existing species, to test the hypotheses that specialized bone-cracking hyenas and dogs evolved similar skulls with similar biting capabilities, and that the adaptations are optimized from an engineering perspective. 

This paper is well worth reading, and if you're into statistics and/or quality, you might notice how Tseng uses 3D surface plots and contour plots to explore his data and explain his findings. That struck me because I usually see these two types of graphs used in the analysis of Design of Experiments (DoE) data, when quality practitioners are trying to optimize a process or product.

Two other factors make this even more cool: Tseng used Minitab to create the surface plots (sweet!), and  his paper and data are available to everyone who would like to work with them. When I contacted him to ask if he'd mind us using his data to demonstrate how to create a surface plot, he graciously assented and added, "In the spirit of open science and PLoS ONE's mission, the data are meant for uses exactly like the one you are planning for your blog."

So let's make (and manipulate) a surface plot in Minitab using the data from these theoretical bone-cracking skulls. If you don't already have it, download our 30-day trial of Minitab Statistical Software and follow along!

Creating a 3D Surface Plot

Three-dimensional surface plots help us see the potential relationship between three variables. Predictor variables are mapped on the x- and y-scales, and the response variable (z) is represented by a smooth surface (surface plot) or a grid (wireframe plot). Skull deepening and widening are major evolutionary patterns in convergent bone-cracking dogs and hyaenas, so Tseng used skull width-to-length and depth-to-length ratios as variables to examine optimized shapes for two functional properties: mechanical advantage (MA) and strain energy (SE). 

So, here's the step-by-step breakdown of creating a 3D surface plot in Minitab. We're going to use it to look at the relationship between the ratio of skull depth to length (D:L), width to length (W:L), and skull-strain energy (SE), a measure of work efficiency.

  1. Download and open the worksheet containing the data.
  2. Choose Graph > 3D Surface Plot.
  3. Choose Surface, then click OK.
  4. In Z variable, enter SE (J). In Y variable, enter D:L. In X variable, enter W:L.
  5. Click Scale, then click the Gridlines tab.
  6. I'm going to leave them off, but if you like, you can use Show gridlines for, then check Z major ticks, Y major ticks, and X major ticks. Adding the gridlines helps you visualize the peaks and valleys of the surface and determine the corresponding x- and y-values. 
  7. Click OK in each dialog box.

Minitab produces the following graph: 

Surface plot of skull-strain energy to depth/length and width/length

The "landscape" of the 3D surface plot is illuminated in places so that you can better see surface features, and you can change the position, color, and brightness of these lights to better display the data. You also can change the pattern and color of the surface. You can open the "Edit Surface" dialog box simply by double-clicking on the landscape. Here, I've tweaked the colors and lighting a bit to give more contrast: 

surface plot with alternate colors
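If you ever need to script a comparable view outside Minitab, here's a rough matplotlib sketch; it assumes you've exported the worksheet to a CSV (the file name here is hypothetical) with columns named W:L, D:L, and SE (J), matching the steps above:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical CSV export of the Minitab worksheet
data = pd.read_csv("tseng_skulls.csv")

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
# plot_trisurf fits a surface over the scattered (x, y, z) points
ax.plot_trisurf(data["W:L"], data["D:L"], data["SE (J)"], cmap="viridis")
ax.set_xlabel("W:L")
ax.set_ylabel("D:L")
ax.set_zlabel("SE (J)")
plt.show()
```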

Turn the Landscape Upside-Down

You may not want to go so far as to flip it, but rotating the graph to view the surface from different angles can help you visualize the peaks and valleys of the surface. You can rotate the graph around the X, Y, and Z axes, rotate the lights, and even zoom in with the 3D Graph Tools toolbar. (If you don't already see it,  just choose Tools > Toolbars > 3D Graph Tools to make it appear.)

3D Graph Tools toolbar in statistical software

By rotating 3D surface and wireframe plots, you can view them from different angles, which often reveals interesting information. Changing these factors can help reveal different features of the data surface and dramatically impact what features are highlighted:

Rotated and illuminated surface plot

Off-Label Use of the Surface Plot? 

Tseng notes that combining biomechanical analysis of the theoretical skulls and functional landscapes like the 3D surface plot is a novel approach to the study of convergent evolution, one that permits fossil species to be used in biomechanical simulations, and also provides comparative data about hypothesized form-function relationships. What did he find?  He explained it this way in an interview:

What I found, using models of theoretical skulls and those from actual species, was that increasingly specialized dogs and hyenas did evolve stronger and more efficient skulls, but those skulls are only optimal in a rather limited range of possible variations in form. This indicates there are other factors restricting skull shape diversity, even in lineages with highly directional evolution towards biomechanically-demanding lifestyles...although the range of theoretical skull shapes I generated included forms that resemble real carnivore skulls, the actual distribution of carnivoran species in this theoretical space is quite restricted. It shows how seemingly plausible skull shapes nevertheless do not exist in nature (at least among the carnivores that I studied).

In addition to 3D surface plots, Tseng used contour plots to help visualize his theoretical landscapes. In my next post, I'll show how to create and manipulate those types of graphs in Minitab. Meanwhile, please be sure to check out his paper for the full details on Tseng's research: 

Tseng ZJ (2013) Testing Adaptive Hypotheses of Convergence with Functional Landscapes: A Case Study of Bone-Cracking Hypercarnivores. PLoS ONE 8(5): e65305. doi:10.1371/journal.pone.0065305

Studying Old Dogs with New Statistical Tricks Part II: Contour Plots and Cracking Bones


A skull made for cracking some bones!
Yesterday I wrote about how paleontologist Zhijie Jack Tseng used 3D surface plots created in Minitab Statistical Software to look at how the skulls of hyenas and some extinct dogs with similar dining habits fit into a spectrum of possible skull forms that had been created with 3D modelling techniques.

What's interesting about this from a data analysis perspective is how Tseng took tools commonly used in quality improvement and engineering and applied them to his research into evolutionary morphology.

We used Tseng's data to demonstrate how to create and explore 3D surface plots yesterday, so let's turn our attention to contour plots. 

How to Create a Contour Plot 

Like a surface plot, we can use a contour plot to look at the relationships between three variables on a single plot. We take two predictor variables (x and y) and use the contour plot to see how they influence a response variable (z).  

A contour plot is like a topographical map in which x-, y-, and z-values substitute for longitude, latitude, and elevation. Values for the x- and y-factors (predictors) are plotted on the x- and y-axes, while contour lines and colored bands represent the values for the z-factor (response). Contour lines connect points with the same response value.

Since skull deepening and widening are major evolutionary trends in bone-cracking dogs and hyaenas, Tseng used skull width-to-length and depth-to-length ratios as variables to examine optimized shapes for two functional properties: mechanical advantage (MA) and strain energy (SE). 

Here's how to use Minitab to create a contour plot like those in Tseng's paper:

  1. Download and open the worksheet containing the data.
  2. Choose Graph > Contour Plot.
  3. In Z variable, enter SE (J). In Y variable, enter D:L. In X variable, enter W:L. 
  4. Click OK in the dialog box.

Minitab creates the following graph: 

Contour Plot of Skull Strain Energy

Now, that looks pretty cool...but notice how close the gray and light green bands in the center are?  It would be easier to distinguish them if we had clear dividing lines between the contours.  Let's add them. We'll recreate the graph, but this time we'll click on Data View in the dialog box, and check the option for Contour Lines:

adding contour lines to contour plot

Click OK > OK, and Minitab gives us this plot, which is much easier to scan: 

contour plot with contour lines
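As an aside, the same kind of plot can be scripted outside Minitab; here's a rough matplotlib sketch that assumes the same hypothetical CSV export of the worksheet used in the previous post, with columns W:L, D:L, and SE (J):

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical CSV export of the Minitab worksheet
data = pd.read_csv("tseng_skulls.csv")
x, y, z = data["W:L"], data["D:L"], data["SE (J)"]

fig, ax = plt.subplots()
bands = ax.tricontourf(x, y, z, levels=9, cmap="viridis")          # colored contour bands
ax.tricontour(x, y, z, levels=9, colors="black", linewidths=0.5)   # dividing lines
fig.colorbar(bands, label="SE (J)")
ax.set_xlabel("W:L")
ax.set_ylabel("D:L")
plt.show()
```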

Refining and Customizing the Contour Plot

Now, suppose you've created this plot, as we did, with 9 contour levels for the response variable, but you don't really need that much detail. You can double-click on the graph to bring up the Edit Area dialog box, from which you can adjust the number of levels from 2 through 11. Here's what the graph looks like reduced to 5 contour levels:

Contour plot with five levels

Alternatively, we can specify which contour values to display. And if your boss (or funding agency) doesn't like green or blue, it's very easy to change the contour plot's palette. You can also adjust the type of fill used in specific contours:

Contour plot with custom palette and shading

Whoa. 

Reading the Contour Plot 

As noted earlier, we read the contour plot as if it were a topographical map: the contours indicate the "steepness" of the response variable, so we can look for: 

  • X-Y "coordinates" that produce maximal or minimal responses in Z
  • "Ridges" of high values or "valleys" of low values

It's easy to see from this demonstration why the contour plot is such a popular tool for optimizing processes: it drastically simplifies the task of identifying which values of two predictors lead to the desired values for a response, which would be a bit of a pain to do using just the raw data.

To see how Tseng used contour plots, check out his study: 

Tseng ZJ (2013) Testing Adaptive Hypotheses of Convergence with Functional Landscapes: A Case Study of Bone-Cracking Hypercarnivores. PLoS ONE 8(5): e65305. doi:10.1371/journal.pone.0065305

The Lottery, the Casino, or the Sportsbook: What’s Your Best Bet?


Moneybag
New Jersey Gov. Chris Christie is currently in a battle with sports leagues over the issue of allowing sports betting at casinos in Atlantic City and horse racing tracks across the state. If he wins and sports betting becomes legal in New Jersey, it will open the door for other states to follow suit. It appears there is a long way to go before this form of gambling spreads across the country.

But is sports betting really so much worse than casinos (which are legal in just under half of all U.S. states) or the lottery (which is legal in almost every U.S. state)?  For the purposes of this discussion, we're going to ignore any moral and social issues and focus on just the statistics behind making each kind of bet.

If you had $10 burning a hole in your pocket, which form of gambling would be your best bet? I’m going to start by calculating the expected value for each one, and in subsequent posts we'll use Minitab Statistical Software and those expected values to see which type of bet is most risky.

Calculating Expected Values

An expected value is the amount that you’ll win “on average” on a single bet. For example, let’s bet on a coin flip. You bet $10 on tails (because tails never fails!). If it comes up tails you’ll profit by $10, otherwise you’ll lose $10. The probability of the coin coming up tails is 50%. So your expected value is:

(Odds of Winning)*(Profit) – (Odds of Losing)*(Amount Lost) = .5*$10 - .5*$10 = $5 - $5 = $0

On average, you won’t win or lose any money on this wager. Now we can apply this same formula to our three bets. I’ll start with the simplest, sports betting. Let’s say you bet the spread on an NFL game. I’ve previously found that betting on NFL games isn’t that different from betting on a coin flip, so I’m going to set the probability of winning the bet at 50%. However, the way the sportsbooks get you is that you’ll only profit $9.09 on your $10 bet. So that makes our expected value:

.5*$9.09 - .5*$10 = -$0.45

So on each $10 bet, you’ll lose about 45 cents. How will that compare to our other games? Let’s say that I want to try to win a little more money than just $9.09, so I walk into a casino and play a single number in roulette. With 38 different numbers, my probability of winning is 1/38 = 2.6%. Sounds low, but if my $10 bet wins, I’ll win $350! The other 97.4% of the time I’ll only lose $10. So what’s my expected value?

(1/38)*$350 – (37/38)*$10 = -$0.53

Despite the fact that I’ll win roulette much less frequently than my NFL bet, the payoff is so large that the expected value is about the same as football. But the value is still negative, so don’t think roulette is going to be a viable career path.
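If you want to check these numbers yourself, here's a tiny Python sketch that wraps the expected-value formula in a function and applies it to the coin flip, the NFL spread bet, and the single roulette number (the lottery works the same way once you plug in its full prize table):

```python
def expected_value(outcomes):
    """outcomes: list of (probability, profit) pairs; losses are negative profits."""
    return sum(p * profit for p, profit in outcomes)

coin_flip = [(0.5, 10.00), (0.5, -10.00)]
nfl_spread = [(0.5, 9.09), (0.5, -10.00)]
roulette_single = [(1 / 38, 350.00), (37 / 38, -10.00)]

for name, bet in [("Coin flip", coin_flip),
                  ("NFL spread", nfl_spread),
                  ("Single roulette number", roulette_single)]:
    print(f"{name}: {expected_value(bet):+.3f}")   # average result per $10 bet
```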

Least Controversial, But Most Expensive?

Let’s move on to the least controversial of these games of chance, the lottery. There are many different forms of the lottery, but to be consistent I’m going to pick a $10 scratch-off ticket called “Neon 9s” from my home state of Pennsylvania. The top prize is $300,000, with many other prizes ranging from $10 to $30,000. Since the odds for each prize are drastically different (you can find the complete list here), finding the expected value becomes much more complicated. But when you calculate it all out, you’ll find that the expected value of buying one ticket is:

-$2.78

Ouch! That’s much worse than the previous two expected values. You could make 6 sports bets or 5 roulette spins before you’d be expected to lose more money than buying one $10 scratch off ticket! The chance of winning that top prize may be alluring, but you sure pay a hefty price for that chance. If people lose so much more money (on average) playing the lottery, it makes you wonder why that form of gambling is "okay" and legal in almost every state, while the other two are often frowned upon and/or illegal in most states. Sending mixed messages, aren't we?

Anyway, it turns out that you'd lose your money least rapidly making a $10 sports bet. But this is all theoretical. Although the expected value is negative for each bet, people are still able to win when gambling. What would playing these games in the real world look like?

Say we take 300 different people and make them place a $10 bet every week for a year. One-third will make a $10 NFL wager, another third will bet $10 on a single roulette number, and the final third will buy the $10 Neon 9s lottery ticket.

Hmmm. I'm probably going to have a hard time convincing 300 of my friends, family and coworkers into joining this experiment. Luckily I can do it all in Minitab! So next week I'm going to come back and show you how to use Minitab to simulate making sports bets, playing roulette, and buying lottery tickets. Then I'm going to run my experiment, and see if anybody comes out ahead!

What Is a t-test? And Why Is It Like Telling a Kid to Clean Up that Mess in the Kitchen?


calling johnny
A t-test is one of the most frequently used procedures in statistics.

But even people who frequently use t-tests often don’t know exactly what happens when their data are wheeled away and operated upon behind the curtain using statistical software like Minitab.

It’s worth taking a quick peek behind that curtain.

Because if you know how a t-test works, you can understand what your results really mean. You can also better grasp why your study did (or didn’t) achieve “statistical significance.”

In fact, if you’ve ever tried to communicate with a distracted teenager, you already have experience with the basic principles behind a t-test.

Anatomy of a t-test

A t-test is commonly used to determine whether the mean of a population significantly differs from a specific value (called the hypothesized mean) or from the mean of another population.

For example, a 1-sample t-test could test whether the mean waiting time for all patients in a medical clinic is greater than a target wait time of, say, 15 minutes, based on a random sample of patients.

anatomy ttest
To determine whether the difference is statistically significant, the t-test calculates a t-value. (The p-value is obtained directly from this t-value.) To find the formula for the t-value, choose Help > Methods and Formulas in Minitab, then click Basic statistics > 1-sample t > Test statistic. Here's what you'll see:

t test formula

That jumble of letters and symbols may look like an incantation from a sorcerer’s book.

But the formula is much less mystical if you remember there are two driving forces behind it: the numerator (top of the fraction) and the denominator (bottom of the fraction).

The Numerator Is the Signal

The numerator in the 1-sample t-test formula measures the strength of the signal: the difference between the mean of your sample (xbar) and the hypothesized mean of the population (µ0).

signal

Consider the patient waiting time example, with the hypothesized mean wait time of 15 minutes.

If the patients in your random sample had a mean wait time of 15.1 minutes, the signal is 15.1-15 = 0.1 minutes. The difference is relatively small, so the signal in the numerator is weak.

However, if patients in your random sample had a mean wait time of 68 minutes, the difference is much larger: 68 - 15 = 53 minutes. So the signal is stronger.

The Denominator is the Noise

The denominator in the 1-sample t-test formula measures the variation or “noise”  in your sample data.

noise

S is the standard deviation—which tells you how much your data bounce around. If one patient waits 50 minutes, another 12 minutes, another 0.5 minutes, another 175 minutes, and so on, that’s a lot of variation. Which means a higher s value—and more noise. If, on the other hand, one patient waits 14 minutes, another 16 minutes, another 12 minutes, that’s  less variation, which means a lower value of s, and less noise.

What about the √n (below the s)? That’s the square root of your sample size. What that does, very loosely speaking, is “average” out the variation based on the number of data values in the sample. So, all things being equal, a given amount of variation is “noisier” for a smaller sample than for a larger one.

The t-Value: The Ratio of Signal to Noise

As the above formula shows, the t-value simply compares the strength of the signal (the difference) to the amount of noise (the variation) in the data.

If the signal is weak relative to the noise, the (absolute) size of the t-value will be smaller. So the difference is not likely to be statistically significant:

insignificant results

On the graph at right, the difference between the sample mean (xbar) and the hypothesized mean (µ0) is about 16 minutes. But because the data is so spread out, this difference is not statistically significant. Why? The t-value, the ratio of signal to noise, is relatively small due to the large denominator.

However, if the signal is strong relative to the noise, the (absolute) size of the t-value will be larger. So the difference between xbar and the µ0 is more likely to be statistically significant:

significant results

On this graph, the difference between the sample mean (xbar) and the hypothesized mean (µ0) is the same as on the previous graph: about 16 minutes. The sample size is also the same. But this time, the data is much more tightly clustered. Due to less variation, the same difference of 16 minutes is now statistically significant!
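To see both pieces in action, here's a minimal Python sketch (using SciPy rather than Minitab) with a made-up sample of waiting times; it computes the t-value by hand from the signal and the noise, then lets SciPy run the full 1-sample t-test:

```python
import numpy as np
from scipy import stats

# Hypothetical sample of patient waiting times (minutes); hypothesized mean is 15
wait_times = np.array([18.2, 14.5, 22.1, 16.8, 19.4, 13.9, 21.0, 17.3])
mu_0 = 15

signal = wait_times.mean() - mu_0                            # numerator: xbar - mu_0
noise = wait_times.std(ddof=1) / np.sqrt(len(wait_times))    # denominator: s / sqrt(n)
print(f"t = {signal / noise:.2f}")

# The same test in one call; 'greater' asks whether the mean wait exceeds 15 minutes
print(stats.ttest_1samp(wait_times, popmean=mu_0, alternative="greater"))
```

With these made-up numbers the signal is large relative to the noise, so the t-value comes out comfortably large; spread the data out more and watch it shrink.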

Statistically Significant Messages

So how is the t-test like telling a teenager to clean up the mess in the kitchen?

If the teenager is listening to music, playing a video game, texting friends, or distracted by any of the other umpteen sources of "noise" that pervade our lives, the louder and stronger you need to make your verbal signal to achieve "significance."  Alternatively, you could insist on removing those sources of extraneous noise before you communicate—in which case you wouldn't need to raise your voice at all.

Similarly, if your t-test results don't achieve statistical significance, it could be for any of the following reasons:

  • The difference (signal) isn't large enough. Nothing you can do about that, assuming that your study is properly designed and you've collected a representative sample. 
     
  • The variation (noise) is too great. This is why it's important to remove or account for extraneous sources of variation when you plan your analysis. For example, you could use a control chart to identify and eliminate sources of special-cause variation from your process before you collect data for a t-test on the process mean.
     
  • The sample is too small. Remember the effect of variation is lessened by sample size. That means for a given difference and a given amount of variation, a larger sample is more likely to achieve statistical significance, as shown in this graph:

 t vs n

(This effect also explains why an extremely large sample can produce statistically significant results even when a difference is very small and has no practical consequence.)

By the way, in case you're wondering, these basic relationships are similar for a 2-sample t-test and a paired t-test. Although their formulas are a bit more complex (see Help > Methods and Formulas > Statistics > Basic Statistics), the basic driving forces behind them are essentially the same.

These formulas also explain why statisticians often cringe in response to the language sometimes used to convey t-test results. For example, a statistically insignificant  t-test result is often reported by stating, “There is no significant difference...”

Literally speaking, it ain't necessarily so.

There may actually be a significant difference. But because your sample was too small, or because extraneous variation in your study was not properly accounted for, your study wasn't able to demonstrate statistical significance. You're on safer ground saying something like "Our study did not find evidence of a statistically significant difference."

Now, if you're still with me, you might be asking, but why is it called a t-test? And where does the p-value come from? You haven't explained any of that!

Sorry…I’m out of space and time. I’ll talk about those concepts in an upcoming post.

[no response]

JOHNNY, I SAID, I'LL TALK ABOUT THOSE CONCEPTS IN AN UPCOMING POST!!!!

The Lottery, the Casino, or the Sportsbook: Simulating Each Bet in Minitab


Moneybag
I previously started looking into which method of gambling was your best bet: an NFL bet, a number on a roulette wheel, or a scratch-off lottery ticket. After calculating the expected value for each one, I found out that the NFL bet and roulette bet were similar, as each had an expected value close to -$0.50 on a $10 bet. The scratch-off ticket was much worse, having an expected value of -$2.78.

But I want to see how each of these games could play out in real life. After all, it is possible for people to come out ahead playing each game. So I planned to take 300 people, split them into 3 groups (one for each game), and have each group make a $10 bet once a week for a year.

After a failed attempt to find 300 friends, family members, and coworkers to agree to gamble $10 a week for a year, I realized I was going to have to simulate the gambling myself. Luckily Minitab Statistical Software has a set of tools that made this very easy! So I'm going to show you exactly how I did it. Then you, too, can start your own underground casino...uh, I mean, run an experiment to see how different types of bets play out in the long run.

We better just get to the simulation. If you want to follow along and don't already have it, download Minitab's free 30-day trial.

Simulating the NFL Bets

Let's start with the football bet. We know there are two outcomes, either winning $9.09 or losing $10. So in a column called "Football," I type "9.09" and "-10" into the first two rows. We can now take a random sample from this column because 50% of the time we'll win $9.09, and 50% of the time we'll lose $10. So let's have our 100 people make their bets! With 52 weeks in a year, and with 100 people making one bet a week, that's a total of 5,200 bets. To simulate that I went to Calc > Random Data > Sample From Columns and filled out the dialog as follows.

Simulate Football Bets

Finally, I needed a column for my 100 people! To do that, I went to Calc > Make Patterned Data > Simple Set of Numbers and filled out the dialog as follows:

100 People

So now I have a column called "Football Winnings" that has the outcome of my 5,200 football bets, and another column called "Person" that has the numbers 1 through 100 (each number representing a different person), each listed 52 times. Voila! It's as if we had 100 people making one football bet a week for a year!

Too bad there isn't actually an NFL game to bet on every week of the year.

Simulating the Roulette Bets

For the roulette bets I followed the same steps as the football bets. The only difference (besides having different column names) was that the "Roulette" column had more than two observations. The odds of winning are 1 out of 38, not 1 out of 2! To make this work out, I entered 350 in the first row, and -10 in the next 37. Now when we sample from this column, we'll win $350 1/38 of the time, and lose $10 37/38 of the time.
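If you don't have Minitab handy, the same sampling can be scripted; here's a minimal NumPy sketch of the football and roulette columns, assuming 100 bettors making 52 weekly bets each:

```python
import numpy as np

rng = np.random.default_rng(7)
n_people, n_weeks = 100, 52
n_bets = n_people * n_weeks                              # 5,200 bets per game

# Sample with replacement from each game's outcome "column"
football = rng.choice([9.09, -10.00], size=n_bets)       # win or lose, 50/50
roulette_column = np.array([350.00] + [-10.00] * 37)     # 1 winning row out of 38
roulette = rng.choice(roulette_column, size=n_bets)

person = np.repeat(np.arange(1, n_people + 1), n_weeks)  # the "Person" column

# Yearly winnings per person, like summarizing by group in Minitab
football_by_person = np.bincount(person, weights=football)[1:]
print("Average football result per person:", round(football_by_person.mean(), 2))
```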

Simulating the Lottery

The lottery simulation became much more complicated. After some math, I found that if there were 1,440,000 tickets, there would be exactly:

  • 2 tickets that won $300,000
  • 4 tickets that won $30,000
  • 8 tickets that won $10,000
  • 480 tickets that won $1000
  • 960 tickets that won $500
  • And so on, until you got to 1,020,919 tickets that lost $10.

In order to have a column that accurately reflects the odds of winning each prize, I need a column with $299,990 listed twice (the amount you profit on the top prize after paying $10 for the ticket), $29,990 listed 4 times, $9,990 listed 8 times, all the way to -$10 listed over a million times. In total the column will have 1,440,000 rows. I’m definitely not typing all that in!

To save myself from a lot of painful data entry, I once again turned to Calc > Make Patterned Data > Simple Set of Numbers. But this time I made a column for each prize. For example, to get the 1,020,919 tickets that lost $10 I filled out the dialog as follows to get a “-10” column.

-10

I did the same thing for each prize amount, and ended up with 10 columns (there are 12 different prize amounts, but I combined the top 3 prizes into a single column). Then I used Data > Stack > Columns to combine all the columns into a single column. But can Minitab support a single column with almost a million and a half rows? I’m about to find out.

Stack Columns

A column with 1,440,000 rows!

It worked! After stacking the columns, I was able to create a single “Lottery” column with 1,440,000 rows! Now I can simulate my lottery tickets just like I did with the previous two bets!
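The same column can be built outside Minitab with a few lines of NumPy; note that only the prize tiers quoted above are filled in here, so the unlisted middle tiers are lumped into a single placeholder count purely for illustration:

```python
import numpy as np

# (profit per ticket, number of tickets) for the tiers listed above
listed_tiers = [(299_990, 2), (29_990, 4), (9_990, 8), (990, 480), (490, 960), (-10, 1_020_919)]
remaining = 1_440_000 - sum(count for _, count in listed_tiers)

# Placeholder: treat the unlisted middle tiers as break-even ($10 prize on a $10 ticket)
lottery_column = np.concatenate(
    [np.repeat(profit, count) for profit, count in listed_tiers] + [np.repeat(0, remaining)]
)
print(len(lottery_column))                        # 1,440,000 rows

rng = np.random.default_rng(7)
tickets = rng.choice(lottery_column, size=5_200)  # 100 people x 52 weekly tickets
print("Average profit per ticket:", round(tickets.mean(), 2))
```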

Now the hard part is behind us! We have all of our data in Minitab, so all we have to do is perform a data analysis on the results. I'm going to make one more post to show how our 300 people did. So be sure to check back tomorrow to see if anybody got lucky and won big!

Read Part I of this series

Read Part III of this series

The Lottery, the Casino, or the Sportsbook: Who Came Out Ahead?


Moneybags
Have you heard about the Tennessee man who has 22 children with 17 different women? He was interviewed the other day, and when asked how he supports all his kids he was quoted as saying:

"I'm just hoping one day I'll get lucky and might scratch off the numbers or something. I play the hell out of the Tennessee lottery."

Well, what would it look like if a person really did play "the hell" out of the lottery? Say you spent a year buying a $10 scratch-off ticket every week. How likely would you be to come out ahead? And for that matter, how would the lottery compare to making a $10 sports bet or a $10 roulette bet?

I already found that the expected value is negative for each game. But instead of just calculating the average, I want to see how this would play out in real life.

Luckily, instead of actually spending money myself, I used Minitab Statistical Software to simulate the bets for me. But I didn't stop at 1 person. Since I was simulating the bets, I had 300 people make a $10 bet once a week for a year. 100 of them bet on a football game, 100 played a single number in roulette, and 100 bought a $10 scratch-off lottery ticket called Neon 9s.

Let's get to the results and see how they did.

Did I Win?

Before we get to the results, let’s see what we would expect. Our football bettors have an expected value of -$0.45, so by making 52 bets we would expect them to lose $23.40 on average. We would expect roulette bettors to lose $27.56 on average, and the lottery players to each lose $144.56 on average.

Now let’s see how our random data turned out! I summarized the winnings for each person below.

Summary

We see that the football and lottery groups both lost about as much money as we would expect. But just because the average was negative doesn’t mean everybody lost! We can look at the maximum to see the biggest winner. A football bettor won $167.24 over the course of the year and a lottery player won $890. Just because the numbers are against you doesn’t mean it’s impossible to win.

Speaking of impossible…the roulette group has a positive average! That means the 100 people making the roulette bets won an average of $16.40 per person! As a group they won $1,640. At first I thought I messed something up. But looking at the results, our roulette players won only 149 times out of 5,200 tries. That’s 2.87%, and a One Proportion test shows that statistically it isn’t significantly different from the true value of 2.63% (1 in 38).

One Proportion

And yet that small random departure of 0.24% from the average was enough to win our group some money! However…it wasn’t distributed evenly. The results below show the breakdown of the different amounts won and lost by our roulette group:

Tally

We see that 29 people lost all 52 bets, while another 26 lost 51 of them. That’s more than half the group, and none of them came out ahead! And the three people who won $1,280 only won 5 of their 52 bets. When the payout is that high, it doesn't take many wins to rack up the money.

So I Should Run to the Nearest Roulette Table, Right?

No, you shouldn’t be in too much of a hurry. The One Proportion test had a p-value of .155, meaning that if the true win probability really is 1 in 38, there was only a 15.5% chance of the roulette bettors winning at least as often as they did. While it’s not impossible that might happen again, it’s not likely.
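If you want to spot-check that p-value without Minitab, an exact binomial test in SciPy asks the same one-sided question (did the group win more often than 1 in 38 would predict?):

```python
from scipy import stats

# 149 winning spins out of 5,200, tested against the true single-number probability of 1/38
result = stats.binomtest(149, n=5200, p=1/38, alternative="greater")
print(round(result.pvalue, 3))   # roughly 0.15: lucky, but not outlandishly so
```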

But let's compare all 3 groups again. We already saw that despite the group winning money overall, only 45 of the 100 roulette players came out ahead. Seeing as how the football and lottery groups both lost money, I would assume the number of people that came out ahead was a lot lower. Let's look at the results.

Tally

There were 28 people who won money overall in our football group, and only 8 people who won money overall in the lottery! And we already saw that the maximum win for the lottery group was $890. That means despite buying 5,200 tickets, nobody won the three largest prizes ($10,000, $30,000, and $300,000). Those prizes are so big that it's tempting to buy a ticket, but the odds of winning them are so low that, as far as I'm concerned, it's not worth it.

I think the biggest takeaway is how much of a difference there was between the lottery group and the other two groups. As a group, the lottery players lost $15,370, and only 8 people came out ahead! The other groups don’t even come close to that. It just shows how that difference in expected value can really add up. And perhaps somebody should tell that Tennessee man that investing his child support money in the Tennessee lottery isn't the best of ideas.

If you really want to have money, it looks like saving it is your best bet.

Read Part I of this series

Read Part II of this series

Multiple Regression Analysis: Use Adjusted R-Squared and Predicted R-Squared to Include the ...


Fitted line plot that illustrates an overfit model

Multiple regression can be a beguiling, temptation-filled analysis. It’s so easy to add more variables as you think of them, or just because the data are handy. Some of the predictors will be significant. Perhaps there is a relationship, or is it just by chance? You can add higher-order polynomials to bend and twist that fitted line as you like, but are you fitting real patterns or just connecting the dots? All the while, the R-squared (R2) value increases, teasing you, and egging you on to add more variables!

Previously, I showed how R-squared can be misleading when you assess the goodness-of-fit for linear regression analysis. In this post, we’ll look at why you should resist the urge to add too many predictors to a regression model, and how the adjusted R-squared and predicted R-squared can help!

Some Problems with R-squared

In my last post, I showed how R-squared cannot determine whether the coefficient estimates and predictions are biased, which is why you must assess the residual plots. However, R-squared has additional problems that the adjusted R-squared and predicted R-squared are designed to address.

Problem 1: Every time you add a predictor to a model, the R-squared increases, even if due to chance alone. It never decreases. Consequently, a model with more terms may appear to have a better fit simply because it has more terms.

Problem 2: If a model has too many predictors and higher order polynomials, it begins to model the random noise in the data. This condition is known as overfitting the model and it produces misleadingly high R-squared values and a lessened ability to make predictions.

What Is the Adjusted R-squared?

The adjusted R-squared compares the explanatory power of regression models that contain different numbers of predictors.

Suppose you compare a five-predictor model with a higher R-squared to a one-predictor model. Does the five-predictor model have a higher R-squared because it’s better? Or is the R-squared higher simply because it has more predictors? Compare the adjusted R-squared values to find out!

The adjusted R-squared is a modified version of R-squared that has been adjusted for the number of predictors in the model. The adjusted R-squared increases only if the new term improves the model more than would be expected by chance. It decreases when a predictor improves the model by less than expected by chance. The adjusted R-squared can be negative, but it’s usually not.  It is always lower than the R-squared.
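
For reference, the standard adjustment (the formula used by most statistical packages) penalizes R-squared for the number of predictors p relative to the sample size n:

```latex
\text{Adjusted } R^{2} = 1 - \left(1 - R^{2}\right)\frac{n - 1}{n - p - 1}
```

With n fixed, each added predictor increases the penalty, so a term that adds little explanatory power can lower the adjusted R-squared even while the ordinary R-squared creeps up.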

In the simplified Best Subsets Regression output below, you can see where the adjusted R-squared peaks, and then declines. Meanwhile, the R-squared continues to increase.

Best subsets regression example

You might want to include only three predictors in this model. In my last blog, we saw how an under-specified model (one that was too simple) can produce biased estimates. However, an overspecified model (one that's too complex) is more likely to reduce the precision of coefficient estimates and predicted values. Consequently, you don’t want to include more terms in the model than necessary. (Read an example of using Minitab’s Best Subsets Regression.)

What Is the Predicted R-squared?

The predicted R-squared indicates how well a regression model predicts responses for new observations. This statistic helps you determine when the model fits the original data but is less capable of providing valid predictions for new observations. (Read an example of using regression to make predictions.)

Minitab calculates predicted R-squared by systematically removing each observation from the data set, estimating the regression equation, and determining how well the model predicts the removed observation. Like adjusted R-squared, predicted R-squared can be negative and it is always lower than R-squared.
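
That leave-one-out procedure is the idea behind the PRESS statistic. Here's a rough Python sketch of the calculation (my own illustration of the concept, not Minitab's implementation), assuming `X` is a NumPy matrix of predictors and `y` is the response:

```python
import numpy as np

def predicted_r_squared(X, y):
    """Leave-one-out (PRESS-based) predicted R-squared for an OLS fit.

    X is an (n, p) array of predictors without an intercept column;
    an intercept is added here. The result can be negative.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        # Refit with observation i removed...
        beta, *_ = np.linalg.lstsq(X1[keep], y[keep], rcond=None)
        # ...and score how well that model predicts the held-out point.
        press += (y[i] - X1[i] @ beta) ** 2
    ss_total = np.sum((y - y.mean()) ** 2)
    return 1 - press / ss_total
```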

Even if you don’t plan to use the model for predictions, the predicted R-squared still provides crucial information.

A key benefit of predicted R-squared is that it can prevent you from overfitting a model. As mentioned earlier, an overfit model contains too many predictors and it starts to model the random noise.

Because it is impossible to predict random noise, the predicted R-squared must drop for an overfit model. If you see a predicted R-squared that is much lower than the regular R-squared, you almost certainly have too many terms in the model.

Examples of Overfit Models and Predicted R-squared

You can try these examples for yourself using this Minitab project file that contains two worksheets. If you want to play along and you don't already have it, please download the free 30-day trial of Minitab Statistical Software!

There’s an easy way for you to see an overfit model in action. If you analyze a linear regression model that has one predictor for each degree of freedom, you’ll always get an R-squared of 100%!

In the random data worksheet, I created 10 rows of random data for a response variable and nine predictors. Because there are nine predictors and nine degrees of freedom, we get an R-squared of 100%.

R-squared of 100% for an overfit model

It appears that the model accounts for all of the variation. However, we know that the random predictors do not have any relationship to the random response! We are just fitting the random variability.
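
You can reproduce the same effect with any regression tool. A minimal sketch in Python (simulated data, not the worksheet from the project file): ten rows, nine random predictors, and a random response still give an in-sample R-squared of essentially 100%.

```python
import numpy as np

rng = np.random.default_rng(seed=7)
X = rng.normal(size=(10, 9))   # nine purely random predictors
y = rng.normal(size=10)        # a random response, unrelated to X

X1 = np.column_stack([np.ones(10), X])       # intercept + 9 predictors = 10 parameters
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
fitted = X1 @ beta

ss_res = np.sum((y - fitted) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
print("R-squared:", 1 - ss_res / ss_tot)     # 1.0, up to floating-point rounding
```

With as many parameters as observations, the fit passes through every point exactly, so the residuals are zero by construction.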

That’s an extreme case, but let’s look at some real data in the President's ranking worksheet.

These data come from my post about great Presidents. I found no association between each President’s highest approval rating and the historian’s ranking. In fact, I described that fitted line plot (below) as an exemplar of no relationship, a flat line with an R-squared of 0.7%!

Fitted line plot of historian's rank by each President's highest rating

Let’s say we didn’t know better and we overfit the model by including the highest approval rating as a cubic polynomial.

Fitted line plot that shows an overfit model

Output for the overfit model of Presidential rankings

Wow, both the R-squared and adjusted R-squared look pretty good! Also, the coefficient estimates are all significant because their p-values are less than 0.05. The residual plots (not shown) look good too. Great!

Not so fast...all that we're doing is excessively bending the fitted line to artificially connect the dots rather than finding a true relationship between the variables.

Our model is too complicated and the predicted R-squared gives this away. We actually have a negative predicted R-squared value. That may not seem intuitive, but if 0% is terrible, a negative percentage is even worse!

The predicted R-squared doesn’t have to be negative to indicate an overfit model. If you see the predicted R-squared start to fall as you add predictors, even if they’re significant, you should begin to worry about overfitting the model.

Closing Thoughts about Adjusted R-squared and Predicted R-squared

All data contain a natural amount of variability that is unexplainable. Unfortunately, R-squared doesn’t respect this natural ceiling. Chasing a high R-squared value can push us to include too many predictors in an attempt to explain the unexplainable.

In these cases, you can achieve a higher R-squared value, but at the cost of misleading results, reduced precision, and a lessened ability to make predictions.

Both adjusted R-squared and predicted R-square provide information that helps you assess the number of predictors in your model:

  • Use the adjusted R-square to compare models with different numbers of predictors
  • Use the predicted R-square to determine how well the model predicts new observations and whether the model is too complicated

Regression analysis is powerful, but you don’t want to be seduced by that power and use it unwisely!

Did the NBA Finals End Tuesday?


Miami Heat

My family moved to Los Angeles in 1987, just as the Los Angeles Lakers were in the midst of winning back-to-back championships. While I don’t consider myself a huge basketball fan, the NBA finals always hold some interest for me. If you get to watch James Worthy, Michael Cooper, Byron Scott, A.C. Green, Magic Johnson, and Kareem Abdul-Jabbar win championships, it sticks with you.

So now that the Spurs and Heat are competing for the 2013 edition of the NBA championships, I get a little drawn in by the excitement. One of the interesting occurrences from this year’s finals is that the two teams have split the first two games of the series. Listening to ESPN radio, Mike Greenberg mentioned that when the NBA finals are tied 1-1, the winner of game 3 wins the series 92% of the time.

This 92% statistic that’s been such a hot topic this week uses data from the 1985 finals onward, when the series switched to a 2-3-2 format (two home games, three away games, then two home games for the team with the better record). So I was curious to see whether I could spot a different trend before and after 1985.

I went back to 1967, when the NBA playoffs expanded to 8 teams for the first time. (If you’re into trivia, it was also the first time in 11 years that the Celtics weren’t in the finals.) Since 1967, teams split the first two games in 24 NBA championships.

A graphical analysis of data is always a good way to start. Because the NBA finals have a time component to them, a time series plot lets us check for patterns over time. Plotting the data in time order shows the trend that has been such a strong point of conversation since the Heat won game 2 of the finals. The time series plot below shows a 1 if the team that won game 3 also won the series and a 0 if the team that lost game 3 won the series. The gaps are for years where a team went up 2 games to none.

Game 3 winners

The time series plot illustrates how strong the advantage has been for the game 3 winners. You can see the comparisons that are in the table below:

 

                                 Game 3 winners    Game 3 losers
NBA Championships                      21                 3
NBA Championships before 1985           9                 2
NBA Championships since 1985           12                 1
Longest streak                         11                 1

Three NBA finals that have been tied 1-1 are particularly easy for me to recall: the 2012 series when the Heat defeated the Thunder in 5 games, the 2001 finals when the Lakers defeated the 76ers in 5 games, and the 1991 finals when the Bulls defeated the Lakers in 5 games. In each of those finals, the team that won the series lost the first game. Because those finals stood out to me, I wondered whether you could find another pattern in series that are tied 1-1: does the eventual winner win the second game more often than the first game?

The following time series plot shows a 1 if the game 1 winner won the series, and a 0 if the game 2 winner won the series.

Game 1 and 2 winners

If that doesn’t look like much to you, you’re right. There are no streaks long enough to seem unusual, but there are enough short runs that the back-and-forth doesn’t seem strange either. In terms of the number of finals victories, the two groups are close however you slice them. Here’s a table with some comparisons:

 

                                 Game 1 winners    Game 2 winners
NBA Championships                      13                11
NBA Championships before 1985           7                 4
NBA Championships since 1985            6                 7
Longest streak                          3                 3
Game 3 victors who lost finals          1                 2

Even the cases where teams won game 3 and went on to lose the finals (the blue diamonds on the plot) aren’t all on the same side. The plot suggests that it doesn’t matter which of the first two games you win, as long as you get to 2 wins first.

This year, the Spurs won game 3. While a game 4 victory gives the Heat renewed hope, history is still on the side of Tony Parker, Tim Duncan, and Manu Ginobili in their quest to win a fourth championship together. It also pushes them closer to the record for playoff wins by three teammates playing together, currently 110 and held by my favorite trio: Magic Johnson, Kareem Abdul-Jabbar, and Michael Cooper.

Finding yourself with a new interest in time series plots? You can check out some quick tips for the time series plot and some other graphs on the Lessons from Minitab Help page!

The image of the Miami Heat is by Keith Allison and licensed for reuse under this Creative Commons License.

How to Create and Read an I-MR Control Chart


When it comes to creating control charts, it's generally good to collect data in subgroups, if possible. But sometimes gathering subgroups of measurements isn't an option. Measurements may be too expensive. Production volume may be too low. Products may have a long cycle time.

In many of those cases, you can use an I-MR chart. Like all control charts, the I-MR chart has three main uses: 

  1. Monitoring the stability of a process.
    Even very stable processes have some variation, and when you try to fix minor fluctuations in a process you can actually cause instability. An I-MR chart can alert you to changes that reveal a problem you should address.
     
  2. Determining whether a process is stable and ready to be improved.
    When you change an unstable process, you can't accurately assess the effect of the changes. An I-MR chart can confirm (or deny) the stability of your process before you implement a change. 
     
  3. Demonstrating improved process performance.
    Need to show that a process has been improved? Before-and-after I-MR charts can provide that proof. 

The I-MR is really two charts in one. At the top of the graph is an Individuals (I) chart, which plots the values of each individual observation, and provides a means to assess process center.

I chart

The bottom part of the graph is a Moving Range (MR) chart, which plots process variation as calculated from the ranges of two or more successive observations. 

MR Chart

The green line on each chart represents the mean, while the red lines show the upper and lower control limits. An in-control process shows only random variation within the control limits. An out-of-control process has unusual variation, which may be due to the presence of special causes.
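
If you're curious where those red and green lines come from, the control limits for an I-MR chart are based on the average moving range and the standard constants for subgroups of size two (d2 = 1.128, so 3/d2 ≈ 2.66, and D4 = 3.267). Here's a small sketch of the arithmetic, not Minitab's code:

```python
import numpy as np

def imr_limits(x):
    """Center lines and control limits for an I-MR chart,
    using a moving range of length 2 (constants 2.66 = 3/d2 and D4 = 3.267)."""
    x = np.asarray(x, dtype=float)
    mr = np.abs(np.diff(x))            # moving ranges of consecutive observations
    x_bar, mr_bar = x.mean(), mr.mean()

    i_chart = {"center": x_bar,
               "ucl": x_bar + 2.66 * mr_bar,
               "lcl": x_bar - 2.66 * mr_bar}
    mr_chart = {"center": mr_bar,
                "ucl": 3.267 * mr_bar,
                "lcl": 0.0}
    return i_chart, mr_chart

# Example: i_lim, mr_lim = imr_limits(ph_measurements)
```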

Creating the I-MR Chart

Let's say you work for a chemical company, and you need to assess whether the pH value for a custom solution is within  acceptable limits. The solution is made in batches, so you can only take one pH measurement per batch and the data cannot be subgrouped. This is an ideal situation for an I-MR chart. 

pH data

So you measure pH for 25 consecutive batches. Preparing this data for the I-MR chart couldn't be easier: just  list your measurements in a single column, in the order you collected them. (To follow along, please download this data set and, if you don't already have it, the free trial of our statistical software.) 

Choose Stat > Control Charts > Variables Charts for Individuals > I-MR and select pH as the Variable. If you enter more than one column in Variables, no problem -- Minitab will simply produce multiple I-MR charts. Dialog box options let you add labels, split the chart into stages, subset the data, and more.

You want to catch any possible special cause of variation, so click I-MR Options, and then choose Tests. Choose "Perform all tests for special causes," and then click OK in each dialog box.

tests for special causes

The tests for special causes detect points beyond the control limits and specific patterns in the data.

  • When an observation fails a test, Minitab reports it in the Session window and marks it on the I chart. A failed point indicates a nonrandom pattern in the data that should be investigated.
     
  • When no points are displayed under the test results, no observations failed the tests for special causes.

Interpreting the I-MR Chart, part 1: The MR Chart

Here's the I-MR chart for your pH data: 

I-MR Chart of pH

First examine the MR chart, which tells you whether the process variation is in control. If the MR chart is out of control, the control limits on the I chart will be inaccurate. That means any lack of control in the I chart may be due to unstable variation, not actual changes in the process center. If the MR chart is in control, you can be sure that an out-of-control I chart is due to changes in the process center.

Points that fail Minitab's tests are marked with a red symbol on the MR chart. In this MR chart, the lower and upper control limits are 0 and 0.4983, and none of the moving ranges fall outside those limits. The points also display a random pattern. So the process variation is in control, and it is appropriate to examine the I chart.

Interpreting the I-MR Chart, part 2: The I Chart

The individuals (I) chart assesses whether the process center is in control. Unfortunately, this I chart doesn't look as good as the MR chart did: 

I chart of pH

Minitab conducts up to eight special-cause variation tests for the I chart, and marks problem observations with a red symbol and the number of the failed test. The graph tells you three observations failed two tests. The Minitab Session Window tells you why each point was flagged: 

Test Results for I Chart

Observation 8 failed Test 1, which flags points more than 3 standard deviations from the center line -- the strongest evidence that a process is out of control. Observations 20 and 21 failed Test 5, which flags two out of three consecutive points on the same side of the center line that fall more than two standard deviations from it. Test 5 provides additional sensitivity for detecting smaller shifts in the process mean.

This I-MR chart indicates that the process average is unstable and the process is out of control, possibly due to the presence of special causes.

Now What? 

The I-MR chart for pH may not be what you wanted to see, but now you know there may be a problem that needs to be addressed. That's the whole purpose of the control chart!  Next, you can try to identify and correct the factors contributing to this special-cause variation. Until these causes are eliminated, the process cannot achieve a state of statistical control.

 

 

 

LeBron vs. Jordan: Is There a Comparison Yet?


LeBron James

LeBron James has just captured his 2nd NBA Championship in as many years, and has secured himself a place as one of the greatest basketball players of all time. And he even did so by overcoming the “Winner of Game 3 wins the series 92% of the time” odds.

With the victory, there is a 99% chance the “LeBron is a choker and can’t win the big one” narrative is dead and gone (I say 99% because I’ll never underestimate the ability of Skip Bayless to find a new way to beat a dead horse). But that means that there is another narrative that is going to start being thrown around.

Is LeBron James better than basketball’s greatest player of all time….Michael Jordan?

Six championships, 10-time NBA scoring champion, never lost in the finals, multiple game-winning shots, scored 38 points while he had the flu…the list of Michael Jordan’s accomplishments goes on and on. A comparison of LeBron to Jordan is almost unfair. Actually, it is unfair! To simply say “Get back to me when LeBron has 6 championships” is a lazy and flawed argument. Jordan had his entire career to compile all those championships. At 28 years old, LeBron is only partway through his. In fact, do you know how many NBA Championships Jordan had at 28? Just one! So to accurately compare the two players, I figure we need to do one of the following:

  • Obtain a time machine to travel into the future and see what LeBron does between now and the end of his basketball career.
  • See what Jordan had done when he was at the same point in his career as LeBron is now.

I did spend some time looking for a time machine, but I wasn’t able to locate one. So it looks like we’re going to have to go with the second option.

Because both players came into the NBA at different ages, I’m not going to use what Jordan had done by the time he was 28. Instead, I’m going to use the number of seasons they’ve been in the league. LeBron has been in the NBA for 10 seasons, so I’m only going to look at what Jordan did in his first 10 seasons.

Round 1: How far did you lead your team?

Let’s start by seeing how far each player went in the playoffs throughout their first 10 seasons. We’ll use Minitab to make a pie chart to display the results. You can get the data here to follow along, and grab the free trial of our statistical software if you don't already have it.

Pie Chart

After 10 seasons, Jordan had only 3 of his 6 championships under his belt. Michael actually started his career with 3 straight first-round exits and just one playoff win. His next three years all included playoff losses to the Detroit Pistons (once in the 2nd round and twice in the Eastern Conference Finals). The last “non-championship” season was in 1995, after his first 3 championships. He returned from retirement with 17 games left in the season and immediately led the Bulls on a 13-4 run going into the playoffs. But they lost to the Orlando Magic in the 2nd round.

As for LeBron, his pie chart has something on it that Michael never did: a season without a playoff appearance. In fact, LeBron didn’t make the playoffs his first two seasons in the league. LeBron does have more NBA Finals appearances, but he failed to win two of those. Losing in the Finals is also something Jordan never did.

Overall, Jordan doesn’t have as big of an advantage as one might expect here. But he does have one more championship and 0 seasons where he missed the playoffs. Round 1 goes to Jordan.

Round 2: Who can score more points?

You can’t win unless you score more points than the other team, so let’s look at the scoring averages for each player. I’m sticking to only playoff games, since that is where legacies are made. I also used points per 36 minutes, so that overtime games don’t cloud the results. Here is a time series plot of how each player performed in the playoffs as their careers progressed. (LeBron missed the playoffs his first two seasons, which is why he has two fewer observations.)

Time Series Plot

Jordan clearly looks like the better player here. His rookie season beats out all but 3 of LeBron’s seasons. And LeBron has never been able to top Jordan when they were at similar points in their career.

Another interesting thing to note is that after his first two seasons, Jordan pretty consistently scored about 30 points per 36 minutes in the playoffs. Meanwhile, LeBron is all over the place. That inconsistency is part of the reason he got so much criticism in the past. People expect great players to be great all the time. And if they’re not, well, just ask LeBron James what it was like to be him 2 seasons ago.

Round 2 goes to Jordan.

Round 3: How efficient were you?

How many points you score doesn’t always tell the entire story. If you score 37 points, but it takes you 50 shots to do so, then it really isn’t all that impressive. You didn’t have a great game, you just shot the ball a lot. So to look at each player’s efficiency, I’m going to use a stat called the Player Efficiency Rating (PER). It’s a measure of per-minute production standardized so that the league average is 15. Michael might have scored more than LeBron, but was he more efficient in doing so? Again, I’m only using playoff games.

Time Series Plot

Jordan isn’t as dominant as he was in the previous plot, but he still looks like the winner. Michael’s least efficient playoff performance still beats half of LeBron’s. And at similar points in their career Jordan wins 5 of the 8 seasons.

But James does have the most efficient playoff performance of either player. It was the 2009 playoffs, when he led Cleveland to the Eastern Conference Finals only to be upset by the Orlando Magic. In those playoffs, James had three 40-point games and never scored fewer than 25 points. And he did it all while shooting over 50%. Plus, in an elimination game against Orlando he had a triple double and staved off elimination for the Cavs (at least for one game). But after a playoff performance that far exceeded anything Jordan did, LeBron started to get stuck with the “unclutch” label. Sports media everybody!

But still, Jordan has the slight edge here.

Round 4: Can LeBron at least win the regular season?

So far we’ve only looked at what each player has done in the postseason. But it would be a mistake to completely ignore the huge sample of games in the regular season, so let’s look at each player’s PER in the regular season.

Time Series Plot

You can clearly see how both players improved in the first few seasons of their career. But Jordan started out playing much better early in his career than LeBron. Then both players followed similar paths after their first 5 seasons. And yes, Jordan has a dip in season 10, but that was when he returned from retirement for 17 regular season games after not playing basketball for 2 years. I wouldn’t put too much stock in that data point.

Considering Jordan wins 7 of the 10 seasons and came into the league playing at a much higher level, he wins yet another round.

Conclusion: It’s really, really hard to be like Mike

LeBron James is definitely one of the best players ever to play the game. But this statistical analysis shows how hard it is to match what Michael Jordan did. And keep in mind that after his 10th season, all Jordan did was win three more titles in a row. LeBron is young enough that he is more than capable of matching that feat. But even if he does, he’ll be hard pressed to do it with the same efficiency that Jordan did.

So at this point in his career, LeBron doesn’t quite measure up to Michael Jordan, and it’s doubtful he ever will. But that doesn’t take away from how great LeBron really is. While he may not be the greatest player of all time, he’s currently the greatest player in the world. So instead of expecting him to live up to unreasonable expectations, we should just sit back, relax, and enjoy the show.

Photograph "Lbjheat" by Keith Allison.  Licensed under Creative Commons Attribution ShareAlike 2.0.

Seven Basic Quality Tools to Keep in Your Back Pocket


Denim

Here are seven quality improvement tools I see in action again and again. Most of these quality tools have been around for a while, but that certainly doesn’t take away any of their worth!

The best part about these tools is that they are very simple to use and work with quickly in Minitab Statistical Software or Quality Companion, but of course you can use other methods, or even pen and paper.

1. Fishbone Diagram

Fishbones, or cause-and-effect diagrams, help you brainstorm potential causes of a problem and see relationships among potential causes. The fishbone below identifies the potential causes of delayed lab results:

Fishbone in Quality Companion

On a fishbone diagram, the central problem, or effect, is on the far right. Affinities, which are categories of causes, branch from the spine of the central effect. The brainstormed causes branch from the affinities.

2. Control Chart

Common Components of a Control chart

Control charts are used to monitor the stability of processes, and can turn time-ordered data for a particular characteristic—such as product weight or hold time at a call center—into a picture that is easy to understand. These charts indicate when there are points out of control or unusual shifts in a process.

(My colleague has gone into further detail about the different types of control charts in another post.)

3. Histogram

You can use a histogram to evaluate the shape and central tendency of your data, and to assess whether or not your data follow a specific distribution such as the normal distribution.

Bars represent the number of observations falling within consecutive intervals. Because each bar represents many observations, a histogram is most useful when you have a large amount of data.

Minitab Histogram

4. Process Map

A process map, sometimes called a flow chart, can be used to help you model your process and understand and communicate all activities in the process, the relationships between inputs and outputs in the process, and key decision points.

Process Map in Quality Companion

Quality Companion makes it easy to construct high-level or detailed flow charts, and there’s also functionality to assign variables to each shape and then share them with other tools you’re using in Companion.

5. Pareto Chart

Pareto charts can help you prioritize quality problems and separate the “vital few” problems from the “trivial many” by plotting the frequencies and corresponding percentages of a categorical variable, which shows you where to focus your efforts and resources.

Minitab Pareto Chart

For a quick and fun example, download the 30-day trial version of Minitab (if you don’t have it already), and follow along with Pareto Power! or Explaining Quality Statistics So Your Boss Will Understand: Pareto Charts.
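
If you'd like to see the mechanics behind a Pareto chart outside of a dedicated package, the sketch below builds one from hypothetical defect counts (the categories and numbers are made up for illustration): sort the categories by frequency, then overlay the cumulative percentage.

```python
import matplotlib.pyplot as plt

# Hypothetical defect counts -- not data from this post.
defects = {"Scratch": 52, "Dent": 38, "Crack": 21, "Stain": 9, "Other": 5}

labels, counts = zip(*sorted(defects.items(), key=lambda kv: kv[1], reverse=True))
total = sum(counts)
cum_pct = [100 * sum(counts[:i + 1]) / total for i in range(len(counts))]

fig, ax1 = plt.subplots()
ax1.bar(labels, counts)                  # frequencies, largest category first
ax1.set_ylabel("Count")
ax2 = ax1.twinx()
ax2.plot(labels, cum_pct, marker="o")    # cumulative percentage line
ax2.set_ylim(0, 110)
ax2.set_ylabel("Cumulative %")
plt.title("Pareto chart of defect types")
plt.show()
```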

6. Run Chart

You can use a run chart to display how your process data changes over time, which can reveal evidence of special cause variation that creates recognizable patterns.

Minitab's run chart plots individual observations in the order they were collected, and draws a horizontal reference line at the median. Minitab also performs two tests that provide information on non-random variation due to trends, oscillation, mixtures, and clustering -- patterns that suggest the variation observed is due to special causes.

Run Chart Example

7. Scatter Plot

You can use a scatter plot to illustrate the relationship between two variables by plotting one against the other. Scatterplots are also useful for plotting a variable over time.

Minitab Scatterplot

What quality tools do you keep in your back pocket?


Using Design of Experiments to Minimize Noise Effects


"Variability is the enemy"All processes are affected by various sources of variations over time. Products which are designed based on optimal settings, will, in reality, tend to drift away from their ideal settings during the manufacturing process.

Environmental fluctuations and process variability often cause major quality problems. Focusing only on costs and performance is not enough; sensitivity to deterioration and process imperfections is an important issue as well. It is often not possible to completely eliminate variation due to uncontrollable factors (such as temperature changes, contamination, humidity, dust, etc.).

For example, the products sold by your company might go on to be used in many different environments across the world, or in many different (and unexpected) ways, and your process (even when it has been very carefully designed and fine-tuned) will tend to vary over time.

"Variability is the enemy."

The most efficient and cost effective solution to that issue is to minimize the impact of these variations on your product's performance.

Even though variation in the inputs will continue to occur, the amount of variability that is “transmitted” to the final output can be reduced. But the major sources of variation need to be identified in order to study the way in which this variability “propagates” through the system. The ultimate objective is to reduce the amount of variability (from the inputs) that affects your final system.

Noise and control effects on the final output

It is possible to assess the effects of noise factors (uncontrollable variations in the inputs) using ANOVA, a design of experiments (DOE), or a regression analysis. Some parameters that are easily controllable in your system (control factors) may interact with these noise effects. Interacting, in this instance, means that a noise factor's effect may be modified by a controllable factor. If that is the case, we can use this noise*control interaction to minimize the noise effects.

Two Approaches to Understanding and Reducing Variability

There are actually two ways in which we can reduce the amount of variability that affects your process:

  1. Use non-linear effects: Response surface DOEs may be used to study curvatures and quadratic effects. In the graph below, you can see that factor B has a quadratic effect on the Y response.

    Non linear effect on the Y output
    The slope of the curve is much steeper for low values of B (B-), with a shallow slope for larger values of B (B+). The shallow part of the curve is the so-called “sweet spot.” Although the variations of B at the – or at the + level are strictly equivalent, the amount of variability that is propagated to Y is much smaller at B+ (the sweet spot). Setting B at its + level can make the process more robust to variations in the B variable.

     
  2. Use Interaction effects: The next graph shows the interaction of a noise factor (B) with an easily controllable factor (C).

    Noise by Control interaction

    The slopes of the two lines represent the linear effect of the noise factor (B), and the difference between the two lines represents the controllable factor effect (C). Please note that the B (Noise) effect is much smaller when C (the controllable factor) is set at its – level (C-). Therefore one can use the C factor to minimize the B (noise) effect, thereby making the process more robust to the fluctuations of B.
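
Here's a rough numerical illustration of that second idea (a toy model of my own, not the recliner data that follows). Suppose the response follows Y = 5 + (2 + 1.8·C)·B, so the slope on the noise factor B depends on the control factor C (coded -1/+1). Feeding the same input variation through the model at each setting of C shows how much less of it reaches the output when C is set at its - level:

```python
import numpy as np

rng = np.random.default_rng(seed=3)
B = rng.normal(0, 1, 10_000)        # uncontrollable noise factor (same draws for both settings)

def response(B, C):
    # Toy model with a noise*control interaction: the slope on B depends on C.
    return 5 + (2 + 1.8 * C) * B

for C in (-1, +1):
    y = response(B, C)
    print(f"C = {C:+d}: standard deviation of Y = {y.std():.2f}")
# Identical variation in B produces far less output variation when C is at its - level.
```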

A Manufacturing Example

The objective in this example is to improve the robustness of an automotive part (a recliner) to load changes.

The noise factor is the load. Two levels have been tested: no load (Noise -) and a large load (Noise +). There are also three easily controllable factors (Type of Grease, Type of Bearing, and Spring Force), each with two levels. The design is a full 2³ DOE that has been replicated twice (8 × 2 = 16 runs for each level of the noise factor).

The response is the acceleration signal. The runs have been performed with no load (Noise at level -) and with a large load (Noise at level +). The Noise effect is the difference between those two runs (Noise Effect = (“Noise +”) – (“Noise –“)).

The mean effect of the noise factor is 1.9275, and the goal is to minimize this noise effect. It's also important to minimize the amount of acceleration.

The analysis of the experimental design has been performed using Minitab Statistical Software. Two responses have been studied: the Acceleration Signal and the Noise (Load) effect, both of which are important to optimizing the system.

The Design of experiments array is pictured below. For each line of the Minitab worksheet, a noise effect has been calculated (Effect = (“Noise +”) – (“Noise –“)). An acceleration signal Mean has also been computed.

The Design of Experiments array
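
Outside Minitab, those two derived responses are easy to compute from the raw runs. The sketch below uses pandas with illustrative column names and placeholder measurements, not the actual worksheet:

```python
import pandas as pd

# One row per run: a full 2^3 design in the control factors, run at both load levels.
# 'signal' holds placeholder acceleration measurements, not the real data.
runs = pd.DataFrame({
    "Grease":  [-1, -1, 1, 1] * 4,
    "Bearing": [-1, 1, -1, 1] * 4,
    "Spring":  [-1] * 8 + [1] * 8,
    "Load":    ([-1] * 4 + [1] * 4) * 2,
    "signal":  [float(i) for i in range(16)],
})

# Average over replicates (if any) at each control setting and load level,
# then form the two derived responses.
summary = runs.pivot_table(index=["Grease", "Bearing", "Spring"],
                           columns="Load", values="signal", aggfunc="mean")
summary["NoiseEffect"] = summary[1] - summary[-1]      # ("Noise +") - ("Noise -")
summary["Mean"] = (summary[1] + summary[-1]) / 2
print(summary)
```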

The following Pareto chart shows the factor effects on the acceleration signal mean. Bearing, Grease, and Spring Force are all significant, and two interactions (AB and AC) are also significant (above the red significance threshold).

Pareto of the effects on the mean

The cube plot shows that a Type -1 bearing combined with a Type 1 grease leads to a low acceleration signal:

Cube plots of effects on mean

The interaction plot illustrates the effect of the Bearing * Grease interaction on the final output Mean response. When Bearing is set at its –1 level, Grease has almost no effect on the acceleration signal.

Interaction effects on the mean response

Impact of Controllable Factors on Noise Effects

The next Pareto chart is used to study the controllable factor effects when the noise effect itself is treated as the response. A, B, and the AB interaction are all significant (above the red significance threshold).

Pareto of the effects on the noise output

The interaction plot illustrates the Bearing*Grease interaction effect on the noise effect response: a Type -1 Bearing leads to much smaller noise effects, and when Bearing is set at its –1 level, Grease has almost no impact on the Noise effect.

Control factors interaction on the noise effect

Conclusion

This DOE analysis shows that selecting a Type -1 Bearing can substantially reduce the acceleration signal and the noise effect. Therefore the system is now much more robust to loading changes. This was the final objective. I decided to study the effect of the controllable factors on the noise effects directly rather than use more complex responses (such as Taguchi’s Signal To Noise ratio), as I thought it would be easier to understand and more explicit.

I hope this example illustrates how Design of Experiments can be a powerful tool to make processes and products less sensitive to variations in their environment.

 

Correlation, Causation, and Remorse for my NBA Finals Prediction


Heat fan Harry Truman mocks my prediction.

Let me be direct. Last week I wrote that “while a game 4 victory gives the Heat renewed hope, history is still on the side of Tony Parker, Tim Duncan, and Manu Ginobili in their quest to win a fourth championship together.”

The implication is clear. I thought that because the Spurs had been the first team to get to two victories in the 2013 NBA finals, they were going to be the first team to get to 4 wins.

As any competent biostatistician with a British Science Association media fellowship can tell you, “correlation is not causation.” I promise it was just a momentary lapse.

Here's a list of things I don't routinely do:

My one consolation is that it was really easy to say that because game 3 winners have been successful in the past, the same would hold true in the 2013 NBA finals. A comprehensive analysis of the NBA finals after game 3 would take into account position-by-position matchups, average scoring, home-court advantages, and even variables that are barely measurable, such as coaching ability and Tony Parker’s health.

When there are so many factors to consider, the temptation to rely on a crutch to explain the randomness in the world is close to irresistible. I think that's the same reason it's so tempting to sell stock when an original AFL team wins the Super Bowl. We know that there's so much randomness in the world that it's comforting to believe that the same factors that explain what will happen to the stock market somehow also affect sports champions.

So I got fooled by the NBA finals, which was disappointing. I was planning a nice blog post that would explain that correlation is not causation. I was going to talk about how even though the Spurs had won the series, it wasn’t because they had won game 3 -- it was because they had won 4 games out of 7. Otherwise, why play the games? Otherwise, how could you explain what happened to the Heat in 2011, or the Lakers in 1984, or the Sonics in 1978?

Otherwise, how could you explain what happened to the Spurs in 2013?

Coach Bill Belichick: A Statistical "Hoodie" Analysis, Part 1


Bill Belichick in a rare, non-hoodied moment.

by Bob Yoon, guest blogger

As a longtime Boston sports fan, these past 12 years have spoiled me for the rest of my life—seven titles amongst all four of the major sports teams and over 30 playoff berths. This era of dominance began with the New England Patriots, and the one man at the center of the team’s ascent to greatness is Coach Bill Belichick. Over the years, his choice in wardrobe has garnered just as much attention as his mastery of football strategy and tactics. His grey hoodie with the cutoff sleeves has become synonymous with his savvy, if eccentric, football acumen. 

Recently, with the offseason lull between minicamp and training camp looming, a noted Patriots blogger named Mike Dussault (@PatsPropaganda) wondered aloud via Twitter:

Suddenly, visions of hypothesis testing via Minitab Statistical Software leapt into my mind, and I responded to Mike immediately after seeing his tweet. We began exchanging emails. Mike and I put together a data collection strategy through a shared Google spreadsheet. I asked for the following columns:

Date
Opponent
What BB wore
Points scored
Points allowed
Anything else Mike wanted to add

Mike then spent a considerable amount of time trying to figure out what the coach wore for each game. He scoured archival footage of games and press conferences, and even crowdsourced on Twitter. In the end, he had an answer for 160 out of about 180 games. The spreadsheet can be found here. As you can see, the spreadsheet was not in a Minitab-friendly format:

google spreadsheet

Also, I decided to find a primary metric that was continuous in order to keep things as simple as possible for the analysis. I determined that “point difference” (Patriots score minus the opponent’s score) would be the best metric for this. Ultimately, my first Minitab worksheet looked like this:

Minitab worksheet

I cannot think of another active coach in all of sports whose wardrobe choices are discussed as much as Belichick’s. Some natural hypotheses came to mind immediately. Some of the most notable losses in the Patriots’ history came when Belichick wore something other than his grey hoodie.

Before the start of Super Bowl XLII, many of us gasped audibly when we saw the coach come out in a red hoodie.  (See it here.) 

The rest of that game is history – and one of the greatest upsets in professional sports. Mike and I had this in the back of our minds when I embarked on analyzing this data.

First, let’s look at the iconic grey hoodie. According to our data, Coach Belichick first donned the grey hoodie in 2003, the season when he won his second Super Bowl. He began cutting off the sleeves in 2005. In 2012, the NFL switched from Reebok to Nike, so Belichick had to start wearing a new Nike hoodie. Are there any differences associated with these changes?

I first looked at the Nike grey hoodie versus the Reebok grey hoodie. There weren’t enough samples of the Nike hoodie for me to use a Two-Sample T-Test, so I used a Two Proportions Test based on wins:

Test for Two Proportions

When I looked at the grey hoodie with sleeves vs. without, I was able to run both a Two-Sample T and a Two Proportions test:

Two-sample T-test

Two Proportions Test

Since all three hypothesis tests failed to reject the null hypothesis, we could not conclude that there was a difference in Patriots wins for Nike vs. Reebok grey hoodies, or for sleeved vs. sleeveless grey hoodies.
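
For anyone who wants to replay these comparisons outside Minitab, here is a rough Python sketch of the same two kinds of tests. The win counts and point differences below are placeholders, since the post only shows the Minitab output, and the methods here (a normal-approximation z-test and Welch's t-test) are not necessarily the exact procedures Minitab uses.

```python
from scipy import stats
from statsmodels.stats.proportion import proportions_ztest

# Placeholder counts: [Nike grey hoodie, Reebok grey hoodie]
wins = [10, 90]
games = [14, 120]

# Two proportions test on winning percentage
z_stat, p_prop = proportions_ztest(count=wins, nobs=games)

# Two-sample t-test on point difference (placeholder samples)
sleeved_diff = [7, 21, -3, 14, 10, 3]
sleeveless_diff = [17, -7, 24, 6, 12, 28, -1]
t_stat, p_t = stats.ttest_ind(sleeved_diff, sleeveless_diff, equal_var=False)

print(f"two proportions p = {p_prop:.3f}, two-sample t p = {p_t:.3f}")
```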

The connection between wins and Belichick’s sartorial choices had to depend on something other than hoodie manufacturer or the presence of sleeves. So I started a new worksheet, combined all those factors under “grey hoodie,” and prepared for my next analysis...which I'll share tomorrow.

 

About the Guest Blogger:

Bob Yoon is a certified Lean Six Sigma Black Belt and has been a continuous improvement engineer at Kraft Foods in Champaign, Ill. since 2011. He is committed to promoting and leading continuous improvement in his work and is a dedicated Boston sports fan. Bob can be reached on Twitter at @patscognoscente.

 

Would you like to publish a guest post on the Minitab Blog? Contact publicrelations@minitab.com.

 

Photo by Keith Allison, used under a creative commons 2.0 license

Coach Bill Belichick: A Statistical "Hoodie" Analysis, Part 2


Bill Belichick in a blue, sleeveless hoodie.

by Bob Yoon, guest blogger

Yesterday's post shared how an analysis of Bill Belichick's hoodie-wearing patterns found no statistically significant difference in New England Patriots wins whether he wore sleeved or sleeveless hoodies, or whether the hoodie was from Reebok or Nike.

Since these hypothesis tests failed to reject the null hypothesis, I combined these factors under “grey hoodie” and started a new Minitab worksheet.

But when I took a look at all the different outfits Belichick wore, there were still too many variables for a good analysis. I then decided to split this category into two: Type and Color.

I ran a series of One-way ANOVA tests with Type as my X against the following Y’s: Point Difference, Patriots Points Scored, and Opponent Points Scored. The results of each analysis appear below:

One-Way ANOVA

Boxplot

One Way ANOVA

Box Plot

One Way ANOVA

Box Plot

What jumped out was how dominant the Patriots were when Coach Belichick wore a winter coat. The point difference mean was an astonishing 24.5, with their defense holding opponents to an average of only 10.143 points.

When it came to color, the results were somewhat in line with what we were expecting:

One Way ANOVA

Box Plot

Remembering back to that dreadful red hoodie, I ran some Chi-Square Tests in order to check the actual Won/Loss record against the expected Won/Loss record:

 

Chi-Square Analysis

 

There were more wins than expected when Coach Belichick wore the color gray (63-15 vs. 58.39-19.61). When he wore red (or white, for that matter), the Patriots lost more than expected (6-5 vs. 8.23-2.77).
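
As a rough illustration of the same kind of test in Python: the sketch below collapses the colors to just the two groups quoted above (grey vs. red/white), so its expected counts won't match the ones from the full Minitab table, which isn't reproduced here.

```python
from scipy.stats import chi2_contingency

#            wins  losses
observed = [[63,   15],    # grey
            [ 6,    5]]    # red or white

chi2, p, dof, expected = chi2_contingency(observed)
print("expected counts:\n", expected.round(2))
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```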

I presented all this to Mike through a couple of emails, careful to leave out as much statistical jargon as possible. He posted excerpts of my emails on his blog:

http://patspropaganda.com/post/52906505084/statistical-hoodie-analysis-from-bob-yoon

Mike then incorporated these findings in a Bleacher Report article on the history of Belichick and his hoodie:

http://bleacherreport.com/articles/1668165-cutting-off-the-sleeves-the-history-of-bill-belichick-and-his-hoodie

In all, this was a lot of fun for Mike and me. But we don’t truly believe that the grey hoodie has any mystical powers. After all, the R2 values above were all extremely low for the ANOVAs that looked at color. The analyses by clothing type had only slightly higher R2 values.

Mike and I hypothesized that temperature might be a more critical X. Belichick’s Patriots are known as one of the best cold weather teams in history. Also, Coach Belichick rolls out his rather complex defensive playbook gradually throughout the year. These may explain the numbers we saw associated with games where the coach wore a winter jacket.

Since we had more than enough data for the article, I did not explore anything outside the scope of what the coach wore. There are so many different variables we could have considered, but again, we didn’t want the scope to creep into exploring what makes the Patriots win.

I hope you all enjoyed this application of Minitab Statistical Software on this minor obsession of Boston sports fans!
 

 

About the Guest Blogger:

Bob Yoon is a certified Lean Six Sigma Black Belt and has been a continuous improvement engineer at Kraft Foods in Champaign, Ill. since 2011. He is committed to promoting and leading continuous improvement in his work and is a dedicated Boston sports fan. Bob can be reached on Twitter at @patscognoscente.

 

Would you like to publish a guest post on the Minitab Blog? Contact publicrelations@minitab.com.

 

Photo by Brian J. McDermott, used under a creative commons 2.0 license

How to Interpret Regression Analysis Results: P-values and Coefficients


Regression analysis generates an equation to describe the statistical relationship between one or more predictor variables and the response variable. After you use Minitab Statistical Software to fit a regression model, and verify the fit by checking the residual plots, you’ll want to interpret the results. In this post, I’ll show you how to interpret the p-values and coefficients that appear in the output for linear regression analysis.

How Do I Interpret the P-Values in Linear Regression Analysis?

The p-value for each term tests the null hypothesis that the coefficient is equal to zero (no effect). A low p-value (< 0.05) indicates that you can reject the null hypothesis. In other words, a predictor that has a low p-value is likely to be a meaningful addition to your model because changes in the predictor's value are related to changes in the response variable.

Conversely, a larger (insignificant) p-value suggests that changes in the predictor are not associated with changes in the response.

In the output below, we can see that the predictor variables of South and North are significant because both of their p-values are 0.000. However, the p-value for East (0.092) is greater than the common alpha level of 0.05, which indicates that it is not statistically significant.

Table with regression p-values

Typically, you use the coefficient p-values to determine which terms to keep in the regression model. In the model above, we should consider removing East.

How Do I Interpret the Regression Coefficients for Linear Relationships?

Regression coefficients represent the mean change in the response variable for one unit of change in the predictor variable while holding other predictors in the model constant. This statistical control that regression provides is important because it isolates the role of one variable from all of the others in the model.

The key to understanding the coefficients is to think of them as slopes, and they’re often called slope coefficients. I’ll illustrate this in the fitted line plot below, where I’ll use a person’s height to model their weight. First, Minitab’s session window output:

Coefficients table for regression analysis

The fitted line plot shows the same regression results graphically.

Fitted line plot of weight by height

The equation shows that the coefficient for height in meters is 106.5 kilograms. The coefficient indicates that for every additional meter in height you can expect weight to increase by an average of 106.5 kilograms.

The blue fitted line graphically shows the same information. If you move left or right along the x-axis by an amount that represents a one meter change in height, the fitted line rises or falls by 106.5 kilograms. However, these heights are from middle-school aged girls and range from 1.3 m to 1.7 m. The relationship is only valid within this data range, so we would not actually shift up or down the line by a full meter in this case.

If the fitted line was flat (a slope coefficient of zero), the expected value for weight would not change no matter how far up and down the line you go. So, a low p-value suggests that the slope is not zero, which in turn suggests that changes in the predictor variable are associated with changes in the response variable.

I used a fitted line plot because it really brings the math to life. However, fitted line plots can only display the results from simple regression, which is one predictor variable and the response. The concepts hold true for multiple linear regression, but I would need an extra spatial dimension for each additional predictor to plot the results. That's hard to show with today's technology!
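
If you'd like to generate this kind of coefficient table yourself, here is a minimal Python sketch using simulated data (not the height-weight dataset shown above; the intercept and noise level are made up, with only the 106.5 slope borrowed from the example):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(seed=2)
height = rng.uniform(1.3, 1.7, 30)                        # meters (simulated)
weight = -114 + 106.5 * height + rng.normal(0, 4, 30)     # kilograms (simulated)

X = sm.add_constant(height)           # adds the intercept term
model = sm.OLS(weight, X).fit()
print(model.summary())                # coefficients, p-values, R-squared, and more
```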

How Do I Interpret the Regression Coefficients for Curvilinear Relationships and Interaction Terms?

In the above example, height is a linear effect; the slope is constant, which indicates that the effect is also constant along the entire fitted line. However, if your model requires polynomial or interaction terms, the interpretation is a bit less intuitive.

As a refresher, polynomial terms model curvature in the data, while interaction terms indicate that the effect of one predictor depends on the value of another predictor.

The next example uses a data set that requires a quadratic (squared) term to model the curvature. In the output below, we see that the p-values for both the linear and quadratic terms are significant.

Coefficients table for a regression model with a quadratic term

The residual plots (not shown) indicate a good fit, so we can proceed with the interpretation. But, how do we interpret these coefficients? It really helps to graph it in a fitted line plot.

Fitted line plot with a quadratic predictor

You can see how the relationship between the machine setting and energy consumption varies depending on where you start on the fitted line. For example, if you start at a machine setting of 12 and increase the setting by 1, you’d expect energy consumption to decrease. However, if you start at 25, an increase of 1 should increase energy consumption. And if you’re around 20, energy consumption shouldn’t change much at all.
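
One way to make that concrete is to differentiate the fitted equation. For Y = b0 + b1·X + b2·X², the expected change in Y for a one-unit increase in X is b1 + 2·b2·X, so the "slope" depends on where you are on the curve. The coefficients below are made up to mimic the curve described above, not taken from the actual output:

```python
# Hypothetical coefficients for: energy = b0 + b1*setting + b2*setting**2
b1, b2 = -8.0, 0.2

def marginal_effect(setting):
    """Expected change in energy consumption for a one-unit increase in the setting."""
    return b1 + 2 * b2 * setting

for s in (12, 20, 25):
    print(f"setting {s}: expected change per unit = {marginal_effect(s):+.1f}")
# Negative near 12, roughly zero near 20, positive near 25 -- matching the plot's shape.
```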

A significant polynomial term can make the interpretation less intuitive because the effect of changing the predictor varies depending on the value of that predictor. Similarly, a significant interaction term indicates that the effect of the predictor varies depending on the value of a different predictor.

Take extra care when you interpret a regression model that contains these types of terms. You can’t just look at the main effect (linear term) and understand what is happening! Unfortunately, if you are performing multiple regression analysis, you won't be able to use a fitted line plot to graphically interpret the results. This is where subject area knowledge is extra valuable!

Particularly attentive readers may have noticed that I didn’t tell you how to interpret the constant. I’ll cover that in my next post!

Be sure to:
