
Regression Analysis: How to Interpret S, the Standard Error of the Regression


R-squared gets all of the attention when it comes to determining how well a linear model fits the data. However, I've stated previously that R-squared is overrated. Is there a different goodness-of-fit statistic that can be more helpful? You bet!

Today, I’ll highlight a sorely underappreciated regression statistic: S, or the standard error of the regression. S provides important information that R-squared does not.

What is the Standard Error of the Regression (S)?
Illustration of residuals: S becomes smaller when the data points are closer to the line.

In the regression output for Minitab statistical software, you can find S in the Summary of Model section, right next to R-squared. Both statistics provide an overall measure of how well the model fits the data. S is known both as the standard error of the regression and as the standard error of the estimate.

S represents the average distance that the observed values fall from the regression line. Conveniently, it tells you how wrong the regression model is on average, using the units of the response variable. Smaller values are better because they indicate that the observations are closer to the fitted line.

fitted line plot of BMI and body fat percentage

The fitted line plot shown above is from my post where I use BMI to predict body fat percentage. S is 3.53399, which tells us that the average distance of the data points from the fitted line is about 3.5% body fat.

Unlike R-squared, you can use the standard error of the regression to assess the precision of the predictions. Approximately 95% of the observations should fall within plus/minus 2*standard error of the regression from the regression line, which is also a quick approximation of a 95% prediction interval.

For the BMI example, about 95% of the observations should fall within plus/minus 7% of the fitted line, which is a close match for the prediction interval.
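If you want to reproduce S outside of Minitab, here is a minimal sketch in Python (not Minitab's output; the file and column names are assumptions for illustration):

```python
# Minimal sketch: compute S, the standard error of the regression, for a simple
# linear regression, plus the rough 95% band of +/- 2*S around the fitted line.
# The file "bmi_bodyfat.csv" and its column names are hypothetical.
import numpy as np
import pandas as pd

df = pd.read_csv("bmi_bodyfat.csv")
x, y = df["BMI"].to_numpy(), df["BodyFat"].to_numpy()

b1, b0 = np.polyfit(x, y, 1)        # slope and intercept
residuals = y - (b0 + b1 * x)

n, p = len(y), 2                    # p = number of estimated coefficients
s = np.sqrt(np.sum(residuals**2) / (n - p))
print(f"S = {s:.5f} (% body fat)")
print(f"About 95% of observations fall within +/- {2*s:.2f} of the fitted line")
```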

Why I Like the Standard Error of the Regression (S)

In many cases, I prefer the standard error of the regression over R-squared. I love the practical intuitiveness of using the natural units of the response variable. And, if I need precise predictions, I can quickly check S to assess the precision.

Conversely, the unit-less R-squared doesn’t provide an intuitive feel for how close the predicted values are to the observed values. Further, as I detailed here, R-squared is relevant mainly when you need precise predictions. However, you can’t use R-squared to assess that precision, which ultimately limits its usefulness.

To illustrate this, let’s go back to the BMI example. The regression model produces an R-squared of 76.1% and S is 3.53399% body fat. Suppose our requirement is that the predictions must be within +/- 5% of the actual value.

Is the R-squared high enough to achieve this level of precision? There’s no way of knowing. However, S must be <= 2.5 to produce a sufficiently narrow 95% prediction interval. At a glance, we can see that our model needs to be more precise. Thanks, S!

Read more about how to obtain and use prediction intervals as well as my regression tutorial.


Minitab illustrates the need for fire safety


Airmen from the 18th Civil Engineer Squadron Fire Department conduct training at the Silver Flag area of Kadena Air Base, Japan, Dec. 2 during Beverly High 09-01, a local operational readiness exercise to test the readiness of Kadena's 18th Wing Airmen. (U.S. Air Force photo/Tech. Sgt. Rey Ramon)

The January/February issue of Men’s Health includes an article by Michael Perry, with photographs by Eric Ogden, titled “Voices from the Flames.” The article contains a lot of statistics about fires in contemporary America that I didn’t know. As a statistician, I like articles with statistics. While this article included a satisfying number of statistics, graphs that would make them easy to understand were absent. So in the interests of communicating the importance of fire safety, I thought I’d take a minute to make some graphs myself, inspired by some of the statistics that Perry uses. Communicating the meaning of statistics is one of the powerful purposes of Minitab.

Gender of Americans Who Die in Residential Fires

Perry reminds his target readers, presumably mostly men, that “of the nearly 2,500 Americans who die in residential fires every year, 57% are male.” That statistic is calculated from the National Fire Incident Reporting System database and can be found in this summary report about civilian fire fatalities. Of course, that information isn’t very meaningful without a little more context. If 57% of Americans were male, then it wouldn’t be surprising that 57% of civilian fire fatalities are male. However, according to the U.S. Census Bureau’s Quick Facts, the bureau estimated in 2012 that 50.8% of Americans are female. Here’s what that looks like:

Gender in the American population and in civilian fire deaths in America

See how the bars are close in height for the population, but not for civilian fire deaths? That suggests there’s a relationship between gender and fire fatalities that doesn’t bode well for those of us with Y chromosomes.

Deaths and Fire Sources

Perry’s article reports that “Candle fires kill 166 people a year,” which is not an insignificant number when you consider that 166 is more than 1 in 20 of the approximately 2,500 Americans who die in fires each year. But it’s really interesting when you put that figure in the context of some more data about fire sources and deaths. Here’s what that looks like:

Some sources of fires that lead to fatalities

Candles are certainly a cause for concern. No one should play with candles or leave them unattended. But in terms of risk, smoking seems like it deserves greater consideration. While smoking leads to very few fires relative to other causes, those fires are among the most likely to lead to a fatality. (See links at the end of the article for the data I graphed.)

Get Out Fast

Perry’s article points out how much shorter the time people have to escape a residential fire has become. “’In 1970, your average escape time was around 17 minutes,’ says Larry McKenna, a USFA researcher. ‘Today, it’s as little as three minutes.’” By themselves, those numbers sound dire, but put them on a graph and you get a striking picture of the difference.

Time to escape has changed from 17 minutes to 3 minutes.

What’s flashover, you might ask? I’m sure there’s a technical definition, but if you like visuals then you can’t do better than to watch the video from Underwriters Laboratories that gives the times that I recorded in the above graph.

It never would have occurred to me that fire safety wouldn’t be a consideration in making modern furniture, but apparently fire safety’s fallen by the wayside in favor of other considerations.

Summary

It’s easy for me to behave as if fires can only happen in other people’s homes, and that only careless people are in real danger from fires. Then along comes an article in popular media that exposes my ignorance about what fires are really like in contemporary America.

Let’s applaud Michael Perry for his use of statistics to draw attention to fire safety in his article. Even if you don’t read Perry’s article, I hope the Minitab graphs above will still inspire you to spend a few minutes with some of the reputable organizations on the Internet that promote fire safety:

 Ready.gov

U.S. Fire Administration

National Fire Protection Association

Bonus

Prefer some less grim Minitab graphs? Check out some graphing tips that celebrate Valentine's Day with Minitab.

Data sources for my second graph:

heating

smoking

electrical malfunction

cooking

candles

The image of the airmen from the 18th Civil Engineer Squadron Fire Department training is by Tech. Sgt. Rey Ramon and is in the public domain.

Setting the Stage: Accounting for Process Changes in a Control Chart


When looking at a control chart, it’s important to know that the data we are looking at is accurate. Let’s face it, if the control limits we are looking at don’t really reflect what’s actually happening in our process, what does it matter if our points fall within the limits, or a little bit outside?

Let’s take a trip down to the widget factory, where widgets are being produced in all shapes and sizes. We’re going to take a look at one particular widget, and the time it takes for that particular widget to be produced (in minutes).

In a Minitab Statistical Software worksheet, we have two months (March and April) of observations in a column, let's say C2. We also have a date column, indicating the date on which each widget was produced. This information is contained in our first column, C1.

Let’s quickly create an individuals control chart in Minitab by going to Stat > Control Charts > Variables Charts for Individuals > Individuals. Here is what we get:

ChartNoStage

At first glance, it looks like we have some work to do, especially in the month of April. There seem to be a number of out-of-control points that warrant further investigation. In fact, we have five points that fall above the upper control limit, some by a considerable distance.

However, from our process knowledge, we know that on April 1 we made a process change that, while increasing the time spent to create widgets, greatly improved their quality. Using data from before the process change could throw off our estimates of the process afterwards. Therefore, it doesn’t really make sense to look at these data together.

We could create two different control charts; however, we are interested in seeing how our data look both before AND after the process change was implemented. This is a perfect place to create what are known as stages. Essentially, we can get Minitab to estimate control limits separately for data points before and after the process change. In the I Chart menu, we can click ‘I Chart Options’ and navigate to the Stages tab.

This is how I filled out the dialog:

IChartDialog

Using my Date column, I told Minitab to create a new stage at the specific values that I indicate. In our case, since I want to start a new stage, I just type "April 1" within quotes.

We now get the following control chart:

chartwithstage

You'll notice the dashed blue line indicating the stage break. You can also see that Minitab has estimated a separate center line and control limits for each stage.

Now that we take into account that we have a known process change occurring on April 1, none of our data points are out of control! The five points we originally thought needed further investigation are actually in control, once we account for the process change that occurred. Because of this, it makes sense to estimate these data separately, and that’s what staging allowed us to do.
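If you're curious what staging does behind the scenes, here is a minimal sketch (not Minitab's implementation) that estimates individuals-chart limits separately for each stage; the data file, column names, and the assumed year are hypothetical:

```python
# Minimal sketch: estimate I-chart center line and 3-sigma limits separately
# for the data before and after a known process change on April 1.
# The file "widget_times.csv" and its columns "Date" and "Time" are hypothetical.
import numpy as np
import pandas as pd

df = pd.read_csv("widget_times.csv", parse_dates=["Date"])

def i_chart_limits(values):
    """Center line and control limits for an individuals chart, with sigma
    estimated from the average moving range divided by d2 = 1.128."""
    values = np.asarray(values, dtype=float)
    center = values.mean()
    sigma = np.abs(np.diff(values)).mean() / 1.128
    return center, center - 3 * sigma, center + 3 * sigma

change = pd.Timestamp("2014-04-01")   # assumed year for the April 1 change
for name, stage in [("Before", df[df["Date"] < change]),
                    ("After", df[df["Date"] >= change])]:
    cl, lcl, ucl = i_chart_limits(stage["Time"])
    print(f"{name}: CL = {cl:.2f}, LCL = {lcl:.2f}, UCL = {ucl:.2f}")
```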

Winning a Super Bowl Grid Pool: Frequency of Score Combinations in the NFL


It has come to my attention recently that amidst the fun of attending Super Bowl parties and watching the 2nd-most viewed sporting event on earth, there are some people—seedy characters with questionable pasts, I'm sure—who are betting on the game! 

Now, as gambling on sporting events is highly regulated and illegal in almost every state, I'm confident that reports of this are overblown and that the fine, upstanding readers of this blog are not among those taking part in such an activity.

But if you happen to live in an area where such things are legal and you choose to participate, then you might be familiar with the kind of wager I want to write about today—the "Grid Pool."  Now the grid pool looks something like this before any entries are taken:

Blank Grid

Each participant pays a set amount for each square they would like, and once the grid is filled the two teams in the Super Bowl are randomly drawn to represent each axis, and the numbers 0-9 are randomly drawn to fill each row and column.  A completed bracket (in this case, of course, completely fictional) looks something like this:

Completed Grid

Because the teams and numbers are not entered until after the bracket is filled, participants have no control over which combinations of teams and numbers they get.

There are variations that use the score at the end of each quarter, but for simplicity we're going to use the jackpot—the score at the end of the game. The last digit on each team's score is used to find the pool winner...so in the above (fictional, I repeat, fictional) example, "Viv" would have won the pool as the final score was Ravens 34, 49ers 31, and "Viv" has the block corresponding to Ravens 4, 49ers 1:

Winning Grid

So, Nevada residents who have entered into one of these pools in a registered casino and intend on paying taxes on your winnings and complying with all other local, state, and federal laws, what combination of numbers should you be hoping for?

First off, since the teams are selected randomly there is no odds difference (initially) between squares that correspond to the combination of 1-4 and 4-1.  So I'm just going to "fold" the grid along the diagonal for simplicity.  I've pulled the results of every NFL game over the past five years—a total of 1,334 games—and compiled the number of times each combination has occurred. A cross-tabulation done in Minitab Statistical Software yields this output:

Tabulated Results

As an example, the combination 1-0 (which also includes 0-1) has occurred 28 times over the past five years.  Now tabulated results aren't the easiest to interpret so I'll show things graphically below, but you can tell (and statistics prove) that combinations occur with very different frequency—for example, 2-2 has only occurred once, while 3-0 has occurred 97 times.  If you happen to draw the 2-2 square, I hope you at least enjoy watching the game and commercials.
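For reference, here is a minimal sketch of how the "folding" and tally step could be done in Python; the data file and column names are assumptions, not the author's actual workflow:

```python
# Minimal sketch: count last-digit score combinations over a set of games,
# "folding" the grid so that, e.g., 1-0 and 0-1 count as the same combination.
# The file "nfl_scores_5yr.csv" and its columns are hypothetical.
from collections import Counter
import pandas as pd

games = pd.read_csv("nfl_scores_5yr.csv")    # columns: HomeScore, AwayScore

counts = Counter()
for home, away in zip(games["HomeScore"], games["AwayScore"]):
    combo = tuple(sorted((home % 10, away % 10)))   # fold along the diagonal
    counts[combo] += 1

for (a, b), n in counts.most_common(5):
    print(f"{a}-{b}: {n} games")
```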

Let's use a Pareto Chart to look at the most common combinations, grouping the least common 20% into "Other":

Pareto Chart of Combination

As you can see, there are three clear leaders: drawing the 3-0, 7-4, or 7-0 squares (or their 0-3, 4-7, or 0-7 counterparts) is highly desirable. Although these six squares account for only 6% of the total number of squares, they account for nearly 22% of final scores!

Now the analysis above assumes that you don't know the winner with much certainty (sure, there's a favorite, but only really heavy favorites are almost certain to win), so which team gets drawn on which axis is irrelevant.  But there are two situations where you may not want to employ "folding" the grid for odds:

  1. One team is an overwhelming favorite (like the Broncos playing the Jaguars).
  2. The game is already underway and one team is obviously going to easily defeat the other, but it's too early in the game to have any reasonable idea of where final scores might end up.

In these cases, rather than folding we're going to look at the combination of the winner's score and loser's score.  Here our graph looks something like this, with winner's score listed first:

Pareto Chart of WL Combination

Now we see one combination that is clearly superior: having the square corresponding to 3 for the obvious winner and 0 for the obvious loser is the best.  To be sure, a combination like Winner 0 - Loser 7 is still much better than Winner 5 - Loser 4.  But if I could pick—hypothetically speaking, of course—I'd take 3-0, just 1% of the grid accounting for 5% of the results!

A Statistical History of the Super Bowl


Super Bowl Sunday is right around the corner! But instead of trying to break down and predict the outcome of the game (which will likely come down to turnovers, which are impossible to predict), I’m going to look at some different statistics from previous Super Bowls. How many close games have there been? How much has the price for a 30-second ad increased over the years? Which state has hosted the Super Bowl the most times? I’ll look at them all!

The Actual Game

First, I’m going to look at different things that have happened in the game. Five of the last 6 Super Bowls have been decided by less than a touchdown. But in general, are most Super Bowls close games?

Pie Chart

It looks like we’ve been spoiled the last few years. Just under 63% of Super Bowls have had a final margin greater than a touchdown, and just under 15% have been within a field goal.

Overall, the NFC has won 5 more Super Bowls than the AFC. We can use our close-game groups to see whether the NFC has been lucky to gain this advantage (winning more close games, which are pretty random) or has earned it (more blowout wins).

Cross Tabulation

For games decided by two touchdowns or less, the AFC and NFC have won pretty much the same number of games. But the NFC has almost twice as many blowout wins. So the advantage held by the NFC in the Super Bowl is hardly because of luck. They’ve earned it by blowing out the AFC team almost twice as often as the AFC team blows them out.

We can use a bar chart to determine whether there was a certain time period when the NFC was dominant.

Bar Chart

From 1984 to 1996, the NFC won 13 straight Super Bowls, with 8 of them coming by more than 2 touchdowns. It was the era dominated by Joe Montana, Jerry Rice, Troy Aikman, and Emmitt Smith. Clearly the AFC had no answer for these NFC stars.

Vegas!

It wouldn’t be Super Bowl weekend if you didn’t hear about all the ridiculous prop bets in Las Vegas. It’s gotten so big in recent years that not only can I bet on the length of the national anthem, I was easily able to find the length of every Super Bowl national anthem since Super Bowl XL! (That's Super Bowl 40, for the roman-numeral impaired.)

But I’m going to stick to the most well-known gambling bet, the spread. I found the spread for every Super Bowl, so let’s see who has covered more often.

Tally

The favorite has covered the Vegas spread more often, winning 7 more times than the underdog. However, the underdog may yet get the last laugh. In the 19 games where they’ve covered the spread, they’ve won 13 of those games outright! And the average spread in those 13 games was 7.7 points. So if you’re in Vegas and thinking of taking the Seahawks on Sunday, it might be best to just bet on them to win instead of taking the points.

Now for the most interesting statistic, let’s look at every double digit Super Bowl underdog. There have been a total of 14 of them. Here is the summary of what happened in those games.

Tally

Half of the time, the favorite has covered the spread. That’s exactly what we would expect. But look at the underdogs. They’ve covered 6 times, but 5 of them have won the game outright! (Only the Pittsburgh Steelers in Super Bowl XXX covered the spread but lost the game.) And 3 of those games are the most famous Super Bowls ever:

  • Joe Namath guarantees victory, then backs up his words as the New York Jets upset the Baltimore Colts, who were 18 point favorites.
  • Little-known quarterback Tom Brady leads the New England Patriots to a shocking upset of the St. Louis “Greatest Show on Turf” Rams, who were 14-point favorites.
  • Now very well known, Tom Brady and the Patriots are undefeated and 12-point Super Bowl favorites, but are unable to finish the season perfect as they fall to the New York Giants and David Tyree's helmet.

So remember, next time there is a double-digit favorite, it may be in your best interest to take the underdog to win!

Television

Watching the commercials has become almost as big a part of Super Bowl Sunday as the actual game. In fact, if you took a poll of random Americans asking them what they are most excited for on Super Bowl Sunday, I think your results would look like this.

  1. The Commercials
  2. The Puppy Bowl
  3. The Actual Game

Non-scientific polls aside, let’s see which TV Network has televised the most Super Bowls.

Pie Chart

CBS and NBC hold a big advantage over ABC and FOX. In fact, did you know that NBC and CBS both televised Super Bowl I? It’s true! Although CBS must have had more people watch them, because they charged $5,000 more for a 30-second commercial than NBC did. Speaking of which, let’s look at how the price of a 30-second ad has exploded over the last 48 years.

Time Series Plot

It took 28 years for the price of a 30-second advertisement to reach $1,000,000. Since then it’s increased by another $3,000,000 in 19 years! If the price keeps climbing at this rate, what will the price be in 2020? What about 2030!? We can use Minitab’s trend analysis to find out.

Trend Analysis


Forecasts

Year     Forecast
2015     $3,991,981
2016     $4,176,657
2017     $4,365,570
2018     $4,558,719
2019     $4,756,105
2020     $4,957,728
2021     $5,163,587
2022     $5,373,683
2023     $5,588,015
2024     $5,806,584
2025     $6,029,390
2026     $6,256,432
2027     $6,487,711
2028     $6,723,226
2029     $6,962,978
2030     $7,206,967

At the current rate, by the year 2020 a 30-second commercial will cost about 5 million dollars, rising to about 7.2 million dollars by 2030. That may sound expensive, but considering the Super Bowl is the only television broadcast in the world where people actually want to watch the commercials, it may very well be worth it!
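If you'd like to reproduce a simple version of this forecast outside Minitab, a minimal linear-trend sketch (the data file and column names are assumptions) might look like this:

```python
# Minimal sketch: fit a straight-line trend to ad prices and forecast future
# years. This is a plain least-squares line, not Minitab's trend analysis.
# The file "superbowl_ad_prices.csv" and its columns are hypothetical.
import numpy as np
import pandas as pd

ads = pd.read_csv("superbowl_ad_prices.csv")       # columns: Year, Price
slope, intercept = np.polyfit(ads["Year"], ads["Price"], 1)

for year in (2020, 2030):
    print(f"{year}: ${intercept + slope * year:,.0f}")
```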

Location! Location! Location!

This hasn’t really been covered much in the last two weeks, but did you know that the Super Bowl is in New Jersey/New York this year? And that it’s cold there? Oh the humanity! So why are media types complaining so much about the cold weather? Well, this could be the reason:

Pie Chart

Florida, California, and Louisiana dominate the locations at which the Super Bowl was played. Lots of sunshine in those states. And yes, there have been some Super Bowls in cold-weather states, such as Minnesota and Michigan. But the biggest difference is that those games were played in domes. This will be the first Super Bowl played outside in a cold-weather city.

I wrote earlier about how a big part of the reason Peyton Manning’s cold-weather statistics look bad (especially when compared to Tom Brady’s) is that most of Manning’s cold-weather games have been on the road, while most of Brady’s have been at home. But this is uncharted territory for any quarterback. A cold-weather game at a neutral site! I don't think that has ever happened before in the history of the NFL. It’s going to be fascinating to see how it plays out.

And that’s all the Super Bowl statistics I have. Hopefully these fun facts will give you something to impress your friends and family with during the game Sunday.

Just make sure you’re not talking during the commercials!

Applying Six Sigma to a Small Operation


Using data analysis and statistics to improve business quality has a long history. But it often seems like most of that history involves huge operations. After all, Six Sigma originated with Motorola, and it spread to thousands of other businesses after it was adopted by a little-known outfit called General Electric.

There are many case studies and examples of how big companies used Six Sigma methods to save millions of dollars, slash expenses, and improve quality...but when they read about the big dogs getting those kinds of results, a lot of folks hear a little voice in their heads saying, "Sure, but could it work in my small business?"  

Can Six Sigma Help a Small Business? 

That's why I was so intrigued to find this article published in the TQM Journal in 2012: it shows exactly how Six Sigma methods can be used to benefit a small manufacturing business. The authors of this paper profile a small manufacturing company in India that was plagued with declining productivity. This operation made bicycle chains using plates, pins, bushings, and rollers.

The bushings, which need to be between 5.23 and 5.27 mm, had a very high rejection rate. Variation in the diameter caused rejection rates of 8 percent, so the company applied Six Sigma methods to reduce defects in the bushing manufacturing process.

The company used the DMAIC methodology--which divides a project into Define, Measure, Analyze, Improve, and Control phases--to attack the problem. Each step the authors describe in their process can be performed using Minitab Statistical Software and Quality Companion, our collection of "soft tools" for quality projects.

The Define Phase

The Define phase is self-explanatory: you investigate and specify the problem, and detail the requirements that are not being met. In the Define phase, the project team created a process map (reproduced below in Quality Companion) and a SIPOC (Supplier, Input, Process, Output, Customer) diagram for the bushing manufacturing process.

Process Map Created in Quality Companion by Minitab

The Measure Phase

In the Measure phase, you gather data about the process. This isn't always as straightforward as it seems, though. First, you need to make sure you can trust your data by conducting a measurement system analysis.

The team in this case study did Gage repeatability and reproducibility (Gage R&R) studies to confirm that their measurement system produced accurate and reliable data. This is a critical step, but it needn't be long and involved: the chain manufacturer's study involved two operators, who took two readings apiece on 10 sample bushings with a micrometer. The 40 data points they generated were sufficient to confirm the micrometer's accuracy and consistency, so they moved on to gathering data about the chain-making process itself.

The Analyze Phase

The team then applied a variety of data analysis tools, using Minitab Statistical Software. First they conducted a process capability analysis, taking 20 subgroups of 5 samples each, produced under similar circumstances. The graph shown below uses simulated data with extremely similar, though not completely identical, results to those shown in the TQM Journal article.

process capability curve

One of the key items to look at here is the PPM Total, which equates to the commonly heard DPMO, or defects per million opportunities. In this case, the DPMO is nearly 80,000, or 8 percent.

Another measure of process capability is the Z.bench score, which reports the process's sigma capability. In general terms, a 6-sigma process is one that has 3.4 defects per million opportunities. Adding the conventional 1.5 Z-shift, this appears to be about a 3-sigma process, or a little over 66,000 defects per million opportunities.
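To see how a defect rate relates to a Z.bench score and a shifted sigma level, here is a minimal sketch (a standard normal-quantile calculation, not Minitab's capability engine):

```python
# Minimal sketch: convert a defect rate (PPM / DPMO) into a Z.bench score and
# a "sigma level" with the conventional 1.5-sigma shift added.
from scipy.stats import norm

ppm = 80_000                          # defects per million opportunities
p_defect = ppm / 1_000_000
z_bench = norm.ppf(1 - p_defect)      # normal quantile with that tail area
sigma_level = z_bench + 1.5           # conventional long-term shift

print(f"Z.bench = {z_bench:.2f}, sigma level ~ {sigma_level:.2f}")
```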

Clearly, there's a lot of room for improvement, and this preliminary analysis gives the team a measure against which to assess improvements they make to the process. 

At this point, the project team looked carefully at the process to identify possible causes for rejecting bushings. They drew a fishbone diagram that helped them identify four potential factors to analyze: whether the operator was skilled or unskilled, how long rods were used (15 or 25 hours), how frequently the curl tool was reground (after 20 or 30 hours), and whether the rod-holding mechanism was new or old. 

The team then used Minitab Statistical Software to do 2-sample t-tests on each of these factors. For each factor they studied, they collected 50 samples under each condition. For instance, they looked at 50 bushings made by skilled operators, and 50 made by unskilled operators. They also looked at 50 bushings made with rods that were replaced after 15 hours, and 50 made with rods replaced after 25 hours.

The t-tests revealed whether or not there was a statistically significant difference between the two conditions for each factor; if no significant difference existed, team members could conclude that factor didn't have a large impact on bushing rejection.

This team's hypothesis tests indicated that operator skill level and curl-tool regrinding did not have a significant effect on bushing rejection; however, 15-hour vs. 25-hour rod replacement and new vs. old rod-holding mechanisms did. Thus, a fairly simple analysis helped them identify which factors they should focus their improvement efforts on.
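As an illustration of the kind of test the team ran, here is a minimal 2-sample t-test sketch in Python; the simulated diameters stand in for the team's 50 measurements per condition:

```python
# Minimal sketch: a 2-sample t-test comparing bushing diameters under two
# conditions (rods replaced after 15 vs. 25 hours). The data are simulated
# placeholders, not the team's actual measurements.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
rod_15h = rng.normal(5.250, 0.010, 50)   # simulated diameters in mm
rod_25h = rng.normal(5.258, 0.010, 50)

t_stat, p_value = stats.ttest_ind(rod_15h, rod_25h, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A p-value below 0.05 suggests the factor has a significant effect.
```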

In my next post, I'll review how the team used Minitab to apply what they learned in the Define, Measure, and Analyze phases of their project to the final two phases, Improve and Control, and the benefits they saw from the project.

Merge Data as You Open It


Did you know about the Minitab Network group on LinkedIn? It’s the one managed by Eston Martz, of Minitab blog fame. I like to see what the members are talking about, which recently got me into some discussions about Raman spectroscopy data.

An incredibly fine 5-carat emerald crystal that has it all: bright grass-green color, glassy luster, a fine termination, and most of all, TOP gemminess.

Not having much experience with Raman spectroscopy data, I thought I’d learn more about it and found the RRUFF™ Project.

The idea is that if you have a Raman device, you can analyze a mineral sample and compare your results to information in the database so that you can identify your mineral. Even though I don't have a Raman device, the site is still exciting to me because all of the RRUFF™ data are available in ZIP files that you can download and use to illustrate some neat things in Minitab.

So let’s say that you download one of the ZIP files from the RRUFF™ Project. The ZIP file contains a few thousand text files with intensity data for different minerals. Some minerals have a small number of files. Some minerals, like beryl, have many files.

Turns out beryl’s pretty cool. In its pure form, it’s colorless, but it comes in a variety of colors. In the presence of different ions, beryl can be aquamarine, maxixe, goshenite, heliodor, and emerald.

I extracted just the beryl files into a folder on my computer. Now, I want to analyze the files in Minitab. If I open the worksheet in Minitab without any adjustments, I get something like this:

This worksheet puts sample identification information with the measurements, so you can't analyze the data.

While I could certainly rearrange this with formulas, I need only a few steps to open the file ready to analyze.

  1. Choose File > Open Worksheet.
  2. Select the text file.
  3. Click Preview.
  4. Scroll down so that you can see the first row of numbers, in this case, row 13.
  5. Click OK to close the preview.
  6. Click Options.
  7. In First Row of Data, select Use row. Enter the row that had the data in the preview. In this case, 13.

Now you’ve solved the problem of the identifying information about the mineral being mixed in with the measurements in the worksheet. The other problem is that Minitab places all of the data in a single column unless you tell it how to divide the data. If you click OK in Options and click Preview again, you can see the problem. Finish the Options with these steps:

  1. In Single character separator, select Comma.
  2. Click OK twice.

Now your data are in a nice, analyzable format. But remember that there are more than 30 files with data on beryl. To analyze them together in Minitab, the data need to be in the same worksheet.

Minitab has two nice features that make this easy to do.

  • If the file organization stays the same, then you don’t have to respecify the first row of data or the single character separator. Minitab remembers the settings.
  • Minitab gives you the option to merge the worksheet that you’re opening into the active worksheet.

Try these steps:

  1. Choose File > Open Worksheet. (Minitab even remembers that you want to open a text file.)
  2. Select the next text file.
  3. Select Merge.
  4. Click OK.

    You can choose to merge a worksheet with the active worksheet instead of opening a new worksheet.

The new data appear in the worksheet alongside the first data. You can repeat these steps with the rest of the files to get all of the data in a single worksheet.

The new worksheet contains the data from both text files.

The options that Minitab provides for opening and merging data sources make it easy to get a wide variety of data ready for analysis. The data features are a good complement to the easy graphs and analyses that you can do in Minitab.
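For comparison, here is a minimal sketch of the same workflow in Python; the folder name, the 12 skipped header lines, and the column names are assumptions based on the steps above, not the RRUFF file specification:

```python
# Minimal sketch: read each beryl text file, skip the header rows above the
# numeric data (row 13 in the example above), and stack the files into one
# table. The folder and column names are hypothetical.
from pathlib import Path
import pandas as pd

frames = []
for path in sorted(Path("beryl_files").glob("*.txt")):
    df = pd.read_csv(path, skiprows=12, sep=",",
                     names=["Wavenumber", "Intensity"])
    df["Sample"] = path.stem      # record which file each row came from
    frames.append(df)

all_beryl = pd.concat(frames, ignore_index=True)
print(all_beryl.head())
```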

The image of the emerald is by Rob Lavinsky and is licensed under this Creative Commons License.

Lessons in Quality During a Long and Strange Journey Home


I didn’t expect that our family trip to Florida would end with me driving a plane load of passengers nearly 200 miles to their homes, but it did.

Yes, it was a long and strange journey home. A journey that started in the tropical warmth of southern Florida and ended the next morning in central Pennsylvania, which felt like the arctic wastelands thanks to the dreaded polar vortex.

During this journey, I didn’t just experience temperature extremes, but also extremely different levels in the quality of customer care. Working at Minitab, I'm very aware of the quality of service because quality improvement is our key mission!

Setting the Stage

In this story, there are some factors that the airline understandably can’t control, like the odd mechanical problem and the fact that the nation's air traffic system was backed up due to the polar vortex gripping the region in arctic conditions. However, there are other factors which the airline can control, such as its policies, training, and the flow of information through the organization.

As you’ll see, the moral of the story is that when the airline doesn’t effectively handle the controllable factors, it can make an unexpected situation even worse.

My family’s return to our home in central Pennsylvania started normally enough. Our flight from Miami to Philadelphia left a little late, but early enough to catch our connecting flight. The problems started in Philadelphia when our flight to State College was repeatedly delayed and then finally cancelled around 11 p.m. due to mechanical reasons.

The gate manager, who turned out to be the one well-trained airline employee we worked with, clearly laid out the situation. Normally the airline would put us up in a hotel because the cancellation was due to a mechanical problem. However, the block of rooms at the airline's $100 rate was sold out. Further, because of numerous weather-related cancellations, the next available flight was in two days. So, she was going to arrange for a bus to drive us to State College that evening.

Great! It wasn’t exactly how we pictured getting home, but it would suffice and the reasons were logical. Unfortunately, this point is when things started to get unpleasant.

Where Things Go Wrong

There were 14 of us scheduled for this commuter flight into State College, and we hung around the gate waiting to hear about our bus. The gate manager had told us to wait by the gate because we’d be leaving soon.

Instead, we waited and waited, not hearing any news at all. All of the airline personnel had quickly gone missing. One gate agent (not the previous manager) came by periodically for other reasons, but she didn’t know anything about anything. I started to think of her as Ms. Shrugs-A-Lot due to the gesture that always accompanied her non-answers. She didn’t even know if they were still trying to book a bus or not!

After much more waiting, Shrugs-A-Lot came by and, after more non-answers, flagged down the point of contact (POC) manager who happened to be walking through the terminal. According to Shrugs-A-Lot, the POC manager was the lady who was personally booking our bus and could answer our questions. Great!

We asked if she was still trying to secure a bus for us, or if that wasn’t going to work out. We wanted to know if it was worth sitting around in an increasingly deserted airport. She scowled at us and left, literally without saying one single word. Even Shrugs-A-Lot was speechless!

Time to Bail?

At this point, we don’t know if the airline is truly getting a bus for us. Airline personnel are vanishing because the airport has officially closed. The airline had previously stated that it wouldn’t cover hotels for us and the earliest possibility for a seat on a flight was in two days. It felt increasingly likely that, given the lack of information and personnel, the default outcome would be that we’d spend the night in the airport and still not be able to leave for a couple of days. Very frustrating!

We eventually ran across that first, knowledgeable gate manager who had proposed the bus plan. Almost in tears, she admitted that the POC manager, who was responsible for acquiring the bus, wasn’t even communicating with her!

So, with procedures and communications failing, it seemed like a good time to bail. Looking online, I found a 15-passenger van that I could rent one-way. That was fortuitous, because there were 14 of us!

I rented the van, did some quick calculations, and announced that I would drive people back home for $40 a head. Before I knew it, my fellow passengers were thrusting cash into my hands! Apparently, that was a small price to pay in order to arrive home in 3.5 hours versus total uncertainty.

And, that’s how I ended up driving a plane load of passengers home! Our family arrived home at about 6 a.m. after dropping off the other passengers.

A Totally Different Experience

Our experience at the car rental company was completely the opposite of our experience with the airline. On their web site, I quickly found the van that I could rent one-way. It was crucial that the information was accurate because the airport was closed and we would not be able to get back in if we wanted to. The van had to be truly available, right then at 2 a.m., and for a one-way rental, which can be hard to find.

Perhaps being distrustful due to my recent experience with the airline, I didn’t reserve online but instead called the 800 number to be sure. The representative assured me that not only was the van available, but it was ready to go right now! Imagine that, the car rental website and toll-free number both had consistent information, which turned out to be entirely accurate. He even knew that the shuttle was still running even though the airport was closed.

At the car rental place, the staff was friendly, knowledgeable and communicated very well both with us and each other. And, this was at about 2 a.m., after an extra-hectic day for them!

Because it was a large van, there were special requirements and forms. The representative had them down pat and we got through them quickly and efficiently. While doing this, a second representative brought out a box of bottled water for us, and a third representative pulled the van up to the door and let it warm up. That was especially appreciated because the wind chill was down to -30F (-34C)! These nice touches, along with their jovial attitude, really made for a positive experience.

The High Quality Difference

During this travel fiasco, I saw firsthand how high-quality training and customer care made all the difference for us and the staff during a stressful time.

The airline had insufficient resources in the airport. The customer service desk had a lone agent with a huge line. At first the airline claimed that there were no hotel rooms, but later admitted there were rooms but not below the $100 limit. There seemed to be no flexibility in terms of reconciling the policy to put us in a hotel and the policy of not spending more than $100 per room. Airline personnel weren’t on the same page and communications broke down between the POC manager and the knowledgeable gate manager. Information just didn’t flow.

After our experience with the airline, we felt distrustful and abandoned. And the airline personnel were also stressed and even angry with each other! The gate manager wanted to provide good service, but the system failed her.

In the following days, the airline continued the game by trying to change the cancellation from being due to a mechanical problem to a weather delay! They also said we shouldn’t have “abandoned” our luggage in Philadelphia when we decided to drive home. At that point, the whole airport, including the baggage claim, was closed!

In the end, the airline apologized, gave us our luggage, refunded our last flight segment and gave us vouchers. That helped, but it didn’t have to be that difficult.

Compare that to the car rental company that had consistent and accurate information flowing through the system. The representatives were cheerful, knew what they had to do, and worked as a team. We were happy and they were happy.

The airline should assess and correct its quality of customer service. To do this, they can use Lean and Six Sigma's collection of tools for collecting data and clearly understanding situations. For example, tools such as FMEA and SIPOC can help the airline identify problems, risks, failure modes, causes, and factor in the voice of the customer. Conveniently, Minitab makes a process improvement software package called Quality Companion that makes these tools and many more available in a single application.


Applying Six Sigma to a Small Operation, Part 2


In my previous post, I shared a case study of how a small bicycle-chain manufacturing company in India used the DMAIC approach to Six Sigma to reverse declining productivity.

After completing the Define, Measure, and Analyze phases, the team had identified the important factors in the bushing creation process. Armed with this knowledge, they were now ready to make some improvements.

The Improve Phase

In the Improve phase, the team applied a statistical method called Design of Experiments (DOE) to optimize the important factors they'd identified in the initial phases.

Most of us learn in school that to study the effects of a factor on a response, you hold all other factors constant and change the one you're interested in. But DOE lets you change more than a single variable at a time. This minimizes the number of experimental runs necessary to get meaningful results, so you can reach conclusions about multiple factors efficiently and cost-effectively.

DOE has a reputation for being difficult, but statistical software makes it very accessible. In Minitab, you just select Stat > DOE > Create Factorial Design..., select the number of factors you want to study, then choose from available designs based on your time and budget constraints.

In this case, the project team used Minitab to design a 2x2 experiment, one that had two levels for each of the two factors under examination. They did two replicates of the experiment, for a total of eight runs. The experimental design and the measured diameter (the response) for each run are shown in the worksheet below:

DOE worksheet

Once they'd collected the data, the team used Minitab to create plots of the main effects of both factors.

main effects plot for diameter

The slope of each line on the main effects plot indicates how large an effect the factor has on the response: the steeper the slope, the greater the impact. The main effects plot above indicates that replacing the rod at 15 hours has a minor effect, while using a new rod-holding mechanism has a greater effect.

The team also created an interaction plot that showed how both factors worked together on the response variable:

Interaction plot for Diameter

Parallel lines on an interaction plot indicate that no interaction between factors is present. Since the lines in this plot intersect, there is an interaction. As the research team put it in their paper, this means "the change in the response mean from the low to the high level of rod replacement depends on the level of rod-holding mechanism."

These analyses enabled the team to identify the important factors in creating bushings that fit inside the required limits, and indicated where they could adjust those factors to improve the manufacturing process.
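To give a sense of how such a 2x2 factorial with replicates can be analyzed in code, here is a minimal sketch; the coded levels and response values below are placeholders, not the team's data:

```python
# Minimal sketch: fit a model with both main effects and their interaction to
# a 2x2 factorial run in duplicate. Factor levels are coded -1/+1, and the
# diameter values are made-up placeholders.
import pandas as pd
import statsmodels.formula.api as smf

doe = pd.DataFrame({
    "RodReplace":  [-1, -1,  1,  1, -1, -1,  1,  1],   # 15 h vs. 25 h
    "HoldingMech": [-1,  1, -1,  1, -1,  1, -1,  1],   # old vs. new mechanism
    "Diameter":    [5.262, 5.248, 5.258, 5.251,
                    5.263, 5.247, 5.256, 5.252],
})

model = smf.ols("Diameter ~ RodReplace * HoldingMech", data=doe).fit()
print(model.params)   # with -1/+1 coding, each effect is twice its coefficient
```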

The Control Phase

Once the team's recommended improvements had been implemented, it was time to gather data about the new process and assess whether it had made a difference in the bushing rejection rate.

The team again collected 20 subgroups of 5 samples each (n=100) from bushings created using the improved process.  (Once again, we have used simulated data that match the parameters of the team's actual data, so the results are extremely similar but not completely identical to those shown in the original report.) The results of the capability analysis are shown below:

Capability Analysis of New Process

The PPM -- the number of defects per million opportunities -- fell to 0.02, while the Z.Bench or Sigma capability score reached 5.52. That's a tremendous improvement over the original process's 8% rejection rate and 1.4 Z.bench score!

That's not quite the end of the story, though: the Control phase doesn't really end, because the owner of the process that's been improved needs to ensure that the improvements are sustained. To do this, the organization used X-bar R control charts to ensure that the improved process remained on track. 

Six Sigma Project Results

So, did this project have a positive impact on the bottom line of this small manufacturing enterprise? You bet. Implementing the team's recommendations brought the process up to roughly a 5.5-sigma level and reduced the monthly bushing rejection rate by more than 80,000 PPM.

That worked out to a cost savings of about $120,000 per year. For a business of any size, that's a significant result. 

Visit our case studies page for more examples of how different types of organizations have benefited from quality improvement projects that involve data analysis.

Busting a Myth about the NHL and Olympic Games


In the sports world, it is generally accepted that the NHL players who participate in the Olympics (approximately 20%) put their NHL team at a disadvantage for the remainder of the season.

The NHL season does stop during the Olympic Games, but the thought is that the best NHL players leave their team to play the extra games, which will tire them out for the remainder of the NHL regular season and playoffs.

But do the data really support that conventional wisdom?

Relationship Between NHL Performance and Olympic Selection

First, I want to mention there is a relationship between the number of players selected from a team for the Olympics and the average number of points per game for that team (2 points are awarded for an NHL win, 1 point for an overtime or shootout loss, and 0 points for a regulation loss).

In general, good teams lose more players to the Olympics than bad teams.

I created the following scatterplot in Minitab Statistical Software using data from the 2010 Winter Olympic games.  (If you'd like to follow along, please download the data set and our 30-day trial version of Minitab if you don't already have it.) 

Team Points Per Game vs No. of Olympic Players

As you can see, there is a relationship between the team's performance and the number of players selected for the Olympics.

Do the Olympics "Tire Out" the Best NHL Players?

However, do the teams that lose more players to the Olympics have a worse post-Olympics points-per-game average?

Let's find out using Minitab. Choose Graph > Scatterplot and press the button for "With Regression."

scatterplot dialog

In the dialog box, choose "Improvement Ratio" as the Y variable, and Number of Olympic Players as the X variable. The improvement ratio in the graph is:

 (Team Points Per Game After Olympics) / (Team Points Per Game Before Olympics).

Then use the Labels options to choose the column "Team" to label each data point in the scatterplot.

Scatterplot Labels Options

Press OK, and Minitab creates the scatterplot below.

Improvement Ratio vs Number of Olympic Players

As you can see, based on the 2010 data displayed in the graph, the number of players a team loses to the Olympics does not appear to have much impact on the points per game for the remainder of the regular season.  The blue regression line is almost flat, and although it's not shown on this graph, the p-value for this regression is 0.885, far over the usual 0.05 cutoff for significance.

What About the Playoffs?

Let’s look beyond the regular season. Do teams that have more players participating in the Olympics perform worse in the playoffs?

Based on the 16 teams in 2010 that made the NHL playoffs, it does not appear there is much impact. I used Minitab to do a regression analysis that models the number of playoffs wins as a function of a team’s regular season points per game and the number of Olympic players. The number of Olympic players was not significant (p-value = 0.500).

regression analysis output

A scatterplot of this analysis lets us easily see the results.

Regression of Olympics vs NHL Playoffs

Admittedly, with only 16 teams, there is not much data to detect a trend.

Interestingly, since the coefficient for the number of Olympic players is positive (0.59834), these data suggest that teams with more Olympic players might actually do better in the playoffs.
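Here is a minimal sketch of that regression in Python for readers who want to try it themselves; the data file and column names are assumptions:

```python
# Minimal sketch: model playoff wins as a function of regular-season points
# per game and number of Olympic players, as described above.
# The file "nhl_playoffs_2010.csv" and its column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

playoffs = pd.read_csv("nhl_playoffs_2010.csv")
model = smf.ols("PlayoffWins ~ PointsPerGame + OlympicPlayers",
                data=playoffs).fit()
print(model.summary())   # check the p-value for OlympicPlayers
```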

A follow-up study could look at how individual players who participate in the Olympics perform before and after the Olympics. Stay tuned for a blog post on that in the near future!

 

Image of U.S. goaltender Tim Thomas by RicLaf, used under Creative Commons 2.0:  http://flickr.com/photo/71935277@N00/2460950100

 

Are Atlanta's Winters Getting Colder and Snowier?


Atlanta was a mess on January 28th, 2014. Thousands were trapped on the roads overnight, while others managed to get to roadside stores to camp out. Thousands of students were forced to spend the night in their schools, and the National Guard was called in to get them home. Many wondered how less than three inches of snow could cripple the city, particularly when Atlanta had experienced a similar storm in 2011.

This traumatic event, the recollection of recent snow storms, and now the current storm prompted some to wonder whether Atlanta has been experiencing more cold and snow than before. How unusual was that January day?

To answer this question, I'll use data from NOAA’s National Climatic Data Center. This long-term data set allows us to put the recent snowy patch into a historical perspective. You can get the NOAA data in this Minitab project file.

The data are from Atlanta's Hartsfield International Airport and range from 1930 to the present. That’s 84 winters! I’ll use Minitab’s time series plots to determine whether the temperatures and snowfall have changed over time.

Low Temperatures in Atlanta

Let's take a look at the temperatures and see how the current winter compares to previous winters. I'll look at both the average low temperature and the extreme low temperature for all months in Atlanta. The green circles represent January 2014.

Average low monthly temperatures in Atlanta

In the time series plot above, you can see that while January 2014 is on the low end of the regular distribution of temperature data, it's not an unprecedented value. In fact, you only have to go back three years to find a month that is within several degrees. The average monthly lows have not gotten lower over the decades, but have been stable over time.

The extreme low temperature time series plot below tells a slightly different story.


Historical extreme low monthly temperatures for Atlanta

While the average lows follow a consistent pattern since 1930, the extreme lows are slightly different. It appears that up to January 1985 (the purple circle), there were a number of months that had more extreme cold than any month after 1985. In other words, the monthly averages are stable, but Atlanta has avoided the most extreme cold for nearly 30 years.

Snowfall in Atlanta

Total monthly snowfall in Atlanta

This time series plot shows Atlanta's total monthly snowfall. January 2014 doesn't stand out as an unusual snow month. In fact, you can see three recent months that had more snow! There have been plenty of far worse snow storms. January 1940 had three times the snowfall, with 8.3 inches! There is no increasing trend in snowfall.

Why We Need Data

The weather data show how imperfect memories can be. It turns out that Atlanta's winters are no colder or snowier now than they were decades ago. In fact, 1930-1985 saw more extreme cold temperatures, while the averages stayed constant. January's snowfall was not unusual.

Human memory is too imperfect to reliably assess long-term trends. Memories fade, change, and can be selectively retained or forgotten. On top of that, not everyone has experienced Atlanta weather long enough to know what’s normal.

Analyzing trustworthy data gives you the ability to avoid subjective assessment based on imperfect memories. Minitab statistical software and data provide an unbeatable combination that can answer almost any question.

 

Say "I Love You" with Data on Valentine's Day


When we think about jobs with a romantic edge to them, most of us probably think of professions that involve action or danger. Spies, soldiers, cops, criminals -- these are the types of professions romantic leads have. Along with your occasional musician, reporter, or artist, who don't have the action but at least bring drama.

But you know who never shows up as a romantic lead?  Quality improvement professionals, that's who.  Can you name just one movie that features a dedicated data analyst or quality practitioner as the love interest...just one?  No, you can't.  Doesn't exist.

Love of Quality: The Greatest Love of All?

I guess screenwriters think statisticians and people in the quality industry have no love lives at all, but those of us who work in the sector know the passion and romance involved in optimizing a process, and the beauty inherent in a control chart free of special-cause variation.

Since it's Valentine's Day tomorrow, here's a fun little diversion that lets you share your love with a little data. Grab this data set, open it in Minitab Statistical Software, and select  Graph > Scatterplot.  Click the option for Simple scatterplot, and select "Passion" as your Y variable, and "Devotion" as your X variable.

Then send the resulting scatterplot to your sweetie:

Valentine's Day Scatterplot

You can probably expect to receive an e-mail or phone call from the recipient, asking you just what this data is supposed to mean. 

Explain that you thought the pattern in the data was clear, but you'll send them a revised graph that draws the connections.  Then send 'em a second graph of the data, which you've adjusted with Minitab's graph editing tools to connect the dots strategically:

Be Mine scatterplot

If you're already a Minitab user, you probably know that these graphs are very easy to customize, so you can tailor the graph just the way you -- or your beloved -- like it.  For instance, if you know she's crazy about script fonts and the color pink, something like this might work:

pink be mine

Of course, you would do well to celebrate your love in other ways, too...flowers or dinner, for instance. 

A few years ago my colleague Carly came up with this scatterplot.  The data for this heart is included in the data set linked above, if you prefer this more streamlined approach:

Minitab Scatterplot
 

And she even threw in a time-series plot of her heartbeat -- now that's romantic! 

Minitab Time Series Plot

If you've never edited Minitab graphs before, it's easy. To change the colors and fonts of your graphs, just double-click the graph attributes you'd like to edit. Clicking Custom on the various tabs lets you customize fill patterns and colors, borders and fill line colors, etc.

You can also change the default color and font styles Minitab uses in the Tools menu:

1. Select Tools > Options. Click Graphics and the + sign to see more options:



2. Click Regions, then choose the graph elements to customize.

3. Change the font used in your graph labels using Frame Elements (also under Graphics).

Happy Valentine’s Day!
 

R-Squared: Sometimes, a Square is just a Square


If you regularly perform regression analysis, you know that R2 is a statistic used to evaluate the fit of your model. You may even know the standard definition of R2: the percentage of variation in the response that is explained by the model.

Fair enough. With Minitab Statistical Software doing all the heavy lifting to calculate your R2 values, that may be all you ever need to know.

But if you’re like me, you like to crack things open to see what’s inside. Understanding the essential nature of a statistic helps you demystify it and interpret it more accurately.

R-squared: Where Geometry Meets Statistics

So where does this mysterious R-squared value come from? To find the formula in Minitab, choose Help > Methods and Formulas. Click General statistics > Regression > Regression > R-sq.

R-sq formula

Some spooky, wacky-looking symbols in there. Statisticians use those to make your knees knock together.

But all the formula really says is: "R-squared is a bunch of squares added together, divided by another bunch of squares added together, subtracted from 1."

rsquare annotation
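In symbols, with "SS" standing for "sum of squares," that sentence is simply:

$$R^2 = 1 - \frac{SS_{\text{Error}}}{SS_{\text{Total}}}$$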

What bunch of squares, you ask?

square dance guys

No, not them.

SS Total: Total Sum of Squares

First consider the "bunch of squares" on the bottom of the fraction. Suppose your data is shown on the scatterplot below:

original data

(Only 4 data values are shown to keep the example simple. Hopefully you have more data than this for your actual regression analysis!)

Now suppose you add a line to show the mean (average) of all your data points:

scatterplot with line

The line y = mean of Y is sometimes referred to as the “trivial model” because it doesn’t contain any predictor (X) variables, just a constant. How well would this line model your data points?

One way to quantify this is to measure the vertical distance from the line to each data point. That tells you how much the line “misses” each data point. This distance can be used to construct the sides of a square on each data point.

pinksquares

If you add up the pink areas of all those squares for all your data points you get the total sum of squares (SS Total), the bottom of the fraction.

SS Total

SS Error: Error Sum of Squares

Now consider the model you obtain using regression analysis.

regression model

Again, quantify the "errors" of this model by measuring the vertical distance of each data value from the regression line and squaring it.

ss error graph

If you add the green areas of these squares you get the SS Error, the top of the fraction.

ss error formula

So R² basically just compares the errors of your regression model to the errors you’d have if you just used the mean of Y to model your data.

R-Squared for Visual Thinkers

 

rsquare final

The smaller the errors in your regression model (the green squares) in relation to the errors in the model based on only the mean (pink squares), the closer the fraction is to 0, and the closer R2 is to 1 (100%).

That’s the case shown here. The green squares are much smaller than the pink squares. So the R² for the regression line is 91.4%.

But if the errors in your regression model are about the same size as the errors in the trivial model that uses only the mean, the areas of the pink squares and the green squares will be similar, making the fraction close to 1, and the R² close to 0.

That means that your model isn't producing a "tight fit" for your data, generally speaking. You’re getting about the same size errors you’d get if you simply used the mean to describe all your data points! 
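If it helps to see those two bunches of squares in code, here is a tiny sketch in Python. The four data points are made up for illustration; they are not the values plotted above.

```python
import numpy as np

# Four made-up (x, y) points -- purely illustrative
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# Trivial model: predict the mean of y for every point (the pink squares)
ss_total = ((y - y.mean()) ** 2).sum()

# Regression model: ordinary least-squares line (the green squares)
slope, intercept = np.polyfit(x, y, 1)
ss_error = ((y - (slope * x + intercept)) ** 2).sum()

r_squared = 1 - ss_error / ss_total
print(f"R-squared: {r_squared:.3f}")
```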

R-squared in Practice

Now you know exactly what R² is. People have different opinions about how critical the R-squared value is in regression analysis.  My view?  No single statistic ever tells the whole story about your data. But that doesn't invalidate the statistic. It's always a good idea to evaluate your data using a variety of statistics. Then interpret the composite results based on the context and objectives of your specific application. If you understand how a statistic is actually calculated, you'll better understand its strengths and limitations.

Related link

Want to see how another commonly used analysis, the t-test, really works? Read this post to learn how the t-test measures the "signal" to the "noise" in your data.

(We Just Got Rid of) Three Reasons to Fear Data Analysis

Today our company is introducing Minitab 17 Statistical Software, the newest version of the leading software used for quality improvement and statistics education. 
 
Fear of Data Analysis
So, why should you care? Because important people in your life -- your co-workers, your students, your kids, your boss, maybe even you -- are afraid to analyze data. 
 
There's no shame in that. In fact, there are pretty good reasons for people to feel some trepidation (or even outright panic) at the prospect of making sense of a set of data.

I know how it feels to be intimidated by statistics. Not long ago, I would do almost anything to avoid analyzing data. I wanted to know what the data said -- I just didn't believe I was capable of analyzing it myself. 

So to celebrate the release of our new software, I'm going to share my three top fears about analyzing data.  And I'll talk about how Minitab 17 can help people who are struggling with dataphobia.
Fear #3:  I Don't Even Know Where to Start Analyzing this Data.
Writers confront a lurking terror each time they touch the keyboard. It's called "The Blank Page," or maybe "The Blank Screen," and it can be summed up in a simple question: "Where do I start?"  I know that terror well...but at least when confronting the blank page, I always had confidence that I could write.
 
When it came to analyzing data, not only was I not sure where to start, I also had no confidence that I'd be able to do it. I always envisioned getting off on the wrong foot with my analysis, then promptly stumbling straight off some statistical cliff to plunge into an abyss of meaningless numbers.
 
You can understand why I tried to avoid this.
 
We want to help people overcome those kinds of qualms. Minitab 17 does this by expanding the reach of the Assistant, a menu that guides you through your analysis and helps you interpret your results with confidence.
 
Man, I wish the Assistant had been there when I started my career.
 
The Assistant can guide you through 9 types of analysis. But what if you don't remember what any of those analyses do?  No problem. The Assistant's tool tips  explain exactly what each analysis is used for, in plain language.
 
If I had data about the durability of four kinds of paper, the explanation of Hypothesis Tests would grab my attention:
 
Hypothesis Test - Assistant Menu
 
Of course, if you already know a thing or two about statistics, you know there's more than one kind of hypothesis test. The Assistant guides you through a decision tree so you can identify the one that's right for your situation, based on the kind of data you have and your objectives. If you can't answer a question, the Assistant provides  information so you can respond correctly, such as illustrated examples that help you understand how the question relates to your own data.
 
The Assistant leads me to One-way ANOVA to compare my paper samples.
 
Now I know where to start my analysis.  But I still face....
Fear #2:  I Don't Know Enough about Statistics to Get All the Way Through this Analysis.

Getting started is great, but what if you're not sure how to continue?

Fortunately, after you've chosen the right tool, the Assistant comes right out and tells you how to ensure your analysis is accurate. For example, it offers you this checklist for doing a one-way ANOVA:

ANOVA guidelines
 
The Assistant provides clear guidelines, including how to set up, collect, and enter your data, and more.
 
What's more, the Assistant's dialogs are simple to complete. No need to guess about what you should enter, and even relatively straightforward concepts like Alpha value are phrased as common-sense questions: "How much risk are you willing to accept of concluding there are differences when there are none?"
 
ANOVA dialog
 
The Assistant will help you finish the analysis you start. But my biggest fear about data is still waiting...
Fear #1:  If I Reach the Wrong Conclusion, I'll Make a Fool of Myself!

Once you finish your analysis, you must interpret what it means, and then you usually need to explain it to other people. 

This is where the Assistant really shines, by providing a series of reports that help you understand your analysis.

Take a look at the summary report for my ANOVA below and tell me if the means of my four paper samples differed.

ANOVA summary report
 
The bar graph in the left corner explicitly tells me YES, the means differ, and it gives me the p-value, too...but I don't need to interpret that p-value to draw a conclusion. I don't even need to know what a p-value is. I do know what's important: that the means are different. 
 
This summary report also tells me which means are different from each other.  With this report, I could tell my boss that we should avoid paper #2, which has a low durability compared to the others, but that there's not a statistically significant difference in durability between papers 1, 3, and 4, so we could select the least expensive option.
 
In my early career, a tool like this would have made all the difference when questions about data came up. I wouldn't have needed to avoid it.
What If I'm Already an Experienced Data Analyst?
Today I know enough about data analysis that I could easily run the ANOVA without the Assistant, but I still like to use it.  Why?  Because the simplicity and clarity of the Assistant's output and reports are perfect for communicating the results of my analysis to people who fear statistics the same way I used to.
 
And as you probably know, there are lots of us out there.
 
I hope you'll give the 30-day trial version of Minitab 17 a try, and let us know how you like it.  We've even put together a series of fun statistical exercises you can do with the Assistant to get started.  

Histograms are Even Easier to Compare in Minitab 17


Minitab 17 came out yesterday and it’s got quite a few neat features in it. You can check some of them out on the What’s New in Minitab 17 page. But one of my very favorite things is related to one of my previous blog posts that showed how to make histograms that are easy to compare. Turns out, you don’t need those steps anymore. You can do it all with Minitab’s Assistant.

Here’s how to open the data that I’m using if you want to follow along.

  1. Choose File > Open Worksheet.
  2. Click Look in Minitab Sample Data Folder.
  3. Select Cap.MTW and click Open.

You can still rearrange a paneled histogram to make the groups easy to compare. But with Minitab 17, you can get histograms that are easy to compare right from the Assistant.

  1. In Minitab 17, choose Assistant > Graphical Analysis.
  2. Click Histogram.
  3. In Y column, enter Torque.
  4. In X column, enter Machine. Click OK.

You just cut in half the number of steps you need to get histograms that are easy to compare.

Not only are the histograms arranged in a way that makes comparisons easy, but the Assistant’s Diagnostic Report reminds you about some of the important differences you can see. Do the data have the same center? Do the data have the same variability? Do both the center and variability differ?

In this case, it looks like the center and variability both vary between the groups.

The diagnostic report makes it easy to compare histograms and reminds you what's important to compare.

By the way, that Diagnostic Report is also new in Minitab 17. Additional new features in the Assistant in Minitab 17 include Design of Experiments (DOE) and multiple regression.

Want to see more about what the Assistant can do in Minitab 17? Check out these Quick Start exercises that will help you to become a fearless data analyst.

 


Unleash the Power of Linear Models with Minitab 17


We released Minitab 17 Statistical Software a couple of days ago. Certainly every new release of Minitab is a reason to celebrate. However, I am particularly excited about Minitab 17 from a data analyst’s perspective. 

If you read my blogs regularly, you’ll know that I’ve extensively used and written about linear models. Minitab 17 has a ton of new features that expand and enhance many types of linear models. I’m thrilled!

In this post, I want to share with my fellow analysts the new linear model features and the benefits that they provide.

New Linear Model Analyses in Minitab 17

We’ve added several brand-spanking new analyses in Minitab 17! These represent major additions to the types of data that you can analyze, the types of studies you can perform, and how to present the results.

Poisson Regression: If you have a response variable that is a count, you need Poisson Regression! For example, use Poisson Regression to model the count of failures or defects.

Stability Studies: Analyze the stability of a product over time and determine its shelf life. We’ve even included a worksheet generator to create a data collection plan! For example, use Stability Studies to model drug effectiveness by batch across time.

Binary Fitted Line Plot from Minitab 17
Binary Fitted Line Plot: This plot is similar to the existing fitted line plot, but for binary response variables. If you have a single predictor and need to graph the event probabilities for a binary response, this new graph presents that information more clearly than ever!

Consistent Features across Linear Model Analyses

A huge benefit of Minitab 17 is that the interface and functionality have been both improved and standardized across many types of models. Previously, some features were only available for a specific type of linear model. For example, you could only perform stepwise regression in Regression, and you could only use the Response Optimizer in DOE.

In Minitab 17, we made significant improvements to the following types of linear models:

  • Factorial Designs
  • General Full Factorial Designs
  • Analyze Variability
  • Response Surface Designs
  • General Linear Models (GLM)
  • Regression
  • Binary Logistic Regression
  • Poisson Regression

Thanks to the standardization across model types, all of the features I describe below apply to all of the above analyses!  Pretty cool!

Easier to Specify Your Model

If your linear model has a lot of interactions and higher-order terms, you’ll love our new and improved interface for specifying the terms you need in your model. Additionally, you now have the ability to specify non-hierarchical models if you choose.

As an added convenience, we’ve also added the stepwise model selection procedure to all of the linear model analyses that I listed above. You also have greater control over how Minitab adds and removes terms from your model during this procedure.

Automatically Store Models for Later Use

It’s now a piece of cake to specify the best model for your data, but that’s often just the first step. If you need to use your model for additional tasks, you’ll come to love our stored models because they make it easy to perform important follow-up analyses!

In order to improve your workflow, we’ve introduced both the automatic ability to store models and a set of post-analyses to use with the stored models.

Every time you fit a model for one of the analyses listed above, it gets stored right in the worksheet. After you settle on the perfect model, you can use the stored model to perform all of the post-analysis tasks below.

Post-Analyses that use Stored Linear Models

The tasks below were previously only available for specific model types. For Minitab 17, we not only made them available for all of the linear models listed above, using a consistent interface, but in many cases we also enhanced the functionality!

Prediction

There’s a cool new interface that makes it much easier to enter the values that you want to predict!

Factorial Plots

Main effects plot from Minitab 17
The main effects plot and interactions plot are now available for types of linear models that didn't offer them before. Even cooler is the new ability to produce these plots for continuous variables, rather than just categorical variables! This ability is particularly helpful when you want to understand interactions between continuous variables, which are notoriously difficult to interpret using the numerical output.

This main effects plot is based on continuous variables. Notice how it reflects the curvature in the model!

Contour Plots and Surface Plots

Surface Plot from Minitab 17
You can now generate both of these plots based on your stored model for all the model types listed above. The z-axis generally represents the fitted response value. However, what’s super cool is that for binary logistic regression, the z-axis represents the expected event probability!

In this surface plot based on a binary logistic model, we see how students’ finances affect their probability of carrying a credit card.

Overlaid Contour Plots

Overlaid Contour Plot from Minitab 17
The overlaid contour plot based on multiple models is now available for all of the listed model types. Overlaid contour plots display contour plots for multiple responses in a single graph.

Applications that involve multiple response variables present a different challenge than single response studies. Optimal variable values for one response may be far from optimal for another response. Overlaid contour plots allow you to visually identify an area of compromise among the various responses.  

In this overlaid contour plot, the white region represents the combination of predictor values that yield satisfactory fitted values for both response variables.

Response Optimizer

Optimization Plot from Minitab 17
The Response Optimizer is now available for all of the listed model types. This tool is an even more advanced method to identify the combination of input variables that jointly optimize a set of response variables. Minitab calculates an optimal solution and produces the interactive optimization plot.

This plot allows you to interactively change the input variable settings to perform sensitivity analyses and possibly improve upon the initial solution. The session window output contains more detailed information about the optimal settings and the predicted outcomes.

These are the highlights of the new linear model features. There are many more new features in other areas of Minitab 17. You can read about them all in What’s New in Minitab 17.

Is Your Statistical Software FDA Validated for Medical Devices or Pharmaceuticals?


software validation in pharmaceutical industry is critical
We're frequently asked whether Minitab has been validated by the U.S. Food and Drug Administration (FDA) for use in the pharmaceutical and medical device industries.

Minitab does extensive testing to validate our software internally, but Minitab’s statistical software is not—and cannot be—FDA-validated out-of-the-box.

Nobody's can.

It is a common misconception that software vendors can go through a certification process to achieve FDA software validation. It's simply not true.

Software vendors who claim their products are FDA-validated should be scrutinized. It is up to the software purchaser to validate software used in production or as part of a quality system for the “intended use” of the software. This is described in FDA’s Code of Federal Regulations Title 21 Part 820.70(i):

“When computers or automated data processing systems are used as part of production or the quality system, the manufacturer shall validate computer software for its intended use according to an established protocol. All software changes shall be validated before approval and issuance. These validation activities and results shall be documented.”

The FDA provides additional supportive information for medical device companies via Section 6 of “Validation of Automated Process Equipment and Quality System Software” in the Principles of Software Validation; Final Guidance for Industry and FDA Staff, January 11, 2002.

“The device manufacturer is responsible for ensuring that the product development methodologies used by the off-the-shelf (OTS) software developer are appropriate and sufficient for the device manufacturer's intended use of that OTS software. For OTS software and equipment, the device manufacturer may or may not have access to the vendor's software validation documentation. If the vendor can provide information about their system requirements, software requirements, validation process, and the results of their validation, the medical device manufacturer can use that information as a beginning point for their required validation documentation.”

Intended Use

There is good reason for the “intended use” guidance. Here is an example:

Company XYZ is using Minitab to estimate the probability of a defect in a manufacturing process. If the amount of an impurity exceeds 450 mg/mL, the product is considered defective. Let’s say they use Minitab’s Capability Analysis > Normal to perform the capability analysis.

In the first graph below, you can see the Ppk (1.26) and defect rate (82 defects per million) are quite good by most standards. However, this manufacturer would be misled into believing this is a good process based on these numbers.

Not Validated for Non-Normal Capability Analysis

Minitab has not calculated anything incorrectly, but since this data is non-normal, the wrong procedure was applied. If this were the only capability analysis available in Minitab, then the software would not be validated for non-normal capability analysis.

Fortunately, Minitab does have non-normal capability analysis. If the Capability Analysis > Nonnormal is chosen and an appropriate distribution is selected (Weibull in this case), the Ppk (0.69) and defect rate (8993 defects per million) are found to be poor, as shown in the following graph:

Validated for Non Normal Capability Analysis
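Minitab handles the capability math for you, but if you want a rough feel for why the distribution choice matters so much, here is a minimal sketch in Python using scipy (the data are simulated and this is not Minitab's algorithm). It fits both a normal and a Weibull distribution to the same right-skewed impurity data and compares the estimated rate of exceeding the 450 mg/mL limit.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Made-up impurity measurements: right-skewed, like the example above
impurity = stats.weibull_min.rvs(c=1.8, scale=200, size=200, random_state=rng)

usl = 450  # upper spec limit (mg/mL)

# If we (wrongly) assume normality
mu, sigma = impurity.mean(), impurity.std(ddof=1)
ppm_normal = stats.norm.sf(usl, loc=mu, scale=sigma) * 1e6

# If we fit a Weibull distribution instead
c, loc, scale = stats.weibull_min.fit(impurity, floc=0)
ppm_weibull = stats.weibull_min.sf(usl, c, loc=loc, scale=scale) * 1e6

print(f"Estimated defects per million, normal fit:  {ppm_normal:,.0f}")
print(f"Estimated defects per million, Weibull fit: {ppm_weibull:,.0f}")
```

With skewed data like this, the normal fit typically understates the defect rate in the long right tail, which is exactly the trap described above.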

What Needs To Be Validated?

Software packages that are used to monitor the process and determine the quality level, such as Minitab, should be validated. To validate Minitab, you will need to document the “intended use.”

The validation for intended use consists of mapping the software requirements to test cases. Each requirement must trace to a test case. An auditor may find that a system “has not been validated” if a requirement is discovered without a test case.

You can use a Traceability Matrix to track your requirements and test cases.

Traceability Matrix

A test case should contain:

  • A test case description. For example, validate capability analysis for Non-Normal Data.
     
  • Steps for execution. For example, go to Stat > Quality Tools > Capability Analysis > Nonnormal and enter the column to be evaluated and select the appropriate distribution.
     
  • Test results (with screen shots).
     
  • Test pass/fail determination.
     
  • Tester signature and date.
Software Validation Warning Letters

Many warning letters received by manufacturers cite a violation of this regulation. Below is a section from a warning letter that refers to the failed validation of an off-the-shelf helpdesk software product, and a document management tool.

5/29/2009
Failure to validate computer software for its intended use according to an established protocol when computers or automated data processing systems are used as part of production or the quality system as required by 21 CFR § 820.70(i). This was a repeat violation from a previous FDA-483 that was issued to your firm. For example:
A) Your firm uses off-the-shelf software (***** Help Desk) to manage customer support service calls and to maintain customer site configuration information; however, your firm failed to adequately validate this software in order to ensure that it will perform as intended in its chosen application. Specifically, your firm's validation did not ensure that the details screen was functioning properly as intended. The details screen is used to capture complaint details and complaint follow-up information which would include corrective and preventative actions performed by your firm when service calls are determined to be CAPA issues.
B) Off-the-shelf software (***************) is being used by your firm to manage your quality system documents for document control and approval. However, your firm has failed to adequately validate this software to ensure that it meets your needs and intended uses. Specifically, at the time of this inspection there were two different versions of your CAPA & Customer Complaint procedure, SOP-200-104; however, no revision history was provided on the *************** document history. Your firm has failed to validate the *************** software to meet your needs for maintaining document control and versioning.

Here are two more examples of software validation violations in an FDA warning letter:

3/25/2010
“Failure to validate computer software for its intended use according to an established protocol when computers or automated data processing systems are used as part of production or the quality system, as required by 21 CFR 820.70(i).“ … “when requested no validation documentation to support the commercial off-the-shelf program (b)(4) used to capture complaints, returned merchandise and service requests was provided.”

2/25/2010
“Failure to validate computer software for its intended use according to an established protocol when computers or automated data processing systems are used as part of production or the quality system, as required by 21 C.F.R. §820.70(i) (Production and Process Controls – Automated Processes).” … “the CAPA analysis of nonconformances, which is used at management meetings, is inadequate in that the report is computer-generated on a non-validated software system.”

Minitab’s Validation Resources

It's the responsibility of the software purchaser to validate software for its intended use.  If you're using Minitab Statistical Software, we offer resources to help with your validation. You can download Minitab’s software validation kit here:

http://www.minitab.com/support/software-validation/

And you can find additional information about validating Minitab relative to the FDA guideline CFR Title 21 Part 11 at this link:

http://it.minitab.com/support/answers/answer.aspx?id=2588

This software validation kit was created to help you understand how we validate Minitab Statistical Software for market readiness, and to confirm Minitab’s continued commitment to quality.

If you have any questions about our software validation process, please contact us.

Gauging Gage Part 1: Is 10 Parts Enough?


"You take 10 parts and have 3 operators measure each 2 times."

This standard approach to a Gage R&R experiment is so common, so accepted, so ubiquitous that few people ever question whether it is effective.  Obviously one could look at whether 3 is an adequate number of operators or 2 an adequate number of replicates, but in this first of a series of posts about "Gauging Gage," I want to look at 10.  Just 10 parts.  How accurately can you assess your measurement system with 10 parts?

Assessing a Measurement System with 10 Parts

I'm going to use a simple scenario as an example.  I'm going to simulate the results of 1,000 Gage R&R Studies with the following underlying characteristics:

  1. There are no operator-to-operator differences, and no operator*part interaction.
  2. The measurement system variance and part-to-part variance used would result in a %Contribution of 5.88%, which falls between the popular guidelines of <1% (excellent) and >9% (poor).

So—no looking ahead here—based on my 1,000 simulated Gage studies, what do you think the distribution of %Contribution looks like across all studies?  Specifically, do you think it is centered near the true value (5.88%), or do you think the distribution is skewed, and if so, how much do you think the estimates vary?

Go ahead and think about it...I'll just wait here for a minute.

Okay, ready?

Here is the distribution, with the guidelines and true value indicated:

PctContribution for 10 Parts

The good news is that it is roughly averaging around the true value.

However, the distribution is highly skewed—a decent number of observations estimated %Contribution to be at least double the true value, with one estimating it at about SIX times the true value!  And the variation is huge.  In fact, about 1 in 4 gage studies would have resulted in failing this gage.

Now a standard gage study is no small undertaking—a total of 60 data points must be collected, and once randomization and "masking" of the parts is done it can be quite tedious (and possibly annoying to the operators).  So just how many parts would be needed for a more accurate assessment of %Contribution?
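Before we go bigger, here is a rough sketch of the kind of simulation behind these results, written in Python rather than Minitab, in case you want to rerun it with any study size you like. It uses the textbook ANOVA variance-component formulas for a crossed study (without the interaction pooling Minitab applies by default), and the function name and defaults are just my own choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_gage_study(n_parts=10, n_ops=3, n_reps=2,
                        sd_part=4.0, sd_repeat=1.0):
    """Simulate one crossed Gage R&R study with no operator or interaction
    effects and return the estimated %Contribution (ANOVA method)."""
    parts = rng.normal(0, sd_part, n_parts)
    # y[part, operator, replicate] = part effect + repeatability noise
    y = parts[:, None, None] + rng.normal(0, sd_repeat, (n_parts, n_ops, n_reps))

    grand = y.mean()
    p_means = y.mean(axis=(1, 2))      # part means
    o_means = y.mean(axis=(0, 2))      # operator means
    po_means = y.mean(axis=2)          # part*operator cell means

    ss_p = n_ops * n_reps * ((p_means - grand) ** 2).sum()
    ss_o = n_parts * n_reps * ((o_means - grand) ** 2).sum()
    ss_po = n_reps * ((po_means - p_means[:, None]
                       - o_means[None, :] + grand) ** 2).sum()
    ss_e = ((y - po_means[:, :, None]) ** 2).sum()

    ms_p = ss_p / (n_parts - 1)
    ms_o = ss_o / (n_ops - 1)
    ms_po = ss_po / ((n_parts - 1) * (n_ops - 1))
    ms_e = ss_e / (n_parts * n_ops * (n_reps - 1))

    var_repeat = ms_e
    var_po = max((ms_po - ms_e) / n_reps, 0)
    var_op = max((ms_o - ms_po) / (n_parts * n_reps), 0)
    var_part = max((ms_p - ms_po) / (n_ops * n_reps), 0)

    gage = var_repeat + var_po + var_op
    return 100 * gage / (gage + var_part)

# sd_part=4, sd_repeat=1 gives a true %Contribution of 1/17 = 5.88%
estimates = np.array([simulate_gage_study() for _ in range(1000)])
print(f"Mean estimate: {estimates.mean():.1f}%  "
      f"Share failing (>9%): {(estimates > 9).mean():.0%}")
```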

Assessing a Measurement System with 30 Parts

I repeated 1,000 simulations, this time using 30 parts (if you're keeping score, that's 180 data points).  And then for kicks, I went ahead and did 100 parts (that's 600 data points).  So now consider the same questions from before for these counts—mean, skewness, and variation.

Mean is probably easy: if it was centered before, it's probably centered still.

So let's really look at skewness and how much we were able to reduce variation:

10 30 100 Parts

Skewness and variation have clearly decreased, but I suspect you thought variation would have decreased more than it did. Keep in mind that %Contribution is affected by your estimates of repeatability and reproducibility as well, so you can only tighten this distribution so much by increasing the number of parts.  But even using 30 parts—an enormous experiment to undertake—still results in this gage failing 7% of the time!

So what is a quality practitioner to do?

I have two recommendations for you.  First, let's talk about %Process.  Oftentimes the measurement system we are evaluating has been in place for some time and we are simply verifying its effectiveness.  In this case, rather than relying on your small sampling of parts to estimate the overall variation, you can use the historical standard deviation as your estimate and eliminate much of the variation caused by the small sample of parts.  Just enter your historical standard deviation in the Options subdialog in Minitab:

Options Subdialog

Then your output will include an additional column of information called %Process.  This column is the equivalent of the %StudyVar column, but using the historical standard deviation (which comes from a much larger sample) instead of the overall standard deviation estimated from the data collected in your experiment:

Percent Process

My second recommendation is to include confidence intervals in your output.  This can be done in the Conf Int subdialog:

Conf Int subdialog

Including confidence intervals in your output doesn't inherently improve the wide variation of estimates the standard gage study provides, but it does force you to recognize just how much uncertainty there is in your estimate.  For example, consider this output from the gageaiag.mtw sample dataset in Minitab with confidence intervals turned on:

Output with CIs

For some processes you might accept this gage based on the %Contribution being less than 9%.  But for most processes you really need to trust your data, and the 95% CI of (2.14, 66.18) is a red flag that you really shouldn't be very confident that you have an acceptable measurement system.

So the next time you run a Gage R&R Study, put some thought into how many parts you use and how confident you are in your results!

 

Gauging Gage Part 2: Are 3 Operators or 2 Replicates Enough?


In Part 1 of Gauging Gage, I looked at how adequate a sampling of 10 parts is for a Gage R&R Study and provided some advice based on the results.

Now I want to turn my attention to the other two factors in the standard Gage experiment: 3 operators and 2 replicates.  Specifically, what if instead of increasing the number of parts in the experiment (my previous post demonstrated you would need an unfeasible increase in parts), you increased the number of operators or number of replicates?

In this study, we are only interested in the effect on our estimate of overall Gage variation. Obviously, increasing operators would give you a better estimate of the operator term and reproducibility, and increasing replicates would get you a better estimate of repeatability.  But I want to look at the overall impact on your assessment of the measurement system.

Operators

First we will look at operators.  Using the same simulation engine I described in Part 1, this time I did two different simulations. In one, I increased the number of operators to 4 and continued using 10 parts and 2 replicates (for a total of 80 runs); in the other, I increased to 4 operators and still used 2 replicates, but decreased the number of parts to 8 to get back close to the original experiment size (64 runs compared to the original 60).

Here is a comparison of the standard experiment and each scenario laid out here:

Operator Comparisons

Operator Descriptive Stats

It may not be obvious in the graph, but increasing to 4 operators while decreasing to 8 parts actually increased the variation in %Contribution seen...so despite requiring 4 more runs this is the poorer choice.  And the experiment that involved 4 operators but maintained 10 parts (a total of 80 runs) showed no significant improvement over the standard study.

Replicates

Now let's look at replicates in the same manner we looked at parts.  In one run of simulations we will increase replicates to 3 while continuing to use 10 parts and 3 operators (90 runs), and in another we will keep 3 operators, increase replicates to 3, and reduce parts to 7 to compensate (63 runs).

Again we compare the standard experiment to each of these scenarios:

Replicate Comparisons

Replicates Descriptive Statistics

Here we see the same pattern as with operators. Increasing to 3 replicates while compensating by reducing to 7 parts (for a total of 63 runs) significantly increases the variation in %Contribution seen.  And increasing to 3 replicates while maintaining 10 parts shows no improvement.
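If you're following along with the Python sketch from Part 1, all of these scenarios—the operator variations and the replicate variations alike—are just different arguments to that hypothetical simulate_gage_study() function:

```python
# Hypothetical reuse of simulate_gage_study() from the Part 1 sketch
baseline      = [simulate_gage_study(n_parts=10, n_ops=3, n_reps=2) for _ in range(1000)]
more_ops      = [simulate_gage_study(n_parts=10, n_ops=4, n_reps=2) for _ in range(1000)]
ops_tradeoff  = [simulate_gage_study(n_parts=8,  n_ops=4, n_reps=2) for _ in range(1000)]
more_reps     = [simulate_gage_study(n_parts=10, n_ops=3, n_reps=3) for _ in range(1000)]
reps_tradeoff = [simulate_gage_study(n_parts=7,  n_ops=3, n_reps=3) for _ in range(1000)]
```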

Conclusions about Operators and Replicates in Gage Studies

As stated above, we're only looking at the effect of these changes to the overall estimate of measurement system error. So while increasing to 4 operators or 3 replicates either showed no improvement in our ability to estimate %Contribution or actually made it worse, you may have a situation where you are willing to sacrifice that in order to get more accurate estimates of the individual components of measurement error.  In that case, one of these designs might actually be a better choice.

For most situations, however, if you're able to collect more data, then increasing the number of parts used remains your best choice.

But how do we select those parts?  I'll talk about that in my next post!

Gauging Gage Part 3: How to Sample Parts


In Parts 1 and 2 of Gauging Gage we looked at the numbers of parts, operators, and replicates used in a Gage R&R Study and how accurately we could estimate %Contribution based on the choice for each.  In doing so, I hoped to provide you with valuable and interesting information, but mostly I hoped to make you like me.  I mean like me so much that if I told you that you were doing something flat-out wrong and had been for years and probably screwed some things up, you would hear me out and hopefully just revert back to being indifferent towards me.

For the third (and maybe final) installment, I want to talk about something that drives me crazy.  It really gets under my skin.  I see it all of the time, maybe more often than not.  You might even do it.  If you do, I'm going to try to convince you that you are very, very wrong.  If you're an instructor, you may even have to contact past students with groveling apologies and admit you steered them wrong.  And that's the best-case scenario.  Maybe instead of admitting error, you will post scathing comments on this post insisting I am wrong and maybe even insulting me despite the evidence I provide here that I am, in fact, right.

Let me ask you a question:

When you choose parts to use in a Gage R&R Study, how do you choose them?

If your answer to that question required any more than a few words—and it can be done in one word—then I'm afraid you may have been making a very popular but very bad decision.  If you're in that group, I bet you're already reciting your rebuttal in your head now, without even hearing what I have to say.  You've had this argument before, haven't you?  Consider whether your response was some variation on the following popular schemes:

  1. Sample parts at regular intervals across the range of measurements typically seen
  2. Sample parts at regular intervals across the process tolerance (lower spec to upper spec)
  3. Sample randomly but pull a part from outside of either spec

#1 is wrong.  #2 is wrong.  #3 is wrong.

You see, the statistics you use to qualify your measurement system are all reported relative to the part-to-part variation and all of the schemes I just listed do not accurately estimate your true part-to-part variation.  The answer to the question that would have provided the most reasonable estimate?

"Randomly."

But enough with the small talk—this is a statistics blog, so let's see what the statistics say.

In Part 1 I described a simulated Gage R&R experiment, which I will repeat here using the standard design of 10 parts, 3 operators, and 2 replicates.  The difference is that in only one set of 1,000 simulations will I randomly pull parts, and we'll consider that our baseline.  The other schemes I will simulate are as follows:

  1. An "exact" sampling - while not practical in real life, this pulls parts corresponding to the 5th, 15th, 25th, ..., and 95th percentiles of the underlying normal distribution and forms a (nearly) "exact" normal distribution as a means of seeing how much the randomness of sampling affects our estimates.
  2. Parts are selected uniformly (at equal intervals) across a typical range of parts seen in production (from the 5th to the 95th percentile).
  3. Parts are selected uniformly (at equal intervals) across the range of the specs, in this case assuming the process is centered with a Ppk of 1.
  4. 8 of the 10 parts are selected randomly, and then one part each is used that lies one-half of a standard deviation outside of the specs.

Keep in mind that we know with absolute certainty that the underlying %Contribution is 5.88325%.
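In terms of the Part 1 simulation sketch, the only thing that changes between schemes is how the ten "true" part values are generated; a hypothetical helper might look like this (the measurement-error part of the simulation stays exactly the same):

```python
import numpy as np
from scipy import stats

def make_parts(scheme, n_parts=10, sd_part=4.0, sd_repeat=1.0, rng=None):
    """Generate the 'true' part values for one simulated Gage R&R study
    under each of the sampling schemes discussed above."""
    rng = rng or np.random.default_rng()
    sd_total = np.sqrt(sd_part**2 + sd_repeat**2)   # overall process spread
    if scheme == "random":
        return rng.normal(0, sd_part, n_parts)
    if scheme == "exact":
        # Parts at the 5th, 15th, ..., 95th percentiles of the part distribution
        probs = (np.arange(n_parts) + 0.5) / n_parts
        return stats.norm.ppf(probs, scale=sd_part)
    if scheme == "uniform_range":
        # Equally spaced across the typical range (5th to 95th percentile)
        lo, hi = stats.norm.ppf([0.05, 0.95], scale=sd_part)
        return np.linspace(lo, hi, n_parts)
    if scheme == "uniform_specs":
        # Equally spaced across the specs; a centered process with Ppk = 1
        # puts the specs at +/- 3 overall standard deviations
        return np.linspace(-3 * sd_total, 3 * sd_total, n_parts)
    if scheme == "outside_specs":
        # Mostly random, plus one part half a standard deviation outside each spec
        random_parts = rng.normal(0, sd_part, n_parts - 2)
        return np.append(random_parts, [-3.5 * sd_total, 3.5 * sd_total])
    raise ValueError(f"unknown scheme: {scheme}")
```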

Random Sampling for Gage

Let's use "random" as the default to compare to, which, as you recall from Parts 1 and 2, already does not provide a particularly accurate estimate:

Pct Contribution with Random Sampling

On several occasions I've had people tell me that you can't just sample randomly because you might get parts that don't really match the underlying distribution. 

Sample 10 Parts that Match the Distribution

So let's compare the results of random sampling from above with our results if we could magically pull 10 parts that follow the underlying part distribution almost perfectly, thereby eliminating the effect of randomness:

Random vs Exact

There's obviously something to the idea that the randomness that comes from random sampling has a big impact on our estimate of %Contribution...the "exact" distribution of parts shows much less skewness and variation and is considerably less likely to incorrectly reject the measurement system.  To be sure, implementing an "exact" sample scheme is impossible in most cases...since you don't yet know how much measurement error you have, there's no way to know that you're pulling an exact distribution.  What we have here is a statistical version of chicken-and-the-egg!

Sampling Uniformly across a Typical Range of Values

Let's move on...next up, we will compare the random scheme to scheme #2,  sampling uniformly across a typical range of values:

Random vs Uniform Range

So here we have a different situation: there is a very clear reduction in variation, but also a very clear bias.  So while pulling parts uniformly across the typical part range gives much more consistent estimates, those estimates are likely telling you that the measurement system is much better than it really is.

Sampling Uniformly across the Spec Range

How about collecting uniformly across the range of the specs?

Random vs Uniform Specs

This scheme results in an even more extreme bias, making qualification of this measurement system a certainty and in some cases even classifying it as excellent.  Needless to say, it does not result in an accurate assessment.

Selectively Sampling Outside the Spec Limits

Finally, how about that scheme where most of the points are taken randomly but just one part is pulled from just outside of each spec limit?  Surely just taking 2 of the 10 points from outside of the spec limits wouldn't make a substantial difference, right?

Random vs OOS

Actually those two points make a huge difference and render the study's results meaningless!  This process had a Ppk of 1 - a higher-quality process would make this result even more extreme.  Clearly this is not a reasonable sampling scheme.

Why These Sampling Schemes?

If you were taught to sample randomly, you might be wondering why so many people would use one of these other schemes (or similar ones).  They actually all have something in common that explains their use: all of them allow a practitioner to assess the measurement system across a range of possible values.  After all, if you almost always produce values between 8.2 and 8.3 and the process goes out of control, how do you know that you can adequately measure a part at 8.4 if you never evaluated the measurement system at that point?

Those that choose these schemes for that reason are smart to think about that issue, but just aren't using the right tool for it.  Gage R&R evaluates your measurement system's ability to measure relative to the current process.  To assess your measurement system across a range of potential values, the correct tool to use is a "Bias and Linearity Study" which is found in the Gage Study menu in Minitab.  This tool establishes for you whether you have bias across the entire range (consistently measuring high or low) or bias that depends on the value measured (for example, measuring smaller parts larger than they are and larger parts smaller than they are).

To really assess a measurement system, I advise performing both a Bias and Linearity Study as well as a Gage R&R.

Which Sampling Scheme to Use?

In the beginning I suggested that a random scheme be used but then clearly illustrated that the "exact" method provides even better results.  Using an exact method requires you to know the underlying distribution from having enough previous data (somewhat reasonable although existing data include measurement error) as well as to be able to measure those parts accurately enough to ensure you're pulling the right parts (not too feasible...if you know you can measure accurately, why are you doing a Gage R&R?).  In other words, it isn't very realistic.

So for the majority of cases, the best we can do is to sample randomly.  But we can do a reality check after the fact by looking at the average measurement for each of the parts chosen and verifying that the distribution seems reasonable.  If you have a process that typically shows normality and your sample shows unusually high skewness, there's a chance you pulled an unusual sample and may want to pull some additional parts and supplement the original experiment.

Thanks for humoring me and please post scathing comments below!

 
