
Parity in the NFL? Nope! It’s the Sample Size!


It's almost Super Bowl Sunday, and this year's matchup pits the Baltimore Ravens against the San Francisco 49ers. The 49ers are no huge surprise, as they were favored in both of their playoff games. However, the Ravens had to win 3 games, pulling two major upsets along the way, to get to the Super Bowl. It marks the 8th time in the last 10 years that a team that played on Wild Card Weekend advanced the entire way to the Super Bowl. This again shows how much parity there is in the NFL. It's unpredictable! Any team can win the championship!

Well...not quite. While I agree that the NFL playoffs are unpredictable, it’s not because of parity. It’s all in the sample size.

Small Samples

In statistics we’re always being warned about making conclusions based on small sample sizes. Small samples have wider confidence intervals, meaning there’s more variation surrounding the estimates. Therefore, it’s harder to make conclusions based on them. If we want to determine whether a new ice cream flavor tastes good or bad, would we sample one person and base our conclusions on their opinion? Of course not! We’d sample multiple people and determine the proportion of people who thought it was good.

The NFL playoffs try to determine the "best team" by taking a sample size of 1 game and having the winner advance. So of course there is going to be a ton of variation in the results! This makes it appear that just about anybody can beat anybody else, hence the idea of parity.

But what if we increased the sample size?

What Would the NFL Playoffs Look Like with a 7-Game Series?

Let’s determine how much of a difference a 7-game series would make in the Super Bowl compared to a single game. The first thing we have to do is figure out each team's probability of winning the game. To do this, I’m going to use the final Las Vegas betting spread on the game. Now, I know the spread is not set by casinos to predict the most likely outcome but instead is meant to estimate where they can get 50% of bettors to take one team and 50% to take the other. I am going to use it anyway, because it turns out there is usually little difference between the two.

So let's say the spread is 0. That means the teams are evenly matched and each has a 50% chance of winning. But the spread in the Super Bowl has the 49ers favored by 3.5 points (at the time of this writing). Obviously this means they have a greater than 50% chance of winning the game, but by how much? 55%? 60%? Luckily, the site thepredictiontracker.com has tracked the outcome of every NFL game against the spread. You'll see on that page that the mean square error for the spread (called "Line (updated)") is 194. This represents the variance of game outcomes around the spread. Taking the square root gives the standard deviation, which is about 13.9 points. We can then use Minitab to plot a normal distribution with that standard deviation and use the area under the curve to determine each team's probability of winning.
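As an aside, the same area-under-the-curve calculation can be reproduced outside Minitab. Here is a minimal sketch in Python with scipy (not part of the original analysis), using the 13.9-point standard deviation derived above:

```python
# Sketch only: probability the favorite wins, given the Vegas point spread and
# the ~13.9-point standard deviation of game outcomes around the spread.
from scipy.stats import norm

def win_probability(spread, sd=13.9):
    """P(favorite wins) = area under the normal curve to the right of 0,
    with the curve centered at the spread."""
    return 1 - norm.cdf(0, loc=spread, scale=sd)

print(win_probability(0))    # evenly matched teams: 0.50
print(win_probability(3.5))  # 49ers favored by 3.5: roughly 0.60
```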

Okay, I know that was a lot right there, so I’ll walk through a simple example. Let’s start with a game between two teams where the spread is 0.

Dialog

Normal Probability Plot

So you're looking at the distribution of score differentials between two evenly matched NFL teams. The distribution is centered at 0, since that is the spread. Any value to the right of 0 means team A wins, and any value to the left of 0 means team B wins. We can find the area beneath the curve to the right of 0 to determine the probability that team A wins.

Dialog shading

Shaded normal probability plot

The shaded area represents all of the outcomes that result in a victory for team A. You see that when you find the area of the shaded region, you get .5. This equals the probability of team A winning, which is 50%. Now we can do the same thing for the Super Bowl, except we’ll make the spread 3.5 instead of 0.

Dialog spread 3.5

Super Bowl Probability

And we finally see that the 49ers have about a 60% chance of winning the Super Bowl. But what if the NFL changed it to a Super Series? That is, what if instead of 1 game, the 49ers and Ravens played a best of 7 series? What would happen to the 49ers' chances of winning then?

To figure that out, we are going to switch gears from the normal distribution to the negative binomial distribution. The negative binomial distribution models the number of trials needed to produce a specified number of a desired outcome. In this case, our desired outcome is the 49ers winning 4 games (which would mean they win the series). Then we find the probability that it takes them 7 or fewer games to do it. The Super Bowl is played at a neutral site, so we can use the same probability for each game.
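Here is a quick sketch of that calculation (Python/scipy standing in for the Minitab dialog shown below, and assuming the roughly 60% per-game probability found above):

```python
# Sketch only: probability the 49ers win a best-of-7 series, modeled as the
# chance they suffer at most 3 losses before collecting their 4th win.
from scipy.stats import nbinom

p_game = 0.60                        # per-game win probability (from the spread)
p_series = nbinom.cdf(3, 4, p_game)  # at most 3 failures before the 4th success
print(p_series)                      # roughly 0.71
```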

Negative Binomial Dialog

Negative Binomial Plot

We see that the probability of the 49ers winning rises to 71% in a 7-game series. That's a pretty significant increase! The reason: it's a lot easier for the underdog to pull off a single upset than to win 4 games out of 7! Baltimore can consider itself lucky that it only has to win one game. And not just in the Super Bowl: they're lucky they didn't have to beat Denver or New England in a 7-game series, either! The table below shows the difference between a single game and a 7-game series for each of the playoff opponents Baltimore has already defeated.

NOTE: For the 7-game series before the Super Bowl, Baltimore would play both home and away games, which would change their probability of winning, since home field advantage is worth 3 points. I calculated their probability of winning both at home and on the road, then used a weighted average (3/7 for the underdog's home probability, 4/7 for the favorite's home probability) to get a constant probability I could use for the negative binomial plot. The favored team would play game 7 at home, which is why I used a weighted average.

Opponent                                Indianapolis         Denver            New England
Spread                                  Baltimore by 6.5     Denver by 10      New England by 8
Baltimore victory in a single game      68%                  23.6%             28.25%
Baltimore victory in a 7-game series    72.7%                12.6%             20.2%

The 7-game series would have helped Baltimore slightly against Indianapolis, but it would have significantly hurt their chances against Denver and New England. So as you can see, by simply changing the format of the playoffs, you can significantly alter the probabilities of who advances. The single-game format gives the underdog a much better chance of winning, and thus the appearance of parity.
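For anyone who wants to retrace the table, here is a rough sketch of the Denver column (Python/scipy, not the original Minitab work; it assumes the 13.9-point standard deviation, a 3-point home edge, the listed spread reflecting Denver at home, and game 7 at the favorite's home, as described in the note above):

```python
# Sketch only: reproduce the Denver column of the table above.
from scipy.stats import norm, nbinom

sd = 13.9                                       # std. dev. of outcomes vs. the spread
p_balt_at_denver = 1 - norm.cdf(10 / sd)        # Denver favored by 10 at home
p_balt_at_home   = 1 - norm.cdf((10 - 6) / sd)  # shift the spread by 2 x 3 points

# Weighted per-game probability: favorite hosts 4 of 7 games, underdog hosts 3
p_game = (4/7) * p_balt_at_denver + (3/7) * p_balt_at_home

# Baltimore wins the series if it takes at most 3 losses before its 4th win
p_series = nbinom.cdf(3, 4, p_game)
print(round(p_balt_at_denver, 3), round(p_series, 3))  # ~0.236 and ~0.13, in line with the table
```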

But let's not stop there! As a fun little example, here are the last 10 champions of the NHL, MLB, and NBA, and how their seasons would have ended if their playoffs had a one-and-done format. If they won the first game in their series, they advanced. Otherwise, they were eliminated.

NBA

Year   NBA Champion         Result in one-and-done
2012   Miami Heat           Lost in Finals
2011   Dallas Mavericks     Lost in Finals
2010   LA Lakers            Won Championship
2009   LA Lakers            Lost in 2nd round
2008   Boston Celtics       Won Championship
2007   San Antonio Spurs    Lost in 1st round
2006   Miami Heat           Lost in 2nd round
2005   San Antonio Spurs    Lost in 1st round
2004   Detroit Pistons      Lost in Conference Finals
2003   San Antonio Spurs    Lost in 1st round


Only 2 of the last 10 NBA champions would have still won the title if they had a one-and-done format. Oh, and you remember those great Spurs teams with Tim Duncan and company? Instead of being called champions with multiple hall of fame players, they’d be a bunch of choke artists that couldn’t get out of the first round. And yet, they’d be the exact same team! Isn’t sports analysis fun?

NHL

Year   Stanley Cup Winner     Result in one-and-done
2012   LA Kings               Won Stanley Cup
2011   Boston Bruins          Lost in 1st round
2010   Chicago Blackhawks     Lost in 1st round
2009   Pittsburgh Penguins    Lost in 2nd round
2008   Detroit Red Wings      Won Stanley Cup
2007   Anaheim Ducks          Lost in Conference Finals
2006   Carolina Hurricanes    Lost in 1st round
2004   Tampa Bay Lightning    Lost in Finals
2003   New Jersey Devils      Lost in Conference Finals
2002   Detroit Red Wings      Lost in 1st round

Four different Stanley Cup champions would have lost in the 1st round. And another 3 wouldn’t have even reached the finals! The NHL playoffs seem like a crapshoot as it is (only 1 repeat champion in the last 10 years), so I can’t imagine the chaos that would result from a one-and-done format.   

MLB

Year   World Series Champion    Result in one-and-done
2012   San Francisco Giants     Lost in 1st round
2011   St. Louis Cardinals      Lost in 1st round
2010   San Francisco Giants     Won World Series
2009   New York Yankees         Lost in World Series
2008   Philadelphia Phillies    Won World Series
2007   Boston Red Sox           Won World Series
2006   St. Louis Cardinals      Lost in NLCS
2005   Chicago White Sox        Lost in ALCS
2004   Boston Red Sox           Lost in ALCS
2003   Florida Marlins          Lost in 1st round


Six of the last 10 MLB champions wouldn't have even reached the World Series if they had a one-and-done format. And keep in mind they only had to win 2 games to get to the championship, unlike basketball and hockey, where they have to win 3. And the Red Sox would have had to wait until 2007 to break the Curse of the Bambino! Assuming, of course, the city wouldn't have imploded after losing to the Yankees yet again in 2004.

So if you’re keeping track at home, a mere 7 out of 30 champions would have still won with a one-and-done playoff format. That’s a solid 23%.

Talk about parity!

How About this Sample Size?

Over the past 10 years, the Yankees have a winning percentage of .596 during the regular season. Five different NFL teams have a winning percentage above that over the same time period. The teams are the Ravens (.606), Packers (.613), Steelers (.644), Colts (.700), and Patriots (.788). In fact, over the past 10 years the Patriots have gone 126-34 in the regular season. The MLB record for most wins in a season is held by the 2001 Seattle Mariners, who went 116-46. So in their last 160 games, the Patriots have won 10 more games than any major league baseball team ever has, and the Mariners had 2 extra games!

How about the NBA? Over the last 10 years the Spurs have the highest winning percentage at .706. Yeah, that same Spurs team that's great during the regular season and then chokes in the playoffs (at least in a bizarro world). That's pretty good, but it still doesn't beat the Patriots. In fact, even 10 years of the Jordan-era Bulls (1989-1998) doesn't beat the Patriots: they had a winning percentage of only .722. And taking the Bulls' regular season record during the 6 seasons they won NBA championships, you get a winning percentage of .789, which just barely beats the Patriots. If the NFL playoffs used 7-game series, who knows how many championships New England would have! And you think you're sick of the Patriots now!

So next year, when you will most certainly see upsets in the NFL playoffs, realize that it has nothing to do with parity. Yes, on any given Sunday, anything can happen. But 4 out of 7 Sundays? That would be a completely different story.

 

Image "Parity 2012, The Circle of Life in the NFL" by Lvl9LightSpell


Violations of the Assumptions for Linear Regression: Residuals versus the Fits (Day 3)


day 3

Lionel Loosefit has been hauled to court for violating the assumptions of linear regression. On Day 3 of the trial, the court examines the allegation that the residuals in Mr. Loosefit's model exhibit nonconstant variance. The defendant’s mother, Mrs. Lottie Loosefit, has taken the stand on behalf of her son.

Defense Attorney: So, Mrs. Loosefit, from what you’ve described to us, your son, Lionel, appears to have been a model child.

Lottie Loosefit [eyes watering]: He was every mother’s dream. He brushed his teeth every morning and every night, made his bed, folded his socks, picked up all his toys…

Defense Attorney: It sounds like he satisfied all of your assumptions.

Lottie Loosefit: Every single one. Would you like to see more baby pictures?

Defense Attorney: Thank you, Mrs. Loosefit, I think the evidence you’ve given is sufficient. No further questions, your Honor.

Judge [to prosecutor]: You may cross-examine the witness.

Prosecutor: A mother’s love is always touching, Mrs. Loosefit. So unconditional. You must have raised your son well. Certainly you provided for him well. You run a successful business, don’t you Mrs. Loosefit?

Lottie Loosefit: I certainly do. I started out as a seamstress. Today I am sole owner and CEO of Loosefit Garments, Inc.

Prosecutor: Kudos to you. We all have a pair of Loosefits in our closet. “If you can’t bend over backwards, it’s not a Loosefit.” Catchy jingle.

Lottie Loosefit: Thank you. You don’t get to the top sitting on your bottom.

Prosecutor: Mrs. Loosefit, suppose you’re making a sleeve for a customer. You wouldn’t want the material to be extremely tight at the elbows, yet sag and droop at other parts of the sleeve, would you?

Lottie Loosefit: Of course not! I’d be a laughingstock.

Prosecutor: In that case, without having any background in statistics, you can understand how a regression model should fit. Take a look at the illustration below:

resids andfits

Prosecutor: Here, the blue line shows the regression model. The red circles are the observed values in the data set. The black squares show the fitted value estimated by the model for each observed value.

Lottie Loosefit: I wouldn’t know about all that rigamarole.

Prosecutor: No, Mrs. Loosefit, why would you? But think of it this way. Just as a sleeve should fit the arm consistently, the distance of the observed values from the fitted values of the model--the dotted lines--should vary consistently across the fitted values of the model.

Lottie Loosefit: Our clothes fit all our models very well, thank you.

Prosecutor: I’m sure they do. You probably use a measuring tape to make sure the fit is consistent, don’t you? But in Minitab, you use a plot of the residuals vs the fitted values. If the model errors vary consistently across the fitted values, as they should, the plot will look something like this:

resids vs fits
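(A brief aside for readers following along at home, not part of the court record: the same diagnostic can be sketched outside Minitab. The snippet below, in Python with statsmodels and matplotlib, simulates a well-behaved regression and plots its residuals against the fitted values.)

```python
# Illustrative sketch: residuals vs. fitted values for a model whose errors
# have constant variance. The points should form a structureless band around 0.
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 3 + 2 * x + rng.normal(0, 1, 100)        # constant error variance

fit = sm.OLS(y, sm.add_constant(x)).fit()

plt.scatter(fit.fittedvalues, fit.resid)
plt.axhline(0, linestyle="--")
plt.xlabel("Fitted value")
plt.ylabel("Residual")
plt.show()
```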

Lottie Loosefit: Hmppph. Who ever laid eyes on an arm that bony and that straight? And I'm no fan of red polka dots either.

Prosecutor: Mrs. Loosefit, do you ever get complaints from customers about fit?

Lottie Loosefit: Cripes, yes. “Oh, this is too puffy at the chest.” “Oh, this is too tight in the rear.” “Oh this” and “Oh that!” People whine if their ice cream is cold, if you ask me.

Prosecutor: We can all sympathize, Mrs. Loosefit. Total customer satisfaction is no cakewalk, after all. Unfortunately, when it comes to the assumption of constant variance in regression, several potential problems with fit can arise as well:

fit problems

Lottie Loosefit: So what? You go looking for problems, you’ll find ‘em! What’s that got to do with my Lionel?

Prosecutor: Well, Mrs. Loosefit, take a look at Exhibit F, the plot of the residuals vs fits for your son’s regression model.

Exhibit F

Ph.D. statistician [shouts from gallery]:  Monster! Heteroscedastic fiend!

Judge: Order!

Prosecutor: Who can blame such righteous indignation? The plot shows a pathetic mishmash of all the fit problems that we just saw, doesn’t it? Somehow your son managed to break every rule in the book, Mrs. Loosefit. And the small data set he used only served to amplify the hideous nature of his transgressions.

Lottie Loosefit [turns to jury box]: My Lionel never done these awful things! I know my boy. This is somebody else’s doing!

Prosecutor: Sorry, Mrs. Loosefit. We’ve clicked the History folder in your son’s Minitab project. The folder automatically records all the commands a user runs in each Minitab session. The evidence is irrefutable:

history

Prosecutor [smiles and turns to defendant]: You didn’t know about the History folder, did you Mr. Loosefit? You thought you didn’t leave a trail. But in fact, if we wanted to, we could actually use the commands in that folder to recreate your crime, over and over again. Not that that's something we’d ever want to automate.

[Walks over to jury box]

Prosecutor: Notice ladies and gentlemen, that in the History window, you don't see a command to display graphs with the regression analysis. That proves that the defendant didn’t even bother to display residual plots to check the assumptions of the analysis!

Lottie Loosefit: No, you're wrong! Lionel was at home with me the whole time it happened—

Prosecutor: Nice try, Mrs. Loosefit. But we’ve also got a record of the time and date when the crimes occurred, at the top of the Session window in your son’s Minitab project.

session window

Prosecutor: According to witnesses, you were on the floor of the Loosefit Garment factory at that date and time. Also note that the Session window shows that the Minitab project was checked out directly from your son’s desktop.

[Oohs and ahs emanating from the gallery]

Spectator 1 [whispers]: Things ain’t lookin’ none too good for Loosefit…

Spectator 2 [whispers]: Aye, not even a Box-Cox transformation can save him now…

Next time: The dramatic conclusion of the trial, as the jury delivers its verdict.

 

Trend Analysis: Super Bowl Ticket Prices


Tickets to attend the Super Bowl are among the most coveted sports tickets in the world, and the high average price for even nose-bleed seats illustrates how low supply and high demand can result in astronomical ticket pricing.

After last year's Super Bowl, the Bleacher Report published this article that ranked the average ticket price of every Super Bowl since 1966: http://bleacherreport.com/articles/1041324-ranking-the-average-ticket-price-of-every-super-bowl-since-1966. Can you believe that the average cost for a ticket in 1966 was only $12.00? Pricing for 2013 Super Bowl tickets ranges from $800 to $1,200, which is right on par with last year's average cost.

Let’s take a look at the ticket price data in Minitab and see how we can use trend analysis to predict future ticket prices.

How have ticket prices changed over the years?

For the sake of simplicity, let's focus on the average ticket prices for Super Bowls from 1985-2006. For a quick snapshot of how ticket prices have changed over the years, a time series plot clearly illustrates the steady increase in ticket cost over time:

Minitab Time Series Plot

(To graph your data using a time series plot in Minitab, choose Graph > Time Series Plot.)

Forecasting Future Ticket Costs with Trend Analysis

Say we want to forecast future Super Bowl ticket prices based on the time series data we already have. In Minitab, we can perform a trend analysis (Stat > Time Series > Trend Analysis) to help us forecast.

What the heck is trend analysis, anyway? Trend analysis fits a general trend model to time series data and provides forecast projections. In Minitab, you can choose among the linear, quadratic, exponential growth or decay, and S-curve models to find the best fit for your data.

Since we have data from 1985-2006, let’s use trend analysis and a growth curve model to forecast ticket prices from 2007-2012:

Minitab Trend Analysis

The trend plot shows the original data, the fitted trend line, and forecasts. The trend model (in red) fits the current data (in black) well, revealing a general upward trend in ticket pricing. The Minitab graph and output below display the fitted trend equation, as well as three accuracy measures that help you judge the fit of the model (MAPE, MAD, and MSD).

Minitab Output

Take a look at how the real average prices for 2007-2012 stack up against the forecasted prices:

comparison

Pretty close! Even though we already knew the average prices for the years we forecasted, this example illustrates that trend analysis can be pretty darn good at providing accurate forecasts for time periods in the short-term future. Plus, trend analysis is the closest any of us will probably get to playing the role of fortune teller!
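If you want to experiment with the same idea outside Minitab, here is a rough sketch of fitting an exponential growth trend by regressing the log of price on time (Python; the price series is simulated just to make the example runnable, so swap in the actual 1985-2006 averages to mirror the analysis above):

```python
# Sketch only: exponential growth trend y_t = b0 * b1**t via log-linear least
# squares, plus the MAPE/MAD/MSD accuracy measures and six forecast periods.
import numpy as np

t = np.arange(1, 23)                                   # 1985-2006 -> periods 1..22
prices = 60 * 1.12 ** t * np.exp(np.random.default_rng(0).normal(0, 0.05, t.size))  # simulated data

slope, intercept = np.polyfit(t, np.log(prices), 1)    # ln(y) = ln(b0) + t*ln(b1)
b0, b1 = np.exp(intercept), np.exp(slope)
fits = b0 * b1 ** t

mape = np.mean(np.abs((prices - fits) / prices)) * 100
mad = np.mean(np.abs(prices - fits))
msd = np.mean((prices - fits) ** 2)

forecasts = b0 * b1 ** np.arange(23, 29)               # periods for 2007-2012
print(b1, mape, mad, msd, np.round(forecasts, 2))
```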

Many thanks to fellow Minitab blogger Eduardo Santiago for sharing his Minitab project with me, and then allowing me to share it with you! The data and the Minitab worksheet can be downloaded here.

How much would you pay for a ticket to the Super Bowl?

The Best Super Bowl Commercials of 2013, Plotted in Minitab


Various commercials valiantly vied for the attention and dollars of Super Bowl fans on Sunday. The game was decided objectively (or by the referees), but the drama of which commercials won lives on. Because we love data analysis, I gathered a little data to see which efforts really stood out. Then I plotted the data in Minitab to explore the results.

Three top-ten lists attracted my attention:

  • True Reach by Visible Measures: The number of views on any website
  • bluefin labs: The number of mentions on Twitter and public Facebook comments
  • TiVo Rank: The rank for the number of customers watching at “play” speed relative to the surrounding 15 minutes of programming

Here are those data in Minitab's individual value plots:


Top-10 lists for True Reach, bluefin, and TiVo

Toyota finishes first in terms of the number of people watching on the internet. While I’m sure many will credit the strong performance of the teaser video and the presence of actress Kaley Cuoco, I have a hypothesis that many sympathetic fathers who wish the old spare tire was gone are taking solace in knowing that someone understands.

Dodge comes in first in terms of people willing to share on Twitter and Facebook. Even in an event filled with astronauts, I guess some of us still want to be farmers.

TiVo didn’t give out their numbers, just the ranking, so the individual value plot is extraordinarily orderly. But Taco Bell should feel good about the number of folks who wanted to viva young at “play” speed.

Congratulations are due to the auto industry, which had one of its members finish number 1 in two categories. Congratulations are also due to Doritos for being the only brand with two commercials on a list and to GoDaddy.com for being the only brand to appear on each list.

Of course, these measures are about behavior. Other companies measure how people feel about the advertisements they watch. One such company is ACE metrix, which uses a sample of respondents to score commercials based on persuasion and watchability. USA Today runs the Ad Meter poll, which, for the first time in its 25-year history, was the result of online voting only. If you signed up, you could vote. The people who volunteered for USA Today and the people ACE metrix surveys tend to agree in general, but their favorites don't line up with the top commercials from the behavioral measures.

This scatterplot shows the USA Today Ad Meter results against the ACE score with the top finishers I congratulated from the behavioral data sets plus Budweiser as a new standout:

Scatterplot of USA Today Ad Meter Poll and ACE metrix score

 

Apparently, a sentimental story about a man and his horse can bring on a powerful thirst, get people to watch online, and talk about how it made them tear up on Facebook; but fewer people want to TiVo it. TiVo aside, top 10 finishes for two behavioral metrics, the highest ACE score, and the highest Ad Meter Poll score mean that Budweiser’s not just the king of beers this year. It’s the king of Super Bowl ads too.

By the way, did you like those graphs? To create them in Minitab, check out these tips: The Layout Tool can get you three panels across. Tactical Editing lets you edit and label individual points.

 

Flu Shot Followup: Assessing the Long-Term Benefits of Flu Vaccination


In my last post, I wrote about the 60% effectiveness rate for flu shots that news media commonly report. The effectiveness is actually a relative measure of the reduction in your flu risk if you're vaccinated. Relative measures are hard to interpret without additional information. With that in mind, I reanalyzed the data to put it in absolute terms. I found that if you get a flu shot, your average annual risk of getting the flu drops from 7.0% to 1.9%, a reduction of 5.1 percentage points.

I’ve received several requests to look at this over a longer timeframe. After all, flu shots aren’t a one-time thing. Last time, I concluded that if you regularly get the flu shot, you’ll probably spare yourself a week of misery at some point. Let’s use Minitab statistical software to quantify the long-term probabilities!

How We Will Model the Flu Outcomes Over 20 Years

Flu shots are like a retirement investment where you keep investing year after year, and it’s the cumulative effect that magnifies the differences in the annual percentages.

Financial planners often show you two scenarios based on two different courses of action. Return rates, like influenza rates, fluctuate over the years. Consequently, financial planners use a reasonable long-term average that gives you a ballpark idea of the different outcomes.

That's what I'll do in the graphs below. The graphs show different outcomes for influenza infections over the course of decades. The underlying assumptions are that the average infection rate is 7.0% for the unvaccinated and 1.9% for the vaccinated. We'll see what that 5.1-percentage-point difference translates into over decades.

I hope regular readers of my blog realize that these data are binary (you’re either infected with the flu or not) and that we can use the binomial and geometric distributions to model the outcomes.

Long-Term Comparisons: No Flu Shot versus Regular Flu Shots

The two scenarios that I will compare are never getting the flu shot and always getting the flu shot. We’ll answer two questions to illustrate the different outcomes:

  • How long until I can expect my first case of the flu?
  • How many times can I expect to get the flu?

For both graphs, the left panel is for the no-flu-shot scenario and the right panel is for the regular-flu-shot scenario.

How long until I first get the flu?

The first graph uses the geometric distribution to model the number of years until you first get the flu. Each bar represents the probability of getting the flu for the first time in that specific year. The probability for any given year is small, but I’ve shaded the graphs to indicate the number of years until your cumulative probability of getting the flu for the first time is 50/50. I’ve cut the graphs off at 65 years because studies show that flu shots become less effective above that age.

Graph showing how many years until your first case of the flu

If you never get flu shots, you have a 50/50 chance of getting your first case of the flu in 11 years. However, if you always get the flu shot, you have a 50/50 chance after 38 years. By way of comparison, if you don’t get vaccinated, you only have a 6.8% chance of going 38 years or longer without getting the flu.

You can also see how much more front-loaded your probabilities are without the vaccinations, while the right tail is much thicker in the vaccinated panel. You want higher probabilities in the later years, not the earlier years!
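For readers who want to recreate the shape of these curves, here is a minimal sketch using the geometric distribution in Python/scipy, with the rounded 7.0% and 1.9% annual rates; because of that rounding, the 50/50 crossover lands a year or two earlier than the 11 and 38 years quoted above:

```python
# Sketch only: years until the first case of the flu, modeled as a geometric
# distribution with the (rounded) annual infection rates from the post.
from scipy.stats import geom

for label, p in [("no flu shot", 0.070), ("annual flu shot", 0.019)]:
    # geom.cdf(n, p) = probability the first case occurs within n years
    median = next(n for n in range(1, 200) if geom.cdf(n, p) >= 0.5)
    print(f"{label}: ~50/50 chance of a first case within {median} years")
```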

How many times will I get the flu?

The second graph uses the binomial distribution to model the number of times you can expect to get the flu over 20 years. Each bar represents the probability of getting the flu a specific number of times over 20 years. I’ve shaded the graphs to indicate the cumulative probability of getting the flu two or more times over 20 years.

Graph showing the number of times getting the flu over 20 years

The big difference that jumps out is the large bar that represents zero cases of the flu if you’re vaccinated. If you’re vaccinated regularly, you have a 68% chance of not getting the flu within 20 years. However, if you’re never vaccinated, you only have a 23% chance of not getting the flu in the same time frame.

You can also see how much the distribution spreads out if you’re not vaccinated. If you’re vaccinated, you only have a 5.4 percent chance that you’ll get the flu two or more times over 20 years. On the other hand, if you never get vaccinated, you’ll have a 41.3% chance of getting the flu two or more times.
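These figures come straight from the binomial distribution; here is a minimal check in Python/scipy (not the original Minitab graphs), using the 7.0% and 1.9% annual rates over 20 years, which reproduces the percentages above to within rounding:

```python
# Sketch only: number of flu cases in 20 years as a binomial outcome.
from scipy.stats import binom

for label, p in [("never vaccinated", 0.070), ("vaccinated every year", 0.019)]:
    p_zero = binom.pmf(0, 20, p)             # no flu at all in 20 years
    p_two_plus = 1 - binom.cdf(1, 20, p)     # two or more cases in 20 years
    print(f"{label}: P(0 cases) = {p_zero:.1%}, P(2+ cases) = {p_two_plus:.1%}")
```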

Closing Thoughts on the Long-Term Effectiveness of Flu Shots

When you flip a coin 20 times, you’ll never know for sure how many times you’ll get the heads side up. However, using the binomial distribution, you can model the most likely outcomes. The same is true with the flu.

With the flu, there are some additional complications. There are plenty of other viruses circulating that can make you sick. The probabilities in this post are strictly related to you getting the specific flu viruses that the flu shots are designed to protect against—three per shot. Let’s assume that you got the flu shot every year for 20 years, and that you never got a flu that you were vaccinated against. Unfortunately, you may well get another strain of the flu that was not included in a shot, or a different type of virus altogether that causes “flu-like” symptoms.

Also, the graphs show that even if you never get a flu shot, you have a decent chance (about 50/50) of going 11 years without getting the flu. However, don't take this to mean that there are no benefits to vaccination. If you are regularly vaccinated, you can expect your first case to occur significantly later, and you can expect to get the flu fewer times.

As I said in my last post, I’ll continue to get the flu shot because it’s a simple way to prevent some misery!

They Call Them "Free" Throws For a Reason


When Penn State guard Jermaine Marshall stepped to the line to take two free throws with 0:27 remaining against Ohio State, it didn't really matter whether he made the shots. The game was already out of reach, and although the Nittany Lions would attempt to foul their way into a miracle victory, most of the fans were all too aware that Penn State was now 0-8.  That Marshall then missed both free throws was the exclamation point on a night where the team made just 13 of 22 free throw attempts.

Lest you not already know this, a "free throw" is a shot taken against no defense, a shot that likely had been practiced thousands of times by Jermaine and every other player on the court.

You can hardly blame a fan for wondering how a team could perform so poorly at such a shot, and for suspecting that if a team struggles at that shot, it’s a sure sign that they must struggle in general. At least that’s what I assume the emotions were when my colleague John T. (if you know a John T. that works at Minitab, that’s him.  We only have one.) sent the following email to me soon after the game:

I have a blog idea - is there a correlation between free throw percentage and winning percentage in college basketball?  Thought March might be a good time for a post like this.

I am constantly amazed by how bad Penn State is at free throws.  I think it is a basic skill that any basketball player at a collegiate level should have.

He adds “They call them free for a reason.”

Now, before answering John’s question, I do want to reiterate that this was sent after Penn State’s 13-of-22 night—their worst free-throw performance in almost two months. I’m just saying.

Back to the email.  The problem it raises has three layers, which I will address using Minitab Statistical Software:

  1. Is Penn State really bad at free throws?
  2. Is there a relationship between free throw percentage and winning percentage?
  3. If we account for a team’s strength of schedule (based on RPI), do we see a relationship between free throw percentage and winning percentage?

The first question posed is quite easy to answer.  Here is the distribution of free throw percentages at this point in the season, with Penn State shown in blue:

Individual Value Plot

Penn State is pretty much in the middle, with 68.9%.  Ohio State is not highlighted but is a little lower at 68.4%.  (Again, I’m just saying.)

Regression Analysis of Winning Percentage and Free Throw Percentage

The next question is a little more involved since we have to do regression, but it's nothing too scary.  I ran a Fitted Line Plot with winning percentage as the response and free throw percentage as the predictor, and included terms through cubic, all of which were statistically significant. Here is the plot, again with Penn State highlighted in blue:

Fitted Line Plot

The analysis shows that there is actually a statistically significant cubic relationship, though the R-Sq and R-Sq(adj) are quite low.  But the simplest answer to John’s question is that yes, there is a relationship.
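As a sketch of what such a cubic fit looks like outside Minitab (simulated data only, not the actual team statistics), you could do something like this in Python:

```python
# Illustrative sketch: a cubic fitted line, analogous to Minitab's Fitted Line
# Plot with linear, quadratic, and cubic terms. Data are simulated.
import numpy as np

rng = np.random.default_rng(3)
ft = rng.uniform(0.60, 0.80, 300)                              # free throw pct
win = 0.5 + 150 * (ft - 0.70) ** 3 + rng.normal(0, 0.15, 300)  # noisy cubic trend

coefs = np.polyfit(ft, win, 3)          # cubic least-squares fit
fits = np.polyval(coefs, ft)
r_sq = 1 - np.sum((win - fits) ** 2) / np.sum((win - np.mean(win)) ** 2)
print(coefs, round(r_sq, 3))            # expect a low R-sq, as in the post
```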

Sequential Sum of Squares

The third question I posed is important to consider because strength of schedule is a very important factor in determining winning percentage—and because it gives me an excuse to demonstrate functionality in Minitab that many don't give much thought to: sequential sums of squares.  If we just put strength of schedule (SOS) and free throw percentage into Regression with the default adjusted sums of squares, neither term gets priority and the order in which we enter terms is not particularly important.  With sequential sums of squares, however, the first term listed is added to the model and its sum of squares is calculated as usual; when another term is added, it is credited only with the additional sum of squares it explains.

Let me explain a little more. Suppose you have two predictors that have some correlation—X1 and X2.  You create a model with only X1, which is significant and results in an R-Sq of 50%.  When you add X2, both predictors are significant and the R-Sq increases to 55%.  Using adjusted sum of squares (the default), both terms may have similar sum of squares values and be significant, because the variation in the response can be attributed to either factor and the amount attributed will be based strictly on fit. 

But using sequential sum of squares, X2 will have a much smaller value and may not even be significant because it only accounted for a fraction of additional sum of squares attributed to the model relative to what was already there (hence only a small increase in R-Sq). So you can interpret the significance of X2 as whether it has a relationship to the response after X1 has been accounted for.
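To see the difference in action outside Minitab, here is a small simulated sketch (Python with statsmodels, which exposes sequential sums of squares as "Type I"; with no interactions in the model, "Type II" plays the role of adjusted sums of squares):

```python
# Illustrative sketch (simulated data): with two correlated predictors, the
# second term's sequential (Type I) sum of squares only gets credit for what it
# adds after the first, while adjusted sums of squares judge each term given
# the other.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(7)
x1 = rng.normal(size=200)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=200)   # correlated with x1
y = 2 + 1.5 * x1 + 0.5 * x2 + rng.normal(size=200)
df = pd.DataFrame({"y": y, "x1": x1, "x2": x2})

fit = smf.ols("y ~ x1 + x2", data=df).fit()
print(anova_lm(fit, typ=1))   # sequential: order of terms matters
print(anova_lm(fit, typ=2))   # adjusted: order does not matter
```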

Returning to basketball, I first enter SOS and SOS*SOS in the model, followed by free throw percentage (I included squares and cubic terms at first but removed them as they were not significant).  And here is the result:

Output

The simple answer to the third question is that even accounting for strength of schedule, free throw percentage is related to winning percentage, and in the direction one might guess: higher free throw percentages correspond to higher winning percentages.

Fits vs. Residuals

So, if free throw percentage is related to winning percentage and Penn State is actually fairly average in that regard, why is John so frustrated?  To answer that, we simply need to look at the fits versus residuals:

Residuals versus Fits

Based on strength of schedule and free throw percentage, we would expect Penn State to have about a 63% winning percentage...but instead they stand at only 40%. 

I’m just saying.

A Story-based Approach to Learning Statistics (and Statistical Software)


Want to learn more about analyzing data? Try taking a page from Aesop's book. 

Well...really, I'm suggesting taking multiple pages from Minitab's book, but my suggestion stems from an idea that Aesop epitomizes.  

Aesop was no fool. When he wanted to convey even the heaviest of lessons, he didn't waste time detailing the intellectual and philosophical arguments behind them. He didn't argue, cajole, or berate. He didn't lecture or pontificate. 

He told a story. 

Minitab uses the same approach in Meet Minitab, the introductory guide to data analysis and quality statistics using our statistical software. From beginning to end, Meet Minitab follows the story of a workflow quality improvement project that takes place at a (fictional) online bookstore.  

I went through the exercises in the book again recently, and was reminded of how well it illustrates the real-world applicability of data analysis. 

The Hands-on Story of a Quality Improvement Project
If you're going to be responsible for analyzing data as part of a Six Sigma project, working through Meet Minitab is a good way to get hands-on experience with the types of data you're likely to be working with.
 
In the guide, the online book retail company you work for has three regional shipping centers that distribute orders to consumers. Each shipping center uses a different computer system to enter and process order information. To integrate all orders and use the most efficient method company wide, the company wants to use the same computer system at all three shipping centers.
 
As you work through the examples in the book, you analyze data from the shipping centers as you learn to use Minitab. (All of the data sets are included with Minitab, so it's easy to follow along.)  
 
You create graphs and conduct statistical analyses to determine which computer system is the most efficient and results in the shortest delivery time. 
 
After you identify the most efficient computer system, you focus on the data from this center. First, you create control charts to see whether the center’s shipping process is in control. Then, you conduct a capability analysis to see whether the process is operating within specification limits. Finally, you conduct a designed experiment to further improve the shipping center’s processes.
 
Additionally, you learn about session commands, generating a report, preparing a worksheet, and customizing Minitab.
 
By the way, if you're more comfortable learning in a language other than English, Meet Minitab is also available in Arabic, Simplified Chinese, Dutch, French, German, Japanese, Korean, Polish, Portuguese, Russian, and Spanish. 
A Relatable Way to Learn to Analyze Data 
I always had trouble relating to the examples of dice and playing cards used so often in statistics classes. I don't gamble, so I was constantly wondering where I'd ever apply what I was learning. So when I did the exercises that make up the story of the quality project in this guide, I found the emphasis on this real-world situation refreshing and engaging. 
 
I also really appreciated the chance to follow the project through from start to finish; unlike textbooks where each type of problem is a discrete, self-contained entity, the tasks and analyses in Meet Minitab build on and logically follow one another, so you can see why you should do certain tasks in a certain order.  
Xbar-S Chart
Of course, Meet Minitab is still a software user's guide. It's not Aesop's Fables. So I can't guarantee that, after going through it, you'll recall X-bar S charts as readily as you do the story of The Fox and the Grapes. Or that you'll understand Design of Experiments with the clarity Aesop brings to the moral of The Tortoise and the Hare.
 
But after working through the book again myself, I'm confident you'll get a solid understanding of the most frequently used functions in Minitab Statistical Software, and how they can help with real-world quality improvement projects.  
 

Cost of Quality: Well, hello, my old, complex friend….


As I sat down to examine Cost of Quality (COQ) at Minitab, I flashed back to my CQE exam almost 20 years ago. I can still vividly remember staring down at a particularly difficult Cost of Quality question and wondering why I didn’t just follow my 4th-grade career assessment and become a novelist.  

Mental note: a study of the correlation between 4th grade career assessments and actual career paths would make for an interesting blog post.

I briefly considered fleeing the building, buying a bottle of absinthe and channeling Hemingway.  But then I remembered that, even with absinthe, I was no Hemingway and was probably ill-equipped to produce the next great novel.  So, alas, I stayed, answered the COQ questions, and here I am 20 years later: assessing the cost of software quality.

Okay, the absinthe part isn't true. Everyone knows that the Alcohol and Tobacco Tax and Trade Bureau didn't lift the absinthe ban until 2007.

Though it is a bit complex to calculate, Cost of Quality is a very worthwhile evaluation. The study of COQ, and process improvements surrounding it, can help to reduce costs while improving outgoing quality levels.

Components of Cost of Quality (COQ)

The Cost of Quality has two main components: the cost of good quality (the cost of conformance) and the cost of poor quality (the cost of non-conformance).

The cost of poor quality consists of those expenses surrounding the failure to meet customer requirements. These include internal failure costs such as re-designing, reworking due to design changes or defects (also called “bugs” in the software world), and the costs of re-testing.  It also includes external failure costs like lost sales, technical support calls, and processing customer complaints.

The costs of good quality are the costs associated with preventing and appraising the quality of the product. Appraisal costs include such things as testing throughout development and product reviews. Prevention costs include things like quality improvement activities, education, and failure prevention analysis (example: FMEA analysis).

Impacts on Cost of Quality

Studies have shown that more than half of software defects are introduced in the design and requirements phase (versus the coding phase). Most people in the software business have probably seen a variation of the "Defect Cost by Development Phase" graph outlining the relative costs to fix software defects. If you haven't, here's the bottom line: the cost to fix an error gets exponentially worse as time goes on. It is estimated to be 100 times more expensive to fix an error in the maintenance phase than in the design phase. This study holds true at Minitab, as well.  The earlier we detect defects, the better. Nobody likes redesigns. It’s that simple.

In addition, a study has shown that a typical company spends 5-10% of quality costs on prevention, 20-25% on appraisal, and the remaining 65-75% on internal and external failure costs. 

Managing the Cost of Quality

One view of COQ suggests that we should shift our costs from rework/failure to appraisal and prevention activities. While it’s clearly better to catch stuff the first time around, I’m not necessarily a fan of the “more is better” philosophy.  Whether it is more code reviews, more design reviews, etc., in my opinion, it’s not as simple as cost shifting.

That’s why we quality engineers get paid the big bucks, right?

So, when Minitab assesses cost of quality, it’s part of a broader continuous improvement effort. We definitely want to shift our focus to earlier in the process, but  not necessarily by adding more inspection to the front end.

Our goal is to prevent issues and, if they occur, to detect them as soon as possible. We use various techniques to do that, including:

  • Get our customers involved. Early. Earlier than early.  
  • Plan. Plan. Plan. We are thoughtful about how we plan feature development throughout a release cycle. If we are trying something that might have an impact across the application, we want it done as early as possible. The feature will then benefit from the development and testing of every other feature implemented after that.
  • Test Early. Test Often. We focus on validation as early as design. Ensuring that requirements are met is done initially and continually.
  • Ownership. We all own the quality of the product at Minitab. Don’t deliver unclear or incomplete designs. Don’t deliver buggy code. Don’t deliver buggy software.

In the end, maybe calculating COQ isn't as glamorous as writing the next great novel. I doubt my quotes are going to appear on Pinterest.  (I’ll keep looking, but I doubt it.)  I'm just an engineer who is passionate about customer focus, quality and efficiency, but that's glamorous enough for me!

Now, where's that absinthe?

Making Statistics Sweet on Valentine's Day


Planning on giving a bag of M&M's to your sweetie this Valentine's Day? Well, you can woo your Valentine with not only the gift of candy, but also the statistics behind those candy-coated chocolate pieces.

Are there equal amounts of each color in a bag?

You can record your counts of each color in the bag in a Minitab worksheet, and then use a pie chart (Graph > Pie Chart) to visualize the counts:

Minitab Worksheet       

Minitab Pie Chart

There were 138 blue M&M’s and only 63 red M&M’s in our sample. But is the difference between these counts statistically significant? A Chi-square test can tell us:

Minitab Chi-Square Test

The p-value of 0.000 suggests that the observed counts are significantly different from what we would expect to see if there were an equal number of red, orange, yellow, green, blue, and brown M&M's. To perform your own chi-square test in Minitab, use Stat > Tables > Chi-Square Goodness-of-Fit Test (One Variable).
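If you'd like to run the same kind of test outside Minitab, here is a sketch using scipy. Only the blue (138) and red (63) counts appear in the post, so the other four counts below are made-up placeholders, and the output will not match the Minitab results exactly:

```python
# Sketch only: chi-square goodness-of-fit test against equal color proportions.
from scipy.stats import chisquare

#           blue  red  (the remaining four counts are hypothetical)
observed = [138,  63,  105, 98, 110, 108]
stat, p = chisquare(observed)     # default expectation: equal counts per color
print(stat, p)
```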

This example was done with the typical multi-color bag of M&M’s, but it could certainly be done with your bag of festive white, pink, and red M&M’s for V-Day!

Do enough M&M’s have the “m”?

M&M's are easily identified by the signature "m" printed on each piece of candy. It must pose quite a challenge to stamp the familiar symbol on a surface as uneven as a peanut M&M, so it's not surprising to see that sometimes the "m" is not perfectly printed.

Suppose there is a requirement that no more than 15% of M&M’s have a misprinted “m.” If we count the total number of M&M’s and the number with misprints, we can conduct a 1 proportion test (Stat > Basic Statistics > 1 Proportion):

Minitab 1-Proportion Test

Of the 622 M&M's we evaluated (what can I say – we really like M&M's!), 87 had misprints. Using a 1 proportion test and an alternative hypothesis of greater than 15%, we get a p-value of 0.776. Because the p-value is greater than an α of 0.05, we cannot conclude that the proportion of misprinted M&M's exceeds 15%.
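As a cross-check outside Minitab, an exact binomial test in Python/scipy should closely reproduce this result (a sketch, not the original analysis):

```python
# Sketch only: one-sided exact test of H1: proportion of misprints > 0.15,
# based on 87 misprints out of 622 candies.
from scipy.stats import binomtest

result = binomtest(87, n=622, p=0.15, alternative="greater")
print(result.pvalue)   # roughly 0.78, in line with the 0.776 reported above
```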

What else can you find out about your bag of M&Ms?

There are many other analyses you could perform on your M&M’s! Let us know in the comments what else you find out about your candy.

For more, check out this special Valentine’s trick to impress your favorite quality engineer: http://blog.minitab.com/blog/real-world-quality-improvement/valentines-day-statistics and this teaching resource that explores more ways to learn statistics with M&M’s.

Performing DOE for Defect Reduction


Lean Six Sigma and process excellence leaders are often asked to "remove defects" from products and processes. This can be quite a challenge! Lou Johnson, senior Minitab technical trainer and mentor, has some tips that might help if you're faced with this situation. I had the chance to talk with Lou, and here's what he shared with me about how to first approach a DOE.

How to Approach a DOE

Before jumping into a Design of Experiment (DOE) for defect reduction, Lou suggests stepping back and thinking first about what issue is likely causing the problem. If you need help thinking about what might be causing the defects, here is a list of common problems:

  • High common cause variation
  • A noise variable
  • Lack of basic process understanding
  • A one-time process change
  • Out-of-control process conditions
  • A single off-target process step

After pinpointing the likely cause, it will be easier to design your experiment accordingly. But what if you are already immersed in your DOE? For tips from Lou to help you along the way, check out these past posts that outline how to keep your DOE from turning D.O.A.:

Four Tips for Making Sure Your DOE isn’t D.O.A.

Four More Tips for Making Sure Your DOE isn’t D.O.A.

Upcoming Sessions at the Lean and Six Sigma World Conference

If you’re headed to the upcoming Lean and Six Sigma World Conference in San Diego, be sure to catch Lou’s session, “Design of Experiment for Defect Reduction” on February 22 at 8:45 a.m. He’ll talk more about his approach for DOE and illustrate it with several real-world case studies.

Fellow Minitab Blogger Joel Smith will highlight Measurement Systems Analysis (MSA) with examples from Olympic judging in his session at the same conference, which will take place on February 22 at 8:10 a.m. Check the conference program for room assignments: http://www.leanandsixsigma.org/

Of possible interest:

Olympic Judging: Fair or Biased?

Is This the Craziest College Basketball Season Ever?


The last few weeks have been pretty crazy in college basketball. In the first 13 days of February, nine different teams ranked in the Top 10 have lost. And had Duke not squeaked by Boston College last Sunday, it would have been the first time since 1992 that every team ranked in the AP Top 5 had lost in a single week. 

All of this has led to analysts saying that the parity in college basketball is greater than it’s ever been. And while it might seem that way, it’s always best to perform a data analysis to confirm whether your claims are true. Have there really been more Top 10 upsets this year than in previous years? Let’s turn to the data to answer our question!

First, I’ll use Minitab to tally the number of Top 10 upsets that have occurred each year over the last 7 seasons. I used the same time period for each year, Feb. 1 to Feb. 13. There were a few occurrences where two teams ranked in the Top 10 played each other. In these cases, I only counted it as an upset if the team that lost was ranked higher and was playing at home. You can download a Minitab worksheet with all the data in it here.

Upsets per year

We see that having 9 upsets in the first two weeks of February isn't all that crazy. Sure, it's the most since 2007, but every other year has only 2 or 3 fewer upsets. So this year really isn't so different from previous seasons. Also, keep in mind the time period is biased toward this season, since I specifically picked it because there have been a high number of upsets.

But let’s not stop there! All of the upsets this year came to unranked teams. Can that be said of any other year?

Losses to ranked and unranked teams

The statistics start looking a little crazier here, as no other year has more Top 10 teams losing to unranked teams than 2013. However, in each year at least half of the upsets came against unranked teams. What's more, we're missing some information here. Specifically, did the team that got upset play on the road, or at home?

Losses at home

We see that 8 of the 9 upsets that have happened this month have been when the Top 10 team was playing on the road. Compare that to 2012, when a Top 10 team was upset at home 6 times in the beginning of February!  Because of home court advantage, I’d argue that six Top 10 teams losing at home in a two-week span is crazier than eight Top 10 teams losing on the road. When you add in the fact that every year since 2007 has had more Top 10 teams lose at home than this year, 2013 really doesn’t appear to be crazier.

Let's look at one more thing. This year, 6 of the 9 upsets have involved teams ranked in the Top 5 (not just the Top 10). So let's look at the average rank of the teams that got upset each year:

Average rank

This season, the upsets have involved higher-ranked teams than in previous years. But the difference isn’t very big. And we’ve already shown that most of those upsets have found the Top 10 team playing on the road, which lessens the craziness.

So what is all the fuss about then? I think the reason this year has gotten so much attention is the way the upsets have happened. Layups at the buzzer, half court shots to send the game to overtime, and the 5th-ranked team getting beat by arguably the worst team in a BCS conference. It’s true that upsets happen every year, but they don’t always have the dramatics that we’ve seen recently.

So yes, there have been a lot of upsets recently, but don’t let that fool you into thinking this season is any different from the recent past. Because when you look at the data, you’ll realize it’s like this every basketball season!

Violations of the Assumptions for Linear Regression: Closing Arguments and Verdict


 Lionel Loosefit has been hauled to court for violating the assumptions of regression analysis. On the last day of the trial, the prosecution and defense present their closing arguments. And the fate of Mr. Loosefit is decided by judge and jury...

The Prosecution's Summary

Prosecutor: Ladies and gentlemen, we’ve presented a slew of evidence in this trial. You’ve seen, with your own eyes, every possible heinous violation of the assumptions for regression in the defendant’s model. Here’s what we’ve shown, in a nutshell:

nutshell

Prosecutor: We’ve carefully delineated each violation with specific graphic evidence on Days 1, 2, and 3 of the trial. The evidence is so overwhelming, you might have trouble keeping it straight. Luckily, there’s a quick, easy way to review the cumulative evidence and reach a verdict. When you run regression analysis in Minitab's statistical software, simply click Graphs and select Four in One.

Had the defendant taken this simple precaution, he’d have gotten this:

fourpack

 

Prosecutor: Had he done so, Mr. Loosefit might have then tried to transform his data to remedy some of these problems. In fact, the General Regression command in Minitab comes with a built-in power transformation expressly for that purpose:

 box cox
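(A quick aside, outside the court record: the same kind of power transformation can be sketched in Python with scipy, which estimates the Box-Cox lambda by maximum likelihood.)

```python
# Illustrative sketch: Box-Cox power transformation of a skewed, positive response.
import numpy as np
from scipy.stats import boxcox

rng = np.random.default_rng(0)
y = rng.lognormal(mean=1.0, sigma=0.6, size=200)   # skewed response variable

y_transformed, lam = boxcox(y)    # lam is the estimated optimal lambda
print(round(lam, 2))              # near 0 here, i.e., close to a log transform
```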

Prosecutor: Of course, the defendant didn’t bother to try that, either. And he had many other opportunities to amend the error of his ways. For example, using Regression > Fitted Line Plot, he could have changed his model from linear to quadratic to better account for the curvature in his data.

linear vs quadratic

Prosecutor: What's more, he could have right-clicked a graph and chosen Brush to identify and investigate outliers in his data.

Brushing

Prosecutor: Of course, Mr. Loosefit didn’t bother to do those things either. Was he unsure how to interpret the plots? Does he suffer from post-traumatic statistics disorder (PTSD)? The defense might have you believe that. But it won’t hold water. Because Mr. Loosefit had only one X variable and one Y variable. So he could have easily run his regression analysis in the Minitab Assistant (Assistant > Regression) and obtained a clear, user-friendly diagnostic report showing the problems in his model:        

Assistant report

Prosecutor: But he didn't do that either. In short, Minitab gave Lionel Loosefit all the chances in the world. Why did he not avail himself of any of these opportunities to take the high road? To follow the basic tenets of statistical decency?  

[Spectators shake heads sadly.]

Prosecutor: I’ll tell you why, ladies and gentlemen. Because Lionel Loosefit has absolutely no residual shame!!!!

[Courtroom erupts]

Judge: Order! Order in the court!

Prosecutor: And for that reason alone, you must find him guilty on all counts!

The Defense's Summary

Brain salts

Defense: The prosecution makes it sound so easy, doesn't it? Just choose Graphs > Four in one, they say. Just transform the data, they say. But we all know real life doesn't always work out quite so neatly. And as he grappled with the complex statistical requirements for a regression analysis, not realizing all the options that existed in Minitab to help him, working under deadlines, Lionel Loosefit did what any one of us would do. He reached for help in a bottle…

[Spectators shake heads sadly]

Defense: Effervescent brain salt. Lured by the promise of an instant cure for his statistics-induced brain troubles, Mr. Loosefit chugged the entire bottle. Unfortunately, he was completely unaware of the possible side-effects. Allow me to read those, starting on page 234, line 3487 [puts bottle insert under an electron microscope]:

"In susceptible individuals, effervescent brain salt may cause tennis elbow, pink eye, tapeworm, dry heaves, church laughter…

[2 hours later]

hammertoes, bubonic plague, sudden bouts of belly-dancing, foot-in-mouth disease, triple vision, and short-term paralysis of the index finger."

Did you catch that last one? Short-term paralysis of the index finger, ladies and gentlemen. Making the defendant temporarily unable to click Graphs > Four in one, or any of the other options for evaluating regression assumptions in Minitab.  

Spectator [whispers]: Is that all they got for a defense? Salt on the brain?

Defense: No! We’ve got an even more compelling defense, for all you whispering spectators out there. Remember, as the prosecution itself demonstrated on Day 3, the date and time that Mr. Loosefit performed his regression analysis was duly recorded in the Minitab Session window:

Session window final

Think back, everyone. Almost exactly one year ago to the day: February 19, 2012. The Knicks vs. the Mavericks. Madison Square Garden. Tip-off was at 1:00 PM ET.

Prosecutor: Objection, your honor. This is highly irrelevant.

Judge: Sustained. Counsel, get to the point.

Defense: Your Honor, like many of us, Mr. Loosefit has two monitors at his work station. On one monitor he was performing a regression analysis using Minitab. On the other monitor, he was watching Jeremy Lin score 28 points and tally a career high of 14 assists and 5 steals.

Prosecutor: Surely you’re not suggesting…

Defense: Absolutely! The defendant is not guilty by reason of temporary Linsanity! Caught up in the delirium of the game, he forgot to display residual plots. It could have happened to any of us. Thankfully, the sudden bout of Linsanity ended as quickly as it started. Today, the defendant does not present the slightest [yawn] danger to statistical…[yawn] socie…..zzzzzzz….[snores]

Judge: It appears that the defense rests.

The Verdict

Judge [to jury]: Have you reached a verdict?

Jury: We have, your honor. We find the defendant guilty on all charges.

Judge: Mr. Loosefit, I’m not 95% confident that we can accurately predict your future responses. That’s why I want to make sure that you don’t have even one degree of freedom to estimate a model, ever again.

[Courtroom applause]

Judge: I hereby sentence you to 30 years of hard labor, calculating all the statistics for each regression analysis yourself…without using Minitab. You'll be calculating coefficients and plotting residuals for the remainder of your days, Loosefit. And to make doubly sure you learn your lesson, I'm denying you access to the calculators in Calc > Calculator and even Tools > Calculator. All your calculations must be done by hand, including long division!

Generation X spectator: “Long division”? What’s that?! 

Generation Y spectator: Sounds like some kind of sadistic medieval punishment.

Generation Z spectator: Didn’t they outlaw that at the Geneva Convention?

Judge: But I'm not without mercy. I won't make you search for every formula in a statistics textbook. You'll be allowed to use Help > Methods and Formulas > Statistics > Regression. You'll  find all the formulas you need there.  

Loosefit: No! NO!! Not long division! Okay, I confess. But it was my p-values that made me do it. I wanted to feel statistically significant!

Judge: Statistical significance doesn't mean jack diddly if your model assumptions aren't met. Bailiff, take Mr. Loosefit away.

Update: Since his sentencing, Lionel Loosefit has undergone a power transformation in prison. He now works for the benefit of the public good, making sure that others remember to use Minitab to check their assumptions whenever they perform a regression analysis.

 

3 Common (and Dangerous!) Statistical Misconceptions


Have you ever been a victim of a statistical misconception that's affected how you've interpreted your analysis? Like any field of study, statistics has some common misconceptions that can trip up even experienced statisticians. Here are a few common misconceptions to watch out for as you complete your analyses and interpret the results.

Mistake #1: Misinterpreting Overlapping Confidence Intervals

When comparing multiple means, statistical practitioners are sometimes advised to compare the results from confidence intervals and determine whether the intervals overlap. When 95% confidence intervals for the means of two independent populations don't overlap, there will indeed be a statistically significant difference between the means (at the 0.05 level of significance). However, the opposite is not necessarily true: CIs may overlap, yet there may still be a statistically significant difference between the means.

Take this example:

CI Plot
 
Two means can be significantly different at the 0.05 level even when their 95% confidence intervals overlap.

What's the significance of the t-test P-value? The P-value in this case is less than 0.05 (0.049 < 0.05), telling us that there is a statistically significant difference between the means, even though the CIs overlap considerably.
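To see this in action outside Minitab, here is a minimal Python sketch (my own illustration, not from the post) using scipy and made-up data. With settings like these you will often see two overlapping 95% intervals alongside a 2-sample t-test p-value below 0.05:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Two made-up samples whose true means differ by a modest amount
a = rng.normal(loc=10.0, scale=2.0, size=30)
b = rng.normal(loc=11.2, scale=2.0, size=30)

def ci_95(x):
    # 95% confidence interval for the mean, based on the t distribution
    half = stats.sem(x) * stats.t.ppf(0.975, df=len(x) - 1)
    return x.mean() - half, x.mean() + half

print("95% CI for A:", ci_95(a))
print("95% CI for B:", ci_95(b))
print("2-sample t-test p-value:", stats.ttest_ind(a, b).pvalue)

# The standard error of the difference is smaller than the sum of the two
# half-widths, so the intervals can overlap even when p < 0.05.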

Mistake #2: Making Incorrect Inferences about the Population

With statistics, we can analyze a small sample to make inferences about the entire population. But there are a few situations where you should avoid making inferences about a population that the sample does not represent:

  • In capability analysis, data from a single day is sometimes inappropriately used to estimate the capability of the entire manufacturing process.
  • In acceptance sampling, units are sometimes sampled from only one section of the lot, yet conclusions are drawn about the entire lot.
  • A common and severe case occurs in a reliability analysis when only the units that failed are included in an analysis and the population is all units produced.

To avoid these situations, define the population before sampling and take a sample that truly represents the population.

Mistake #3: Assuming Correlation = Causation

It’s sometimes overused, but “correlation does not imply causation” is a good reminder when you’re dealing with statistics. Correlation between two variables does not mean that one variable causes a change in the other, especially if correlation statistics are the only statistics you are using in your data analysis.

For example, data analysis has shown a strong positive correlation between shirt size and shoe size. As shirt size goes up, so does shoe size. Does this mean that wearing big shirts causes you to wear bigger shoes? Of course not! There could be other “hidden” factors at work here, such as height. (Tall people tend to wear bigger clothes and shoes.)
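As a quick illustration (not part of the original example), here is a hedged Python simulation with made-up numbers showing how a lurking variable like height can manufacture a correlation between two measurements that have no causal link to each other:

import numpy as np

rng = np.random.default_rng(7)
n = 500

# Hypothetical lurking variable: height in inches
height = rng.normal(68, 4, n)

# Shirt size and shoe size each depend on height plus independent noise;
# neither one causes the other.
shirt_size = 0.30 * height + rng.normal(0, 1.0, n)
shoe_size = 0.25 * height + rng.normal(0, 1.0, n)

print("corr(shirt, shoe):", np.corrcoef(shirt_size, shoe_size)[0, 1])

# Removing the effect of height (for example, by correlating the residuals
# after regressing each variable on height) makes most of that association
# disappear, even though the raw correlation looks impressive.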

Take a look at this scatterplot that shows that HIV antibody false negative rates are correlated with patient age:

Scatterplot
 
Does this show that the HIV antibody test does not work as well on older patients? Well, maybe …

But you can’t just stop there and assume that just because patients are older, age is the factor that is causing them to receive a false negative test result (a false negative is when a patient tests negative on the test, but is confirmed to have the disease).

Dig a little deeper! Here you see that patient age and days elapsed between at-risk exposure and test are correlated:

Scatterplot
 
Older patients got tested faster … before the HIV antibodies were able to fully develop and show a positive test result.

Keep the idea that “correlation does not imply causation” in your mind when reading some of the many studies publicized in the media. Intentionally or not, the media frequently imply that a study has revealed some cause-and-effect relationship, even when the study's authors detail precisely the limitations of their research.

Attending the ASA Conference on Statistical Practice?

Interested in learning about other common statistical misconceptions? If you are attending the upcoming American Statistical Association Conference on Statistical Practice, you are in luck! Jim Colton, one of Minitab’s technical training specialists, will be presenting How to Explain Common Statistical Misconceptions to Non-Statisticians on February 22, 2013 starting at 10:45 a.m. in room Napoleon A I-3 at the Sheraton New Orleans Hotel.

Also, be sure to stop by the conference expo, February 22-23, and visit Minitab booth #16. We hope to see you there!

What are other statistical misconceptions that have tripped you up?

Where to find meteorites, the Pareto chart way


The smoke shows the Chelyabinsk meteorite's path.

It's an amazing thing when a mass of rock and iron streaks through space and enters Earth's atmosphere. So naturally, the Chelyabinsk meteor has attracted a great deal of attention. We're fascinated by the images and captivated by the stories. And, if you're interested in statistical analysis, you start to wonder a little bit about meteorites.

The nice thing is that the Meteoritical Society has a large database with information about meteorites recovered on Earth. The database has over 50,000 records.

It’s particularly neat to see where people find meteorites with recoverable masses. A Pareto chart in Minitab is useful for this statistical analysis because it lets you show the most popular categories and combines the remaining ones into a single category. That way, the chart stays legible without your needing to do any special work in the data. There are, after all, 148 recovery countries and regions in the database—too many categories to show legibly on a bar chart. For example, here’s what a bar chart looks like with all 148 countries and regions:

The bar chart with 148 categories is illegible.

If you stare at it long enough, you can just about make out that France is in the lower half of the list. Just about everything else is guessing.

Here’s the same data on a Pareto chart that combines the bottom 5% of the data into one category:

The pareto chart shows where 95% of meteorites are found.

Now it's easy to see that Antarctica dwarfs all the other locations when we're looking at the number of meteorites recovered. Apparently, it's not so much that more meteorites land in Antarctica as that they're much easier to find there.
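If you wanted to mimic that "combine the small categories" step outside Minitab, a rough pandas sketch might look like the one below. The counts are made up for illustration and are not the Meteoritical Society's figures:

import pandas as pd

# Made-up recovery counts by location; not the real database numbers
counts = pd.Series(
    {"Antarctica": 40000, "Oman": 3000, "Libya": 1500, "USA": 1400,
     "Algeria": 900, "Australia": 700, "Chile": 300, "France": 70},
    name="meteorites",
).sort_values(ascending=False)

cumulative_share = counts.cumsum() / counts.sum()
keep = cumulative_share <= 0.95          # categories covering the first 95%

pareto = counts[keep].copy()
pareto["Other"] = counts[~keep].sum()    # lump the remaining locations together

print(pareto)
print((pareto / pareto.sum()).round(3))  # proportions, Pareto-style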

We can also look at the most popular locations by mass of meteorite fragments in the database. For this Pareto chart, I combined the categories after I reached 50% of the recovered mass:

The pareto chart shows where 50% of meteorites are found by mass.

Only 4 countries account for half of the recovered mass of meteorites in the database. The differences in the charts are interesting because they point out where huge meteorites have struck. For example, Namibia has 19 meteorites in the database. That count ties Namibia with Kazakhstan and Mauritania for 37th place in terms of number. But Namibia is the second highest in terms of mass because it has a single meteorite in the database that’s incredibly massive—60 metric tons. Greenland is similar, with fragments from a single meteorite that sum up to a mass of 58.2 metric tons. All of the countries on this graph have a meteorite with a mass of more than 24 metric tons.

The journal Nature estimates that the Chelyabinsk meteorite is the largest object to hit Earth since 1908, when another meteorite struck Russia. In the database, that Tunguska meteorite has 13.4 recovered grams. That’s good enough to tie for number 32,841 in terms of mass. We don’t yet know how much of the Chelyabinsk meteorite we’ll find, but if the amount is similar to the Tunguska meteorite, we can’t expect a rush of genuine meteorite souvenirs popping up on eBay.

Want more on the Pareto chart? Read on to learn how you can use a Pareto chart to focus a project on process improvement.

The photo of the meteor trace is by Alex Alishevskikh and is licensed for reuse under this Creative Commons License.

Why Statistics Is Important


Normal distribution plot

"There are three kinds of lies: lies, damned lies, and statistics."

I’m sure you’ve heard this most vile expression, which was popularized by Mark Twain among others. This dastardly phrase impugns the reputation of statistics. The implication is that statistics can bolster a weak argument, or that statistics can be used to prove anything.

I’ve had enough of this expression, and here’s the rebuttal! In fact, I’ll make the case that statistics is not the problem, but the solution!

Mistakes Can Happen

First, let’s stipulate that an unscrupulous person can intentionally manipulate the results to favor unwarranted conclusions. Further, honest analysts can make honest mistakes because statistics can be tricky.

However, that does not mean that the field of statistics is to blame!

An analogy is in order here. If a surgeon does not follow best practices, intentionally or not, we don’t blame the entire field of medicine. In fact, when a mistake happens, we call on medical experts to understand what went wrong and to fix it. The same should be true with statistics. If an analyst presents unreliable conclusions, there is no one better qualified than a statistician to identify the problem and fix it!

So, what is the field of statistics, and why is it so important?

The Field of Statistics

The field of statistics is the science of learning from data. Statisticians offer essential insight in determining which data and conclusions are trustworthy. Statisticians know how to solve scientific mysteries and how to avoid traps that can trip up investigators.

When statistical principles are correctly applied, statistical analyses tend to produce accurate results. What’s more, the analyses even account for real-world uncertainty in order to calculate the probability of being incorrect.

To produce conclusions that you can trust, statisticians must ensure that all stages of a study are correct. Statisticians know how to:

  • Design studies that can answer the question at hand
  • Collect trustworthy data
  • Analyze data appropriately and check assumptions
  • Draw reliable conclusions

The Many Ways to Produce Misleading Conclusions

Statisticians should be a study's guide through a minefield of potential pitfalls, any of which could produce misleading conclusions. The list below is but a small sample of these pitfalls.

Biased samples: A non-random sample can bias the results from the beginning. For example, if a study uses volunteers, the volunteers collectively may be different than non-volunteers in a way that affects the results.

Overgeneralization: The results from one population may not apply to another population. A study that involves one gender or age group may not apply to other groups. Statistical inferences are always limited, and you need to understand the limitations.

Causality: How do you know when X causes a change in Y? Statisticians require tight criteria in order to assume causality. However, people in general accept causality more easily. If A precedes B, and A is correlated with B, and you know several people who say that A affects B, most people would assume, incorrectly, that data show a causal connection. Statisticians know better!

Incorrect analysis choices: Is the model too simple or too complex? Does it adequately capture any curvature that is present? Are the predictors confounded or overly correlated? Do you need to transform your data? Are you analyzing the mean when the median may be a better measure? There are many ways you can perform analyses, but not all of them are correct.

Violation of the assumptions for an analysis: Most statistical analyses have assumptions. These assumptions are often requirements about the type of sample, the type of data, and how the data (or residuals) are distributed. If you perform an analysis without checking the assumptions, you cannot trust the results even if you’ve taken all the measures necessary to collect the data properly.

Data mining: Even if everything passes muster, an analyst can find significant results simply by looking at the dataset for too long. If a large number of tests are performed, a few will be significant by chance. Fastidious statisticians keep track of all the tests that are performed in order to put the results in the proper context.
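A tiny simulation makes the point. This hedged Python sketch (my own illustration, not from the post) runs 100 t-tests on pure noise and counts how many come out "significant":

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Run 100 two-sample t-tests on pure noise: there are no real effects anywhere
p_values = [
    stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
    for _ in range(100)
]

print("tests significant at 0.05:", sum(p < 0.05 for p in p_values))

# With alpha = 0.05, roughly 5 of the 100 tests will be "significant" by chance
# alone, which is exactly why statisticians track every test that was run.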

Statistics to the Rescue

In short, there are many ways to screw up and produce misleading conclusions. Once again, you have to get all of the stages correct or you can’t trust the conclusions.

If you want to use data to learn how the world works, you must have this statistical knowledge in order to trust your data and your results. There’s just no way around it. Even if you are not performing the study, understanding statistical principles can help you assess the quality of other studies and the validity of their conclusions. Statistical knowledge can even help reduce your vulnerability to manipulative conclusions from projects that have an agenda.

The world today produces more data than ever before, in all branches of science, quality improvement, manufacturing, the service industry, government, public health, and public policy, among many other settings. There will be many analyses of these data. Some are straightforward science; others are more partisan in nature. Are you ready? Will you know which conclusions to trust and which studies to doubt?

In addition to resources like this blog, Minitab offers an e-learning course called Quality Trainer that can help you learn statistical principles, particularly as they relate to quality improvement. If you'd like to learn more about analyzing data, it's a great investment at just $30 per month, or possibly less if your organization uses Minitab Statistical Software.


Lightsaber Capability Analysis: Is Our Process In Control?


light sabre capability analysis

In my last post, we talked about using statistical tools to identify the right distribution for our lightsaber manufacturing data. Now that we have our data in Minitab along with a specific distribution picked out, we can find out whether we are dealing with an in-control process. If the process is not in control, the capability estimates will be incorrect. Thus, an extremely important (and often overlooked) aspect of capability analysis is to make sure our process is first in control. We can do this with a tool Minitab Statistical Software offers called the Capability Sixpack™.

First, let's go to Stat > Quality Tools > Capability Sixpack > Normal. We've gotten our spec limits from the Jedi Temple, and know that they require the lightsabers to be between 2.75 and 3.25 feet in length. We also know that we have a target length of 3 feet. Knowing this information, we can fill out the dialog as follows:

6packmenu

We also want to tell Minitab that our measurements have a target we are trying to hit. Click the ‘Options’ button and enter a target of 3. Click OK in each menu, and our Capability Sixpack will be displayed.

Let's see whether our process is in control. First, take a look at the Xbar chart. We want to see the points randomly distributed between the control limits, which implies a stable process. The R chart works the same way: again, we are looking for points randomly distributed between the control limits.

Looking at both of our charts in the Capability Sixpack, they seem to fit the conditions of a stable process. This is what we’re looking for. If our process isn’t stable, our capability analysis results can’t be trusted.

 We should also compare points on the R chart with those on the Xbar chart to see if the points follow each other.  If the process is stable, they shouldn't. These points do not follow each other, which again implies a stable process.

6pack

We also want to look at the Last 10 Subgroups graph. Here, we're looking for a random, horizontal scatter, with no apparent shifts or drifts in the data. Shifts and drifts indicate a process that is out of control. Looking at our graph, the scatter appears to be randomly spread out, with no trends sticking out anywhere. All of our graphs seem to indicate that we have an in-control, stable process. This is exactly what we are looking for.

If you want the interpretation of the process capability statistics to be valid, your data should approximately follow a normal distribution, as I discussed in my last blog post. On the capability histogram, the data approximately follow a normal curve. In addition, the points on the normal probability plot approximately follow a straight line and fall within the 95% confidence interval. These patterns indicate that the data are normally distributed, and confirm what we found in the last post, which was that our data are indeed normal.

Given that our data appear to be normal and come from an in-control, stable process, we are clear to go ahead and conduct a capability analysis whose results we can fully trust.
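As a rough illustration of what the capability numbers summarize, here is a hedged Python sketch using simulated lengths and the 2.75–3.25 ft spec limits from above. Note that it uses the overall standard deviation, which corresponds to Pp/Ppk rather than Minitab's within-subgroup Cp/Cpk:

import numpy as np

rng = np.random.default_rng(3)

# Simulated lightsaber lengths in feet; the real data live in the Minitab worksheet
lengths = rng.normal(loc=3.0, scale=0.06, size=100)

LSL, USL = 2.75, 3.25
mu = lengths.mean()
sigma = lengths.std(ddof=1)              # overall standard deviation

Pp = (USL - LSL) / (6 * sigma)
Ppk = min(USL - mu, mu - LSL) / (3 * sigma)

print(f"Pp  = {Pp:.2f}")
print(f"Ppk = {Ppk:.2f}")

# Minitab's Cp and Cpk use a within-subgroup sigma (for example, Rbar/d2);
# this sketch uses the overall sigma, which corresponds to Pp and Ppk.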

In the next post, we'll perform the analysis and look in detail at what our capability analysis results tell us about lightsaber production, and whether we are meeting the standards that have been set for us by the Jedi Temple.

 

For Want of an FMEA, the Empire Fell


Don't worry about it, we'll be fine without an FMEA!

by Matthew Barsalou, guest blogger

For want of a nail the shoe was lost,
For want of a shoe the horse was lost,
For want of a horse the rider was lost
For want of a rider the battle was lost
For want of a battle the kingdom was lost
And all for the want of a horseshoe nail. (Lowe, 1980, 50)

According to the old nursery rhyme, "For Want of a Nail," an entire kingdom was lost because of the lack of one nail for a horseshoe. The same could be said for the Galactic Empire in Star Wars. The Empire would not have fallen if the technicians who created the first Death Star had done a proper Failure Mode and Effects Analysis (FMEA).

A group of rebels in Star Wars, Episode IV: A New Hope stole the plans to the Death Star and found a critical weakness that led to the destruction of the entire station. A simple thermal exhaust port was connected to a reactor in a way that permitted an explosion in the exhaust port to start a chain reaction that blew up the entire station. This weakness was known, but considered insignificant because it could only be exploited by small space fighters and the exhaust port was protected by turbolasers and TIE fighters. It was thought that nothing could penetrate the defenses; however, a group of Rebel X-Wing fighters proved that this weakness could be exploited. One proton torpedo fired into the thermal exhaust port started a chain reaction that reached the station reactors and destroyed the entire battle station (Lucas, 1976).

Why the Death Star Needed an FMEA

The Death Star was designed by the engineer Bevil Lemelisk under the command of Grand Moff Wilhuff Tarkin, whose doctrine called for a heavily armed mobile battle station carrying more than 1,000,000 imperial personnel as well as over 7,000 TIE fighters and 11,000 land vehicles (Smith, 1991). It was constructed in orbit around the penal planet Despayre in the Horuz system of the Outer Rim Territories and was intended to be a key element of the Tarkin Doctrine for controlling the Empire. The current estimate for the cost of building a Death Star is $850,000,000,000,000,000 (Rayfield, 2013).

Such an expensive, resource-consuming project should never be attempted without a design FMEA. The loss of the Death Star could have been prevented with just one properly filled-out FMEA during the design phase:

FMEA Example

The Galactic Empire's engineers frequently built redundancy into the systems on the Empire’s capital ships and space stations; unfortunately, the Death Star's systems were all connected to the main reactor to ensure that power would always be available for each individual system. This interconnectedness resulted in thermal exhaust ports that were directly connected to the main reactor.

The designers knew that an explosion in a thermal exhaust port could reach the main reactor and destroy the entire station, but they were overconfident and believed that limited prevention measures--such as turbolaser towers, shielding that could not prevent the penetration of small space fighters, and wings of TIE fighters--could protect the thermal exhaust ports (Smith, 1991). Such thinking is little different than discovering a design flaw that could lead to injury or death, but deciding to depend upon inspection to prevent anything bad from happening. Bevil Lemelisk could not have ignored this design flaw if he had created an FMEA.

Assigning Risk Priority Numbers to an FMEA

An FMEA can be done with a pencil and paper, although Minitab's Quality Companion process improvement software has a built-in FMEA form that automates calculations, and shares data with process maps and other forms you'll probably need for your project. 

An FMEA uses a Risk Priority Number (RPN) to determine when corrective actions must be taken. RPNs range from 1 to 1,000, and lower numbers are better. The RPN is determined by multiplying severity (S) by occurrence (O) and detection (D).

RPN = S x O x D

Severity, occurrence and detection are each evaluated and assigned a number between 1 and 10, with lower numbers being better.

Failure Mode and Effects Analysis Example: Death Star Thermal Exhaust Ports

In the case of the Death Star's thermal exhaust ports, the failure mode would be an explosion in the exhaust port and the resulting effect would be a chain reaction that reaches the reactors. The severity would be rated as 10 because an explosion of the reactors would lead to the loss of the station as well as the loss of all the personnel on board. A 10 for severity is sufficient reason to look into a redesign so that a failure, no matter how improbable, does not result in injury or loss of life.

FMEA Failure Mode Severity Example

The potential cause of failure on the Death Star would be attack or sabotage; the designers did not consider this likely to happen, so occurrence is a 3. The main control measure was shielding that would only be effective against attack by large ships. This was rated as a 4 because the Empire believed these measures to be effective.

Potential Causes and Current Controls

The resulting RPN would be S x O x D =  10 x 3 x 4 = 120. An RPN of 120 should be sufficient reason to take actions, but even a lower RPN requires a corrective action due to the high rating for severity. The Death Star's RPN may even be too low due to the Empire's overconfidence in the current controls. Corrective actions are definitely needed. 

FMEA Risk Priority Number

Corrective actions are easier and cheaper to implement early in the design phase, particularly if the problem is detected before assembly is started. The original Death Star plans could have been modified with little effort before construction started. The shielding could have been improved to prevent any penetration and, more importantly, the interlinks between the systems could have been removed so that a failure of one system, such as an explosion in the thermal exhaust port, would not destroy the entire Death Star. The RPN needs to be reevaluated after corrective actions are implemented and verified; the new Death Star RPN would be 5 x 3 x 2 = 30.

FMEA Revised Metrics
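The arithmetic is simple enough to script. This small Python sketch (purely illustrative) reproduces the before-and-after RPN calculations from the example:

def rpn(severity, occurrence, detection):
    # Risk Priority Number: S x O x D, with each factor rated from 1 to 10
    return severity * occurrence * detection

# Death Star thermal exhaust port, using the ratings from the FMEA above
before = rpn(severity=10, occurrence=3, detection=4)   # 120
after = rpn(severity=5, occurrence=3, detection=2)     # 30, after corrective actions

print("RPN before corrective actions:", before)
print("RPN after corrective actions: ", after)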

Of course, doing the FMEA would have had more important impacts than just achieving a low number on a piece of paper. Had this step been taken, the Empire could have continued to implement the Tarkin Doctrine, and the Universe would be a much different place today. 

Do You Need to Do an FMEA? 

A simple truth is demonstrated by the missing nail and the kingdom, as well as the lack of an FMEA and the Death Star:  when designing a new product, whether it is an oil rig, a kitchen appliance, or a Death Star, you'll avoid many future problems by performing an FMEA early in the design phase.

   

 

About the Guest Blogger: 
Matthew Barsalou has been the quality manager at an automotive supplier in Germany since 2011, and previously worked as a contract quality engineer at Ford in Germany and Belgium. He is completing a master’s degree in industrial engineering at the Wilhelm Büchner Hochschule in Darmstadt, Germany, and is also working on a manuscript for a practical introductory book on statistics.
  

Would you like to publish a guest post on the Minitab Blog? Contact publicrelations@minitab.com

 

References

Lucas, George. Star Wars, Episode IV: A New Hope. New York: Del Rey, 1976. http://www.amazon.com/Star-Wars-Episode-IV-Hope/dp/0345341465/ref=sr_1_2?ie=UTF8&qid=1358180992&sr=8-2&keywords=Star+Wars%2C+Episode+IV%3A+A+New+Hope

 Opie, Iona and Opie, Peter. ed. Oxford Dictionary of Nursery Rhymes. Oxford, 1951, 324. Quoted in Lowe, E.J. “For Want of a Nail.” Analysis 40 (January 1980), 50-52. http://www.jstor.org/stable/3327327

Rayfield, Jillian. “White House Rejects 'Death Star' Petition.” Salon, January 13, 2013. Accessed January 14, 2013, from http://www.salon.com/2013/01/13/white_house_rejects_death_star_petition/

Smith, Bill. ed. Star Wars: Death Star Technical Companion. Honesdale, PA: West End Games, 1991. http://www.amazon.com/Star-Wars-Death-Technical-Companion/dp/0874311209/ref=sr_1_1?s=books&ie=UTF8&qid=1358181033&sr=1-1&keywords=Star+Wars%3A+Death+Star+Technical+Companion.

Forget Statistical Assumptions - Just Check the Requirements!


One of the most poorly understood concepts in the use of statistics is the idea of assumptions. You've probably encountered many of these assumptions, such as "data normality is an assumption of the 1-sample t-test." But if you read that statement and believe normality is a requirement of the 1-sample t-test, then you have missed a subtle and important characteristic of assumptions and need to read on...

An "assumption" is not necessarily a "requirement"!

To understand where this idea of assumptions comes from, let's forget about statistics for a minute and imagine we sell bikes online. We can't ship our bikes whole, so we ship each bike separated into the frame, handlebar, seat, and wheels, and must provide assembly instructions. We of course want simple and effective instructions.
  

Bike Box

Now, we don't necessarily know which tools the recipient owns, and we also don't know if, perhaps, they already own wheels or their own seat and want to use those instead of what we shipped.  We also don't know if there will be one person assembling the bike alone, or two, or even more. And maybe it's a kid's bike and the kid wants to assemble it, or maybe an adult does. The recipient may have a bike stand, or they may just be assembling on the floor.  Also, some people may have bought training wheels if the bike is for their child. If so, the training wheels may need to be installed before the real wheels.

You can see how complicated our instructions will become if we try to accommodate all of the possible scenarios. Worse, the instructions may end up being LESS useful rather than more if they are too complicated to understand or too general to provide the best methods for installation.

Reasonable Assumptions

So to make sure our instructions are easy to use and effective, we decide to make some assumptions about the person using them.  We assume they own a Phillips head screwdriver and an adjustable wrench, and that they do not own a bike stand.  We assume they are only assembling the parts we send them and don't have their own.  We assume there will be one adult assembling the bike alone.  All of these are reasonable assumptions that should capture the most common scenarios, and by making these assumptions about the user we can make very simple and useful instructions.  For those meeting these assumptions, assembly is easy!

Complete bike

So to this point we've seen a process by which making some assumptions has greatly reduced the complexity (and increased the effectiveness) of a product. 

This is entirely consistent with statistical assumptions. When a tool such as One-Way ANOVA was developed, it started with a basic problem: "How can one use data to tell if three or more groups have different means?" By making some reasonable assumptions (data within each group are normal, groups have equal variances, etc.), a fairly simple test was formed. Had those assumptions not been made, the test would be more complicated and likely less effective.

But let's go back to what I said above: an "assumption" is not necessarily a "requirement"!

When Is An Assumption Also a Requirement?

Considering our bike assembly instructions again, let's further examine a couple of the assumptions we made.  We assumed the bike would be assembled by one person alone and wrote the instructions accordingly (for example, by not including directions such as "Have someone else hold the wheel upright while you...").  What if the assumption was not true for one of our customers and there were two people ready to help one another assemble the bike?  In all likelihood, our instructions are still just as simple and effective and assembly may even be faster. In this case, the assumption is not a requirement at all--but making the assumption allowed us to make the best set of instructions possible! Even if the assembler doesn't meet this assumption, the instructions are still robust.

We also assumed the customer owned a Phillips head screwdriver. But what if a customer only owns a flathead screwdriver?  In that case, they likely cannot proceed with our instructions. This assumption is also a requirement.

We could similarly examine all of our assumptions after the fact to consider:

  1. Is the assumption a requirement?
  2. If it is not a requirement, is it robust to any other possible scenario?
  3. If it is not robust to any other scenario, under which scenarios is it robust?

Answering Questions about Statistical Assumptions

When we look at statistics, we must understand the same aspects of each assumption. For example, normality is an assumption of the 1-sample t-test.  But let's answer the three questions about this assumption:

  1. Is the assumption a requirement?
    No, the assumption is not a requirement (this has been demonstrated through multiple studies and simulations).
     
  2. If it is not a requirement, is it robust to any other possible scenario?
    No. It is robust, but not to every possible scenario.
     
  3. If it is not robust to any other scenario, under which scenarios is it robust?
    It is robust when the sample size is at least 20 for small-to-moderate deviations from normality and at least 40 for more extremely skewed distributions (see http://www.minitab.com/en-US/support/documentation/Answers/Assistant%20White%20Papers/1SampleT_MtbAsstMenuWhitePaper.pdf).

So the next time you're presenting results and someone asks if you checked all of your assumptions, feel free to say, "No, just the requirements!"
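If you'd like to see the robustness claim from question 3 for yourself, here is a hedged Python simulation sketch. It uses an exponential parent distribution as a stand-in for "skewed data", which is my own choice rather than anything from the white paper:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def observed_type1_rate(n, reps=10000):
    # Share of 1-sample t-tests that reject H0 when H0 is actually true,
    # drawing skewed (exponential) samples with a known mean of 1.0
    rejections = 0
    for _ in range(reps):
        sample = rng.exponential(scale=1.0, size=n)
        rejections += stats.ttest_1samp(sample, popmean=1.0).pvalue < 0.05
    return rejections / reps

for n in (5, 20, 40):
    print(f"n = {n:2d}: observed type I error ~ {observed_type1_rate(n):.3f}")

# As n approaches 20-40, the observed error rate settles close to the nominal
# 0.05 even though the parent distribution is strongly skewed.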

How to Use Value Stream Maps in Healthcare


Value Stream Map - example from manufacturing

While value stream mapping, or VSM, is a key tool used in many Lean Six Sigma projects for manufacturing, it’s also widely used in healthcare.

Value stream mapping can help you map, visualize, and understand the flow of patients, materials (e.g., bags of screened blood or plasma), and information. The “value stream” is all of the actions required to complete a particular process, and the goal of VSM is to identify improvements that can be made to reduce waste (e.g., patient wait times).

How is VSM applied to healthcare?

When used within healthcare, one obvious application for VSM is mapping a patient’s path to treatment to improve service and minimize delays.

To accurately map a system, obtaining high-quality, reliable data about the flow of information and the time a patient spends at or between steps is key. Accurately timing process steps and using multi-departmental teams is essential to obtain a true picture of what’s going on.

To map a patient’s path to treatment, a current state map can be created in a VSM tool (we offer a powerful one in Quality Companion) to act as a baseline and to identify areas for improvement:

Current State Value Stream Map

In this example, the first step a patient takes is to visit his general physician (abbrev. “GP” above), and this is represented as a rectangular process shape in the VSM. The time the patient spends at this step can be broken down into value-added (“VA”) and non value-added (“NVA”) cycle times. VA is time the customer is willing to pay for: that is, the 20 minutes spent consulting with the GP. NVA is the time the customer is not willing to pay for, i.e., the 20 minutes spent in the waiting room before the appointment.

The dotted line arrow between process steps is called a push arrow. This shows that once a patient completes a step, they are “pushed” to the next step. This is inefficient, and a more efficient process can be designed by changing push steps to continuous flow or "pull" steps. The yellow triangles indicate the time a patient spends waiting for the next process. These steps are a non-value added action for the patient.

While VSM can certainly be done by hand on paper, using computer-based tools like those in Quality Companion makes the process a lot easier. For example, Quality Companion automatically calculates and displays a timeline underneath the VSM, which adds up the total time to go through the entire system (aka “lead time”) and displays summary information.

By identifying all of the steps, you can start to map the whole process out, moving from left to right. Once you have mapped out the entire system, an ideal future state map can be created, and possibly a series of future states in between. These can identify areas for improvement, and once implemented, they can become the “new” current state map as part of an iterative quality improvement process.

How do you improve the current state map?

When looking for areas of improvement, try to focus on changes to improve the flow of patients through the process. Continuous flow is the ideal and moves patients through the system without them having to wait. However, continuous flow is not always possible, so instead other changes might be introduced—such as first-in first-out (FIFO).

Also be sure to take a look at the takt time, which can help you gauge the pace of customer demand. In this case, takt time can be interpreted as the time available to treat each patient while still keeping up with demand. Quality Companion will calculate takt time automatically.
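To make the arithmetic concrete, here is a small Python sketch with hypothetical clinic numbers (my own, not from Quality Companion) showing how lead time, the value-added ratio, and takt time are calculated:

# Hypothetical clinic numbers, just to illustrate the calculations
va_minutes = [20, 15, 30]        # value-added time at each step (e.g., consultations)
nva_minutes = [20, 45, 60]       # waiting time before each step

lead_time = sum(va_minutes) + sum(nva_minutes)   # total time through the system
va_ratio = sum(va_minutes) / lead_time

available_minutes_per_day = 8 * 60               # one 8-hour clinic day
patients_per_day = 32                            # customer demand
takt_time = available_minutes_per_day / patients_per_day

print(f"Lead time:         {lead_time} minutes")
print(f"Value-added ratio: {va_ratio:.0%}")
print(f"Takt time:         {takt_time:.0f} minutes per patient")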

Once you have completed the current and future state maps, you can compare the two, quantify improvement opportunities, and look at how to implement the changes. In this example, the triage and sort/appointment steps might be combined so that the patient requires fewer visits to the hospital and receives treatment faster.

To see another example of value stream mapping, check out this webcast that features a scenario from Quality Companion’s Getting Started guide:

If you're headed to the upcoming Healthcare Systems Process Improvement Conference in New Orleans March 1-4, stop by booth 100 at the expo and we can talk more about how value stream mapping and Quality Companion can help you.

And for even more, these past blog posts share some helpful tips:

Five Guidelines You Need to Follow to Create an Effective Value Stream Map

Four More Tips for Making the Most of Value Stream Maps

Helping Beginners Learn about Process Variation using Miles Per Gallon


by Robb Richardson, guest blogger

gas pump

One of the things that I love most about my job is that I get to help educate, coach, and develop others on topics such as continuous improvement and data analysis.

In that capacity, one of the most frequently seen challenges is that team members and managers want to react to every data point. Their intentions are noble – but doing so is almost always an unnecessary exercise since these variations are a normal part of how the process behaves.

I’ve used lots of different examples to illustrate this point, but few seemed to resonate deeply with them and get them to completely grasp the concept. That is, until I started to use something that we are all pretty familiar with: our car’s miles-per-gallon statistic.

Now, truth be told, I’m not exactly what you would call a “car guy.” I drive a ten-year-old Toyota Solara and simply follow the suggested maintenance directions. There is one other thing that I do – I use an app on my smart phone to track the gas mileage I get with each fill-up. With very few exceptions, I always fill up at the same gas station (and the same pump) and I also remove the nozzle when the pump first “clicks” that the tank is full. By taking these steps, I feel pretty comfortable that I’ve eliminated some of the basic items that could contribute to inconsistent measurement.

When working with others, I ask them if they get the same gas mileage every week. Their response is always a resounding “No.” At that point I show them my most recent miles-per-gallon control chart (shown below) from Minitab (Stat > Control Charts > Variable Charts for Individuals > Individuals) and ask them what they think. Most of the time, they say something along the lines of it looking like what they expected.

control chart for miles-per-gallon

From there, I point out those individual data points that are lower than the previous value, are below the center line, or meet both conditions. I then ask them if I should take my car to the local Toyota dealership to be checked out, get the oil changed, have the tires replaced, or have some other expensive service performed. Naturally, they answer in the negative and point out that “it’s just part of the process”…and at that point I know they understand the concept on a very personal level and will be less likely to go chasing after data points that are “in control” and simply reflect common-cause variation.

One last thing that’s worth mentioning: occasionally someone will ask me about the one data point that comes very close to crossing the Lower Control Limit (LCL). I tell them that, when something comes so close, it may be worthy of a very brief double-check, but not much more than that. In the case of my one data point in the chart above, it should be pointed out that my wife was driving my car for that timeframe when she and our daughter went to a resort over on the western side of Florida. They literally had five suitcases weighing about 200 pounds. Additionally, my wife has a “lead foot” and then, when she was off the interstate, drove through two hours of stop-and-go traffic. So, during that period of time, the car was driven under much different conditions than it normally would be…and if it had crossed the LCL, I simply would have recognized it as a “special cause.”
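For anyone curious how the control limits on an individuals chart like this are typically computed, here is a minimal Python sketch using made-up miles-per-gallon values; the 2.66 multiplier is the standard 3/d2 constant for moving ranges of size 2:

import numpy as np

# Made-up miles-per-gallon readings from successive fill-ups
mpg = np.array([27.1, 26.4, 28.0, 27.5, 25.9, 26.8, 27.3, 26.1, 27.9, 26.5])

center = mpg.mean()
mr_bar = np.abs(np.diff(mpg)).mean()     # average moving range (size 2)

ucl = center + 2.66 * mr_bar             # individuals-chart control limits
lcl = center - 2.66 * mr_bar

print(f"Center line: {center:.2f}")
print(f"UCL: {ucl:.2f}   LCL: {lcl:.2f}")
print("Points outside the limits:", mpg[(mpg > ucl) | (mpg < lcl)])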

I led off this item by mentioning that the best part of my job is to educate, coach, and develop others. With Minitab, it is much easier to drive home the important topics of discussion.

  

About the Guest Blogger: 
Robb Richardson has been working in the continuous process improvement arena of the financial services industry since 1997. He's an ASQ Certified Manager of Quality/Organizational Excellence and a Certified Quality Auditor. 
  

Would you like to publish a guest post on the Minitab Blog? Contact publicrelations@minitab.com

 

 
