Channel: Minitab | Minitab

Taking the F out of the WTF test

Many moons ago, I wrote a post entitled "So, Why Does the World Trust Minitab?" In that post, I alluded to upcoming improvements to one of our statistical tests. At that time, I could not give any details, because the project was shrouded in secrecy. But now that Minitab has released version 17 of our statistical software, the story of Bonett's test can finally be told.

Suppose you want to compare the standard deviations of two samples. Previous releases of Minitab, including Release 16, gave you two handy tests: the F-test and Levene's test. The F-test is so named because it usually fails you.¹ In theory, the F-test is appropriate as long as your data are normally distributed. In practice, however, the F-test tends to be a bit on the emo side. That is, it is too sensitive to departures from normality to be useful for much.
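If you'd like to see that sensitivity for yourself, here is a minimal simulation sketch. It is an illustration only, not Minitab's procedure: the sample size, the seed, the choice of a Laplace distribution as the "nonnormal" data, and all the function names are assumptions made for this example. Both samples always come from the same population, so every rejection is a false alarm. Instead of looking up F-table critical values, the sketch calibrates the 5% cutoffs empirically from normal data.

```python
import random

def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def variance_ratios(draw, n, reps, rng):
    """Simulate s1^2 / s2^2 for two samples drawn from the same population."""
    ratios = []
    for _ in range(reps):
        a = [draw(rng) for _ in range(n)]
        b = [draw(rng) for _ in range(n)]
        ratios.append(sample_variance(a) / sample_variance(b))
    return ratios

def normal(rng):
    return rng.gauss(0.0, 1.0)

def laplace(rng):
    # Difference of two exponentials: symmetric but heavy-tailed.
    return rng.expovariate(1.0) - rng.expovariate(1.0)

rng = random.Random(17)  # arbitrary seed, chosen only for repeatability
n, reps = 20, 2000       # assumed sample size and number of simulated tests

# Empirical two-sided 5% cutoffs under normality; these stand in for the
# critical values an F table would give.
calibration = sorted(variance_ratios(normal, n, reps, rng))
lower = calibration[int(0.025 * reps)]
upper = calibration[int(0.975 * reps)]

def rejection_rate(draw):
    """Fraction of simulated variance ratios that fall outside the cutoffs."""
    ratios = variance_ratios(draw, n, reps, rng)
    return sum(r < lower or r > upper for r in ratios) / reps

rate_normal = rejection_rate(normal)    # fresh normal data: should sit near 5%
rate_laplace = rejection_rate(laplace)  # heavy tails: noticeably more false alarms

print(f"false-alarm rate, normal data:  {rate_normal:.3f}")
print(f"false-alarm rate, Laplace data: {rate_laplace:.3f}")
```

Under these assumptions, the false-alarm rate for the heavy-tailed data should come out well above the nominal 5%, even though the two populations are identical.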

Levene’s test was developed to help overcome this extreme sensitivity to nonnormality. Levene's test is sometimes called the W50 test. I sometimes call it the WTF test, for reasons that shall become apparent shortly. Take a look at the actual results below from a test of two standard deviations that I actually ran in Minitab 16 using actual data that I actually made up:

Ratio of the standard deviations in Release 16

The ratio of the standard deviations from samples 1 and 2 (s1/s2) is 1.414 / 1.575 = 0.898. This ratio is our best "point estimate" for the ratio of the standard deviations from populations 1 and 2 (Ps1/Ps2). The ratio is less than 1, which suggests that Ps2 is greater than Ps1. So far, so good.

Now, let's have a look at the confidence interval (CI) for the population ratio. The CI gives us a range of likely values for Ps1/Ps2. (The CI below labeled "Continuous" is the one calculated using Levene's method):

Confidence interval for the ratio in Release 16

Notice that the CI includes, ... er, doesn't include? ... should include!

What in Gauss' name is going on here?!? The range of likely values for Ps1/Ps2 (1.046 to 1.566) doesn't include the point estimate (0.898)?!? In fact, the CI suggests that Ps1/Ps2 is greater than 1. Which suggests that Ps1 is actually greater than Ps2. But the point estimate suggests the exact opposite! Which suggests that I might be losing my mind. Or that something odd is going on here. Or both.

Well, I am happy to say that I am not losing my mind. (Not this time, at least.) One reason that Levene's method is robust to nonnormality is that Levene's method isn't actually based on the standard deviation. Instead, Levene’s method is based on a statistic called the mean absolute deviation from the median, or MADM. The MADM is much less affected by nonnormality and outliers than is the standard deviation. And even though the MADM and the standard deviation of a sample can be very different, the ratio of MADM1/MADM2 is nevertheless a good approximation for the ratio of Ps1/Ps2.

The only problem is that in extreme cases, outliers can affect the sample standard deviations so much that s1/s2 can fall completely outside of Levene's CI. Awwwwkwwwaaaaaard.
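To make the disagreement concrete, here is a small sketch using hypothetical data invented for this example (it is not the data behind the Release 16 output above, and it computes only the two point ratios, not Levene's CI). The helper name `madm` is also just an illustrative choice. Sample 2 is tightly clustered except for one outlier.

```python
import statistics

def madm(values):
    """Mean absolute deviation from the median."""
    med = statistics.median(values)
    return statistics.fmean([abs(v - med) for v in values])

# Hypothetical data, made up for this sketch.
sample_1 = [9.0, 9.2, 9.4, 9.6, 9.8, 10.2, 10.4, 10.6, 10.8, 11.0]
sample_2 = [9.7, 9.8, 9.9, 9.9, 10.0, 10.0, 10.1, 10.1, 10.2, 13.0]  # 13.0 is the outlier

# Ratio of sample standard deviations: the outlier inflates s2.
sd_ratio = statistics.stdev(sample_1) / statistics.stdev(sample_2)

# Ratio of MADMs: the outlier moves MADM2 far less.
madm_ratio = madm(sample_1) / madm(sample_2)

print(f"s1/s2       = {sd_ratio:.3f}")    # roughly 0.72, below 1
print(f"MADM1/MADM2 = {madm_ratio:.3f}")  # roughly 1.46, above 1
```

With this made-up data, the standard deviation ratio says sample 2 is more variable, while the MADM ratio says sample 1 is. An interval built around the MADM ratio could therefore sit entirely on the other side of 1 from s1/s2, which is the kind of mismatch described above.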

Fortunately, in Minitab 17, statisticians have made things considerably less awkward. (This may be the only time in your life that you'll hear the phrase "statisticians have made things considerably less awkward.") One of the brave-hearted folks in our R&D department toiled against all odds, and at considerable personal peril, to solve this enigma. The result is an effective, elegant, and non-enigmatic test that we call Bonett's test:

Confidence interval in Release 17

Like Levene's test, Bonett's test can be used with nonnormal data. But unlike Levene's test, Bonett's test is actually based on the actual standard deviations of the actual samples. And you know what that means. Yes, my friends: gone are the days of embarrassing discrepancies between the CI and the ratio. With Minitab 17, you can compare standard deviations with confidence! And as if that weren't enough, Minitab 17 also includes a handy dandy summary plot. All the information you need to interpret your test results, conveniently located right in front of your face.

Summary plot in Release 17

So what are you waiting for? Mesmerize your manager, confound your colleagues, and stun your stakeholders with Minitab 17. Operators are standing by. 

------------------------------------------------------------

¹ So, that bit about the name of the F-test: I kind of made that up. Fortunately, there is a better source of information for the genuinely curious. Our white paper, Bonett's Method, includes all kinds of details about these tests and comparisons between the CIs calculated with each. Enjoy.

 