This is an era of massive data. A huge amount of data is being generated from the web and from customer relations records, but also from sensors used in manufacturing (semiconductor, pharmaceutical, petrochemical, and many other industries).
Univariate control charts
In the manufacturing industry, critical product characteristics get routinely collected to ensure that all products at every step of the process remain well within specifications. Dedicated univariate control charts are deployed to ensure that any drift gets detected as early as possible to avoid negative effects on the final product performance. Ideally, when a special cause gets identified, the equipment should be immediately stopped until the issue gets resolved.
Monitoring tool process parameters
In modern plants, many manufacturing tools are connected to IT networks so that tool process parameters (pressures, temperatures, and so on) can be collected and stored in real time. Unfortunately, this type of data is very often not continuously monitored, although we might expect process parameters to play an important role in final product quality. When a quality incident occurs, data from these numerous upstream process parameters are sometimes retrieved from databases to investigate, a posteriori, why the incident took place.
A more efficient approach would be to monitor these process parameters in real time and try to understand how they affect complex manufacturing processes: which process parameters are really important and which are not? What are their best settings?
Multivariate control charts
Monitoring upstream tool parameters could greatly increase the number of control charts, though. In this context, process engineers might benefit from using multivariate charts, which would enable them to monitor up to 7 or 8 parameters together in a single chart. Rather than using equipment process parameter data in a firefighting mode to investigate the causes of previous quality incidents, this approach would focus on long-term improvements.
Multivariate control charts are based on squared standardized (generalized) multivariate distances from the overall mean. In Minitab, Hotelling's T² method is used to generate multivariate charts.
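For reference, the squared generalized distance behind such a chart is usually written as shown below. The notation is introduced here for illustration and is an assumption, not taken from Minitab's documentation: x_i is the vector of the p parameter values for observation i, x̄ is the vector of overall means, and S is the sample covariance matrix of the parameters. The second line is the usual form when points are subgroup means of size n.

```latex
T_i^2 = (\mathbf{x}_i - \bar{\mathbf{x}})^{\top}\, \mathbf{S}^{-1}\, (\mathbf{x}_i - \bar{\mathbf{x}})

T_i^2 = n\, (\bar{\mathbf{x}}_i - \bar{\bar{\mathbf{x}}})^{\top}\, \mathbf{S}^{-1}\, (\bar{\mathbf{x}}_i - \bar{\bar{\mathbf{x}}})
```

With a single parameter (p = 1), the first expression reduces to the squared z-score, ((x_i − x̄)/s)², which is why T² can be read as a multivariate generalization of the standardized distance used in a univariate chart.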
An obvious advantage of using multivariate charts is that they enable one to minimize the total number of control charts, but there are some additional related benefits involved as well:
Analyzing process parameters jointly: Many process parameters are related to one another. For example, for a particular process step we might expect the pressure value to be large when the temperature is high. Considering every process parameter separately is not necessarily a good option and might be misleading. Detecting any mismatch between parameter settings may be very useful.
In the graph below, the Y1 and Y2 parameter values are correlated (high values for Y1 are associated with high values for Y2), so the red point in the lower right corner appears to be out of control (beyond the control ellipse) from a multivariate point of view. From a univariate perspective, though, this red point remains within the usual fluctuation bounds for both Y1 and Y2. This point clearly represents a mismatch between Y1 and Y2: the squared generalized multivariate distance from the red point to the scatterplot mean is unusually large.
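The mismatch described above can be reproduced numerically. The following is a minimal sketch in plain Python, not Minitab's implementation. It assumes two standardized parameters (mean 0, standard deviation 1) with a known correlation of 0.9; all names and values are hypothetical. Because the parameters are treated as known, the control limit is taken from a chi-square distribution with 2 degrees of freedom, whose upper quantile has the closed form −2 ln(α). A chart whose mean and covariance are estimated from data would typically use an F- or beta-based limit instead.

```python
import math

ALPHA = 0.0027  # false alarm rate matching three-sigma limits

# Hypothetical in-control parameters, assumed known for this sketch.
MEAN = (0.0, 0.0)
COV = ((1.0, 0.9),
       (0.9, 1.0))


def t_squared(point, mean=MEAN, cov=COV):
    """Squared generalized distance of a 2-D point from the mean."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    d1 = point[0] - mean[0]
    d2 = point[1] - mean[1]
    # d' * inverse(cov) * d, written out for the 2x2 case
    return (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det


def chi2_ucl_2d(alpha=ALPHA):
    """Upper control limit: chi-square quantile for 2 degrees of freedom."""
    return -2.0 * math.log(alpha)


def within_univariate_limits(point, mean=MEAN, cov=COV, k=3.0):
    """True if every coordinate lies within mean +/- k standard deviations."""
    return all(
        abs(x - m) <= k * math.sqrt(cov[i][i])
        for i, (x, m) in enumerate(zip(point, mean))
    )


matched = (2.0, 2.0)    # both parameters high together, as the correlation predicts
mismatch = (2.0, -2.0)  # Y1 high while Y2 is low

for label, pt in (("matched", matched), ("mismatch", mismatch)):
    print(label,
          "univariate OK:", within_univariate_limits(pt),
          "T2:", round(t_squared(pt), 2),
          "out of control:", t_squared(pt) > chi2_ucl_2d())
```

Both points sit two standard deviations from the mean on each axis, so neither would trigger a univariate three-sigma chart. The matched point has a T² of about 4.2, well under the limit of about 11.8, while the mismatched point has a T² of 80 and is flagged.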
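Overall rate of false alarms: The probability of a false alarm with standard three-sigma limits in a control chart is 0.27% per plotted point. If 100 independent charts are monitored at the same time, the probability of at least one false alarm at each sampling point rises to about 24% (1 − 0.9973^100 ≈ 0.237); the simple sum 0.27% × 100 = 27% is an upper bound on that probability.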
However, when numerous variables are monitored simultaneously using a single multivariate chart, the overall (family-wise) rate of false alarms remains close to 0.27%.
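The arithmetic behind the family-wise rate can be checked with a few lines of Python. This is a sketch under one stated assumption, namely that the univariate charts are statistically independent; the function names are hypothetical. Under independence, the chance of at least one false alarm across n charts is 1 − (1 − α)^n, and the simple sum n × α is only an upper bound on it.

```python
import math


def three_sigma_alpha(k=3.0):
    """Two-sided normal tail probability beyond +/- k standard deviations."""
    return math.erfc(k / math.sqrt(2.0))


def familywise_rate(alpha, n_charts):
    """Probability of at least one false alarm across n independent charts
    at a single sampling point."""
    return 1.0 - (1.0 - alpha) ** n_charts


alpha = three_sigma_alpha()
for n in (1, 8, 100):
    print(n, "charts:",
          "exact", round(familywise_rate(alpha, n), 4),
          "upper bound", round(min(1.0, n * alpha), 4))
```

For 100 charts the exact figure is close to 24%, slightly below the 27% given by the simple sum. A single multivariate chart run at the same α keeps the overall rate near 0.27% regardless of how many variables it combines.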
3-D measurements: When three-dimensional measurements of a product are performed, the amount of data needed to ensure that all dimensions (X, Y and Z) of a 3-D object remain within specifications can get quite large. If the product gets damaged in a particular area, this will usually impact more than one dimension, so the three dimensions should not be considered separately from one another. If a multivariate chart simultaneously monitors deviations from the ideal planned X, Y and Z values, their combined effects will be taken into account.
A simple example:
Eight process parameters have been monitored using eight univariate Xbar control charts. No out-of-control point has been detected (see below):
The eight control charts above may be replaced by a single multivariate chart. The associated multivariate chart, displayed below, monitors the eight variables simultaneously. Although no out-of-control point had been detected in the univariate charts, subgroup number 12 turns out to be out of control in the multivariate chart.
Interpretation: To investigate why an out-of-control point (subgroup 12) occurred in the multivariate chart, I used simple graphs (scatterplots) to analyze time trends. Note that for the X3, X4 and X5 parameters, subgroup 12 is positioned far away from the other points.
Conclusion:
When process parameters have no direct critical effect, a dedicated univariate chart is not necessarily required. Multivariate charts would enable one to routinely monitor many tool process parameters with fewer charts. The objective would be to better understand whether out-of-control points in a multivariate chart may be used to anticipate quality issues in the product characteristics.
To better control a process, we need to assess how upstream tool parameters affect the final product. Multivariate charts are also very useful for monitoring 3-D measurements. Interpreting the reason for an out-of-control point in a multivariate chart is a key aspect.