Checking data quality is tedious, but even the most carefully done experiments usually include one or two bad chips. Quality assessment (QA) often makes a big difference to the results of a study. If you've invested effort in generating data, it's painful to think that some of the data may not be good. Most array data analysis programs don't give you much reason to doubt the quality of your arrays. Why open a can of worms? Because if you dig up the dirt on your chips and find the problems up front, you may avoid digging a hole for your research by interpreting bad data.
Many researchers check RNA quality and dye incorporation before hybridizing the samples onto arrays. Between the time that a sample is taken and the time the RNA is extracted and purified, enzymes in the cell rapidly degrade mRNA by cutting it into shorter pieces. Most of these shorter pieces will hybridize easily to several different probes, so the signals from many probes reflect the abundances of several transcripts, not just their targets. One way to detect degraded RNA is to examine the two most abundant types of RNA, the 18S and 28S ribosomal RNAs. If the ribosomal RNAs are mostly intact, they form two sharp peaks as the total RNA is washed through a gel. This may also be done with a commercial tool such as the Agilent BioAnalyzer. A BioAnalyzer trace from good quality RNA is shown at left.
Since the signal from a probe depends on the amount of label in molecules attached to that probe, it makes sense to check how well the label is incorporated in the sample. In practice the amount of label in different samples varies, especially for the red Cy5 dye. A commercial product to measure how much label is incorporated in the sample is the NanoDrop Probe. The amount of label may not be stable. Microarray technicians have observed that the Cy5 label seems to perform poorly in hot, humid summers. Researchers from Agilent confirmed that even moderate levels (5 ppb) of ozone can degrade Cy5 while not affecting Cy3. Such effects can dramatically change the ratios on two-color slides from winter to summer. Some labs near expressways have built 'clean rooms' where the air is treated to keep out ozone and humidity.
In the early days most microarrays were printed by robotic pipettes from 96-well plates containing cDNA clones. This process rarely worked perfectly; it was common to see spots that were badly formed or to see fluorescent material spread over a large area. This is not so much a problem with modern commercial arrays. If you are working with spotted arrays, you might want to look at the Appendix on spot-level control.
If the sample RNA and the labelling pass the wet lab quality checks, then further information about the process of hybridization comes from the controls. There is no excuse for chips without a well-designed set of negative and positive control probes. Negative controls are probes designed for DNA sequences that should not be present in the sample. Positive controls are replicate probes for sequences that should occur; often these are in fact abundant. Both positive and negative controls should be distributed over the chip. Spike-in controls are probes that match transcripts that do not occur in the sample, but are added (in known amounts) to the samples before hybridization, or in some cases, before labelling. Most manufacturers have included a variety of negative controls; some include spike-in controls. Agilent includes some positive controls. Unfortunately Illumina's expression systems had poor controls up to 2010.
The signal from negative controls gives an idea of the background in all signals due to non-specific hybridization. Therefore in a good chip the negative controls should all report low signal, and this low value should be fairly uniform (i.e. it should not show any pronounced spatial pattern); however different negative control probes from different genes will typically have somewhat different means, because the probes have different intrinsic properties and thermodynamics. Generally you won't be able to estimate reliably the abundances of those genes whose signals are comparable to the signals from negative controls, even if the signals from those probes are above the local surrounding background (see Image Analysis).
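As a rough illustration of the last point, the sketch below flags probes whose signal is not clearly above the negative controls. The cutoff of the mean plus three standard deviations, the function names, and the example numbers are all assumptions made for illustration; they are not a rule from any manufacturer, and in practice one might prefer to work on log-scale signals.

```python
from statistics import mean, stdev

def detection_threshold(negative_signals, k=3.0):
    """Mean + k * SD of the negative-control signals (k = 3 is an assumed cutoff)."""
    return mean(negative_signals) + k * stdev(negative_signals)

def flag_undetectable(signals, negative_signals, k=3.0):
    """Map probe id -> True when the signal is NOT clearly above the negative controls."""
    cutoff = detection_threshold(negative_signals, k)
    return {probe: value <= cutoff for probe, value in signals.items()}

# Hypothetical background-corrected signals from one chip.
negatives = [52.0, 48.0, 55.0, 45.0, 50.0]
probes = {"geneA": 60.0, "geneB": 400.0}

flags = flag_undetectable(probes, negatives)
print(flags)  # geneA is flagged as indistinguishable from background, geneB is not
```

Because different negative control probes have somewhat different means, a single pooled cutoff like this one is only a first approximation.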
Positive controls give some idea of the dynamic range and of the spatial variation in hybridization. Probes for the same gene should show fairly uniform intensities across the chip. If the positive controls are very different from their average in some region, it is worth taking a closer look, and perhaps discarding all signals from that region. It is common to see spatial gradients in intensity, and sometimes in ratio. In the old days two-color slides were placed on lab counters during hybridization; lab counters are not precisely level, and sometimes slightly more of the sample is present at one end of the slide than the other. The differences are small and would seem not to matter, but the balance of processes is delicate, and the consequence of such small differences is that one end of the chip is brighter than the other. Because of saturation issues, the log ratios sometimes show a similar gradient.
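One simple way to look for such regions is to divide the chip into blocks and compare the replicates of a positive control in each block with their typical value over the whole chip. The sketch below assumes that each replicate comes with a row and column position and a log2 signal; the 2 x 2 grid, the tolerance of 0.5 log2 units, and the function names are hypothetical choices, not recommendations.

```python
from collections import defaultdict
from statistics import mean, median

def regional_deviation(controls, n_rows, n_cols, grid=2):
    """controls: (row, col, log2_signal) for replicates of ONE positive control.
    Returns {(block_row, block_col): mean deviation from the chip-wide median}."""
    typical = median(signal for _, _, signal in controls)
    blocks = defaultdict(list)
    for row, col, signal in controls:
        key = (row * grid // n_rows, col * grid // n_cols)
        blocks[key].append(signal - typical)
    return {key: mean(devs) for key, devs in blocks.items()}

def suspect_blocks(deviations, tol=0.5):
    """Blocks whose replicates differ from the typical value by more than tol."""
    return sorted(key for key, dev in deviations.items() if abs(dev) > tol)

# Hypothetical replicates on a 100 x 100 chip; one corner is too bright.
controls = [
    (10, 10, 10.0), (30, 30, 10.0), (20, 80, 10.1), (20, 70, 10.0),
    (80, 20, 9.9), (70, 30, 10.0), (90, 90, 11.5),
]
deviations = regional_deviation(controls, n_rows=100, n_cols=100)
suspects = suspect_blocks(deviations)
print(suspects)  # only the bottom-right block, (1, 1), stands out
```

The median is used as the reference so that one bad region does not pull the chip-wide value toward itself. A finer grid gives better localization but needs more replicates per block.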
Spike-in controls give some idea of the accuracy and linearity of the signals. Some transcripts are added in ratios of 3:1 or 10:1 to the two samples. Typically one sees that the ratios as reported are squashed, and sometimes that the low intensity spike-ins show different ratios than the majority of spike-ins. In the early days of microarrays it was fairly hard to control spike-in amounts precisely, and they rarely worked exactly as expected.
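The squashing of ratios can be summarized by regressing the reported log ratios on the nominal ones: a slope of 1 means the ratios are reported faithfully, and a slope below 1 means they are compressed. The following is a minimal sketch using an ordinary least-squares slope; the spike-in ratios and the reported values are invented for illustration.

```python
from math import log2

def compression_slope(expected_ratios, observed_ratios):
    """Least-squares slope of observed log2 ratio against expected log2 ratio."""
    x = [log2(r) for r in expected_ratios]
    y = [log2(r) for r in observed_ratios]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx

# Hypothetical spike-ins added at 3:1, 10:1, 1:3, 1:10 and 1:1.
expected = [3.0, 10.0, 1 / 3, 1 / 10, 1.0]
observed = [2.2, 5.0, 0.45, 0.2, 1.0]   # made-up reported ratios

slope = compression_slope(expected, observed)
print(round(slope, 2))  # below 1 here, i.e. the reported ratios are compressed
```

Fitting the low-intensity spike-ins separately from the rest would show whether they behave differently, as described above.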
Recently NIST has put in considerable effort to standardize spike-in procedures and to ensure that a uniform subset of spike-ins is available on every array via their External RNA Controls Consortium. Starting in 2010, many manufacturers will include these standard controls.
Data analysts often worry that differences in the measures they are analyzing reflect some artifact of the measurement process rather than true biological differences. This worry is often well-founded. To satisfy themselves that this isn't the case, statisticians like to plot their measures against known technical variables that they think might affect the measures. Traditionally these are variables such as the technician, the date, or the batch of reagents. With microarrays no one wants to plot such pictures for thousands of genes. A simpler approach is to consider each sample and plot the measures against technical characteristics of the probes.
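As one hedged example of this approach, the sketch below takes the GC content of the probe sequence as the technical characteristic (an assumption; the text does not single out any particular characteristic) and tabulates the mean log signal of one sample in bins of GC content. A strong trend across bins suggests that the measures depend on the probe chemistry and not only on the biology. The sequences, signals, and function names are invented.

```python
from collections import defaultdict
from statistics import mean

def gc_fraction(sequence):
    """Fraction of G and C bases in a probe sequence."""
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def mean_signal_by_gc(probes, n_bins=4):
    """probes: (sequence, log2_signal) pairs from ONE sample.
    Returns {bin_index: mean log2 signal}, bins ordered from low to high GC."""
    bins = defaultdict(list)
    for sequence, signal in probes:
        index = min(int(gc_fraction(sequence) * n_bins), n_bins - 1)
        bins[index].append(signal)
    return {index: mean(values) for index, values in sorted(bins.items())}

# Hypothetical short probes; real probes are longer.
probes = [
    ("ATATATAT", 7.0), ("ATGCATAT", 7.4), ("GCGCATAT", 8.1),
    ("GCGCGCAT", 8.8), ("GCGCGCGC", 9.0),
]
profile = mean_signal_by_gc(probes)
for index, value in profile.items():
    print(index, round(value, 2))
```

The same table can be drawn as a boxplot per bin, one picture per sample, which is far fewer pictures than one per gene.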
Residuals
In data from poorly functioning hybridization stations one often observes uneven signal and high background around the inlet ports; it seems the turbulent fluid affects the hybridization reaction. One should discard signals from the affected regions, and if this uneven pattern extends for a long way it is better to discard the chip.
If you are using a custom spotted cDNA array you may want to filter your spots individually. If you are using printed or synthesized arrays from a major manufacturer, this section won't be relevant. Spot-level QC detects mostly printing problems rather than hybridization anomalies. Most image quantification programs flag spots that fail their internal QC measures; it is rarely a good idea to keep spots that have been flagged. You may want to do further QC of individual spots based on several other measures reported by the image processing program (GenePix and Quantarray give many). Commonly reported measures include spot area, uniformity (standard deviation of foreground), and background uniformity. It is not practical to examine thousands of spots individually; an automated filtering procedure is needed. However, filtering criteria that are useful for one experiment may be too slack or too strict for the next; there are no rules about spot size or background that apply across the board to all chips under all circumstances.
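Since no fixed cutoffs work for every chip, one option is to set the cutoffs relative to the typical spot on the chip at hand. The sketch below is one such automated filter, not a standard procedure. The field names (area, fg_sd, flagged) are hypothetical stand-ins for whatever the image processing program reports, and the default multipliers are arbitrary starting points to be tuned for each experiment.

```python
from statistics import median

def filter_spots(spots, min_area_frac=0.5, max_sd_ratio=3.0):
    """Return ids of spots that are unflagged, not much smaller than the
    typical spot, and not much less uniform than the typical spot."""
    typical_area = median(s["area"] for s in spots)
    typical_sd = median(s["fg_sd"] for s in spots)
    kept = []
    for s in spots:
        if s["flagged"]:
            continue  # already rejected by the image quantification program
        if s["area"] < min_area_frac * typical_area:
            continue  # badly formed (too small) spot
        if s["fg_sd"] > max_sd_ratio * typical_sd:
            continue  # very uneven foreground
        kept.append(s["id"])
    return kept

# Hypothetical spot records from one slide.
spots = [
    {"id": 1, "area": 100, "fg_sd": 20, "flagged": False},
    {"id": 2, "area": 95,  "fg_sd": 22, "flagged": False},
    {"id": 3, "area": 30,  "fg_sd": 18, "flagged": False},  # too small
    {"id": 4, "area": 105, "fg_sd": 90, "flagged": False},  # too uneven
    {"id": 5, "area": 98,  "fg_sd": 21, "flagged": True},   # flagged upstream
]
kept = filter_spots(spots)
print(kept)  # [1, 2]
```

Because the cutoffs are multiples of each chip's own medians, the same settings adapt to chips with generally larger or noisier spots, although they still need checking against a few spots by eye.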