*" on fMRI maps represent brain regions or voxels where "statistically significant" levels of activation or correlation are thought to have occurred. The size and number of these blobs are somewhat arbitrarily chosen, however, involving tradeoffs between excluding*

**blobs***(saying an area activates when it does not) and accepting*

**false positives***(considering an area to be silent when it really does activate).*

**false negatives**The first level coloring decision is typically based on calculation of a so-called

*(e.g., a*

**test statistic***T-*,

*F*- or

*Z*-score) for each voxel or brain region from the fMRI data. Under the

*that no true activation has occurred, a*

**null hypothesis***can be determined, representing the probability that the calculated test statistic score or larger has occurred by chance. Whenever the*

**p-value***p-*value is less than an arbitrary preselected level of significance, we conclude the measurement is unlikely to have occurred by chance and classify the voxel as "activated/correlated".
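The thresholding step just described can be sketched in a few lines of Python (the *Z*-scores and the α = 0.05 cutoff below are illustrative values, not from any actual data set):

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a Z-score under the standard normal null hypothesis."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical Z-scores for a handful of voxels
z_scores = [0.3, 1.2, 2.1, 3.4, 4.8]
alpha = 0.05  # arbitrary preselected level of significance

# A voxel is classified as "activated" when its p-value falls below alpha
activated = [z for z in z_scores if two_sided_p(z) < alpha]
```

Note that the choice of α is exactly the arbitrary decision discussed above: making it more restrictive shrinks the `activated` list.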

Moving the slider bar below toward more restrictive *p*-value thresholds reveals successively smaller areas of activation. But which level is the *correct* one?

The maps shown are thresholded at *T*-scores ranging from 0 (upper left) to 4.5 (lower right), corresponding to two-sided *p*-values between 1.000000 and 0.000007. Which one is correct? For simple eloquent cortex mapping, an arbitrary selection in the intermediate range is often made by choosing a "visually pleasing" and/or "modest" amount of activity.

Although *p*-values of 0.05 are commonly used in "standard" scientific experiments, this threshold is inappropriate for fMRI studies, where simultaneous statistical testing must be performed on 100,000 or more voxels. Setting a *p*-value of 0.05 for a single voxel means that 5,000 (100,000 × 0.05) of these voxels could appear falsely activated. This is an example of the so-called **multiple comparisons problem**, a major issue that affects genetics testing as well. Methods for handling this, including the **Bonferroni correction**, are described in the **Advanced Discussion**.
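The arithmetic behind this paragraph is easy to verify directly (a sketch using the voxel count quoted in the text):

```python
n_voxels = 100_000
alpha = 0.05

# Uncorrected testing: expected number of null voxels that appear
# falsely "activated" at p < 0.05 by chance alone
expected_false_positives = n_voxels * alpha

# Bonferroni correction: divide alpha by the number of tests, giving
# the much stricter per-voxel threshold discussed in the Advanced Discussion
bonferroni_threshold = alpha / n_voxels
```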

In addition to the *p*-value threshold, several other arbitrary decisions about blobs must be made. These include: 1) choosing the actual color palette, including the range of *p*-values corresponding to different shades or hues; 2) whether to employ additional spatial smoothing (which makes the map less noisy but smears data anatomically); and 3) whether and how to perform **clustering** (e.g., deciding to color a voxel only when a certain small number of immediate neighbors also appear to be activated, to reduce false positives). Each of these decisions can significantly affect the appearance of the map.
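A minimal sketch of such neighbor-based clustering on a 2-D activation mask (the function name, the 4-connected neighborhood, and the two-neighbor minimum are illustrative assumptions, not a standard prescribed by the text):

```python
def cluster_filter(mask, min_neighbors=2):
    """Keep a supra-threshold voxel only if enough immediate neighbors are also active."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            # Count active 4-connected neighbors (up, down, left, right)
            n = sum(
                mask[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            )
            out[r][c] = n >= min_neighbors
    return out

# An isolated supra-threshold voxel (likely noise) is removed,
# while voxels inside a contiguous cluster survive.
mask = [
    [False, True,  True,  False],
    [False, True,  True,  False],
    [False, False, False, True],   # lone voxel at (2, 3)
]
filtered = cluster_filter(mask)
```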

### Advanced Discussion

Perhaps the most widely known procedure to account for multiple comparison errors in standard statistics is the **Bonferroni correction**. In its simplest form, the Bonferroni method merely divides the required Type I error level (*α*) by the number of independent tests (*N*) performed. Thus, if one wishes to maintain an *α* = 0.05 error level for 10 tests, the *p*-value used would need to be set at 0.05/10 = 0.005. You can see that for an fMRI data set with *N* ≈ 100,000 voxels being tested, the required *p*-value would be on the order of 5 × 10⁻⁷, an extremely stringent requirement. Using such a strict criterion to avoid Type I errors would severely impact the power of the fMRI data analysis, leading to an increasing number of false negative results (Type II errors). Accordingly, several Bonferroni variants (Holm, Hochberg, Simes), including step-wise sequential testing, have been devised. An alternative and increasingly popular approach is to control the **false discovery rate (FDR)**, the expected proportion of falsely rejected voxels.
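As a sketch of how FDR control works in practice, here is the Benjamini-Hochberg step-up procedure applied to a hypothetical list of voxel *p*-values (the values and names below are invented for illustration):

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return indices of p-values declared significant at FDR level q."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / n) * q
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / n * q:
            k_max = rank
    return sorted(order[:k_max])

# Hypothetical per-voxel p-values
p_vals = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
significant = benjamini_hochberg(p_vals, q=0.05)
```

Unlike a single fixed Bonferroni cutoff, the threshold here scales with rank, so it adapts to how many small *p*-values are present in the data.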

**References**

Colquhoun D. An investigation of the false discovery rate and the misinterpretation of *p*-values. R Soc Open Sci 2014; 1:140216.

Cohen J. The earth is round (p < .05). Am Psychologist 1994; 49:997-1003.

Engel SA, Burton PC. Confidence intervals for fMRI activation maps. PLOS ONE 2013; 8:e82419. (Paper demonstrating some of the errors made by naive viewers in their interpretation of activation maps, including the false idea that it is possible to compare brain areas based on their map colors.)

Goodman S. A dirty dozen: twelve *p*-value misconceptions. Semin Hematol 2008; 45:135-140.

Nichols TE. Multiple testing corrections, nonparametric methods, and random field theory. NeuroImage 2012; 62:811-815.

Nichols T, Hayasaka S. Controlling the familywise error rate in functional neuroimaging: a comparative review. Stat Methods Med Res 2003; 12:419-446.