Note

Go to the end to download the full example code.

Support recovery on fMRI data#

This example compares several methods that estimate a decoder map support with statistical guarantees. More precisely, we aim at thresholding the weights of some estimated decoder maps according to the confidence we have that they are nonzero. Here, we work with the Haxby dataset and we focus on the ‘face vs house’ contrast. Thus, we consider the labeled activation maps of a given subject and try produce a brain map that corresponds to the discriminative pattern that makes the decoding of the two conditions.

In this example, we show that standard statistical methods (i.e., method such as thresholding by permutation test the SVR or Ridge decoder or the algorithm introduced by Gaonkar et al. [1]) are not powerful when applied on the uncompressed problem (i.e., the orignal problem in which the activation maps are not reduced using compression techniques such as parcellation). This is notably due to the high dimensionality (too many voxels) and structure of the data (too much correlation between neighboring voxels). We also present two methods that offer statistical guarantees but with a (small) spatial tolerance on the shape of the support: clustered desparsified lasso (CLuDL) combines clustering (parcellation) and statistical inference ; ensemble of clustered desparsified lasso (EnCluDL) adds a randomization step over the choice of clustering.

EnCluDL is powerful and does not depend on a unique clustering choice. As shown in Chevalier et al. (2021) [2], for several tasks the estimated support (predictive regions) looks relevant.

References#

Imports needed for this script#

import numpy as np
import pandas as pd
from nilearn import datasets
from nilearn.image import mean_img
from nilearn.input_data import NiftiMasker
from nilearn.plotting import plot_stat_map, show
from sklearn.cluster import FeatureAgglomeration
from sklearn.feature_extraction import image
from sklearn.linear_model import Ridge
from sklearn.svm import LinearSVR
from sklearn.utils import Bunch

from hidimstat.adaptative_permutation_threshold_SVR import ada_svr
from hidimstat.clustered_inference import clustered_inference
from hidimstat.empirical_thresholding import empirical_thresholding
from hidimstat.ensemble_clustered_inference import ensemble_clustered_inference
from hidimstat.permutation_test import permutation_test, permutation_test_pval
from hidimstat.stat_tools import pval_from_scale, zscore_from_pval

n_job = None

Function to fetch and preprocess Haxby dataset#

def preprocess_haxby(subject=2, memory=None):
    """Gathering and preprocessing Haxby dataset for a given subject."""

    # Gathering data
    haxby_dataset = datasets.fetch_haxby(subjects=[subject])
    fmri_filename = haxby_dataset.func[0]

    behavioral = pd.read_csv(haxby_dataset.session_target[0], sep=" ")

    # conditions = pd.DataFrame.to_numpy(behavioral['labels'])
    conditions = behavioral["labels"].values
    session_label = behavioral["chunks"].values

    condition_mask = np.logical_or(conditions == "face", conditions == "house")
    groups = session_label[condition_mask]

    # Loading anatomical image (back-ground image)
    if haxby_dataset.anat[0] is None:
        bg_img = None
    else:
        bg_img = mean_img(haxby_dataset.anat)

    # Building target where '1' corresponds to 'face' and '-1' to 'house'
    y = np.asarray((conditions[condition_mask] == "face") * 2 - 1)

    # Loading mask
    mask_img = haxby_dataset.mask
    masker = NiftiMasker(
        mask_img=mask_img, standardize=True, smoothing_fwhm=None, memory=memory
    )

    # Computing masked data
    fmri_masked = masker.fit_transform(fmri_filename)
    X = np.asarray(fmri_masked)[condition_mask, :]

    return Bunch(X=X, y=y, groups=groups, bg_img=bg_img, masker=masker)

Gathering and preprocessing Haxby dataset for a given subject#

The preprocess_haxby function make the preprocessing of the Haxby dataset, it outputs the preprocessed activation maps for the two conditions ‘face’ or ‘house’ (contained in X), the conditions (in y), the session labels (in groups) and the mask (in masker). You may choose a subject in [1, 2, 3, 4, 5, 6]. By default subject=2.

data = preprocess_haxby(subject=2)
X, y, groups, masker = data.X, data.y, data.groups, data.masker
mask = masker.mask_img_.get_fdata().astype(bool)

[_add_readme_to_default_data_locations] Added README.md to /home/runner/nilearn_data
[get_dataset_dir] Dataset created in /home/runner/nilearn_data/haxby2001
[fetch_single_file] Downloading data from https://www.nitrc.org/frs/download.php/7868/mask.nii.gz ...
[fetch_single_file]  ...done. (0 seconds, 0 min)

[fetch_single_file] Downloading data from http://data.pymvpa.org/datasets/haxby2001/MD5SUMS ...
[fetch_single_file]  ...done. (0 seconds, 0 min)

[fetch_single_file] Downloading data from http://data.pymvpa.org/datasets/haxby2001/subj2-2010.01.14.tar.gz ...
[_chunk_report_]
Downloaded 19800064 of 291168628 bytes (6.8%%,   14.4s remaining)
[_chunk_report_]
Downloaded 61292544 of 291168628 bytes (21.1%%,    7.9s remaining)
[_chunk_report_]
Downloaded 101621760 of 291168628 bytes (34.9%%,    5.9s remaining)
[_chunk_report_]
Downloaded 143638528 of 291168628 bytes (49.3%%,    4.3s remaining)
[_chunk_report_]
Downloaded 180936704 of 291168628 bytes (62.1%%,    3.2s remaining)
[_chunk_report_]
Downloaded 223772672 of 291168628 bytes (76.9%%,    1.9s remaining)
[_chunk_report_]
Downloaded 266207232 of 291168628 bytes (91.4%%,    0.7s remaining)
[fetch_single_file]  ...done. (8 seconds, 0 min)

[uncompress_file] Extracting data from /home/runner/nilearn_data/haxby2001/9cabe068089e791ef0c5fe930fc20e30/subj2-2010.01.14.tar.gz...
[uncompress_file] .. done.

/home/runner/work/hidimstat/hidimstat/examples/plot_fmri_data_example.py:91: FutureWarning: From release 0.13.0 onwards, this function will, by default, copy the header of the input image to the output. Currently, the header is reset to the default Nifti1Header. To suppress this warning and use the new behavior, set `copy_header=True`.
  bg_img = mean_img(haxby_dataset.anat)
/opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/nilearn/image/resampling.py:489: UserWarning: The provided image has no sform in its header. Please check the provided file. Results may not be as expected.
  warnings.warn(

Initializing FeatureAgglomeration object that performs the clustering#

For fMRI data taking 500 clusters is generally a good default choice.

n_clusters = 500
# Deriving voxels connectivity.
shape = mask.shape
n_x, n_y, n_z = shape[0], shape[1], shape[2]
connectivity = image.grid_to_graph(n_x=n_x, n_y=n_y, n_z=n_z, mask=mask)
# Initializing FeatureAgglomeration object.
ward = FeatureAgglomeration(n_clusters=n_clusters, connectivity=connectivity)

Making the inference with several algorithms#

First, we try to recover the discriminative partern by computing p-values from SVR decoder weights and a parametric approximation of the distribution of these weights.

beta_hat, scale = empirical_thresholding(
    X,
    y,
    linear_estimator=LinearSVR(),
)
pval_std_svr, _, one_minus_pval_std_svr, _ = pval_from_scale(beta_hat, scale)

Now, we compute p-values thanks to permutation tests applied to 1/the weights of the SVR decoder or 2/the weights of the Ridge decoder.

# To derive the p-values from the SVR decoder, you may change the next line by
# `SVR_permutation_test_inference = True`. It should take around 15 minutes.

SVR_permutation_test_inference = False
if SVR_permutation_test_inference:
    # It will be better to associate cross validation with the estimator
    # but for a sake of time, this is not done.
    estimator = LinearSVR()
    weight_svr, weight_svr_distribution = permutation_test(
        X, y, estimator, n_permutations=50
    )
    pval_corr_svr_perm_test, one_minus_pval_corr_svr_perm_test = permutation_test_pval(
        weight_svr, weight_svr_distribution
    )

# Another method is to compute the p-values by permutation test from the
# Ridge decoder. The solution provided by this method should be very close to
# the previous one and the computation time is much shorter: around 20 seconds.
# We computed the parameter from a cross valisation (alpha = 0.0215)
# It will be better to use RidgeCV but for a sake of time, this is not done.
estimator = Ridge()
weight_ridge, weight_ridge_distribution = permutation_test(
    X, y, estimator=estimator, n_permutations=200
)
pval_corr_ridge_perm_test, one_minus_pval_corr_ridge_perm_test = permutation_test_pval(
    weight_ridge, weight_ridge_distribution
)

Now, let us run the algorithm introduced by Gaonkar et al. (c.f. References). Since the estimator they derive is obtained by approximating the hard margin SVM formulation, we referred to this method as “ada-SVR” which stands for “Adaptive Permutation Threshold SVR”. The function is ada_svr.

beta_hat, scale = ada_svr(X, y)
pval_ada_svr, _, one_minus_pval_ada_svr, _ = pval_from_scale(beta_hat, scale)

Now, the clustered inference algorithm which combines parcellation and high-dimensional inference (c.f. References).

beta_hat, pval_cdl, _, one_minus_pval_cdl, _ = clustered_inference(
    X, y, ward, n_clusters
)

Clustered inference: n_clusters = 500, inference method = desparsified-lasso, seed = 0
Clustered inference {}
hd_inference False {}
/opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/sklearn/linear_model/_coordinate_descent.py:681: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations. Duality gap: 1.7422256933791083, tolerance: 0.21600000000000003
  model = cd_fast.enet_coordinate_descent_gram(

Below, we run the ensemble clustered inference algorithm which adds a randomization step over the clustered inference algorithm (c.f. References). To make the example as short as possible we take n_bootstraps=5 which means that 5 different parcellations are considered and then 5 statistical maps are produced and aggregated into one. However you might benefit from clustering randomization taking n_bootstraps=25 or n_bootstraps=100, also we set n_jobs=2.

beta_hat, pval_ecdl, _, one_minus_pval_ecdl, _ = ensemble_clustered_inference(
    X, y, ward, n_clusters, groups=groups, n_bootstraps=5, n_jobs=2
)

[Parallel(n_jobs=2)]: Using backend LokyBackend with 2 concurrent workers.
[Parallel(n_jobs=2)]: Done   5 out of   5 | elapsed:   30.8s finished

Plotting the results#

To allow a better visualization of the disciminative pattern we will plot z-maps rather than p-value maps. Assuming Gaussian distribution of the estimators we can recover a z-score from a p-value by using the inverse survival function.

First, we set theoretical FWER target at 10%.

n_samples, n_features = X.shape
target_fwer = 0.1

We now translate the FWER target into a z-score target. For the permutation test methods we do not need any additional correction since the p-values are already adjusted for multiple testing.

zscore_threshold_corr = zscore_from_pval((target_fwer / 2))

Other methods need to be corrected. We consider the Bonferroni correction. For methods that do not reduce the feature space, the correction consists in dividing by the number of features.

correction = 1.0 / n_features
zscore_threshold_no_clust = zscore_from_pval((target_fwer / 2) * correction)

For methods that parcelates the brain into groups of voxels, the correction consists in dividing by the number of parcels (or clusters).

correction_clust = 1.0 / n_clusters
zscore_threshold_clust = zscore_from_pval((target_fwer / 2) * correction_clust)

Now, we can plot the thresholded z-score maps by translating the p-value maps estimated previously into z-score maps and using the suitable threshold. For a better readability, we make a small function called plot_map that wraps all these steps.

def plot_map(
    pval,
    one_minus_pval,
    zscore_threshold,
    title=None,
    cut_coords=[-25, -40, -5],
    masker=masker,
    bg_img=data.bg_img,
):

    zscore = zscore_from_pval(pval, one_minus_pval)
    zscore_img = masker.inverse_transform(zscore)
    plot_stat_map(
        zscore_img,
        threshold=zscore_threshold,
        bg_img=bg_img,
        dim=-1,
        cut_coords=cut_coords,
        title=title,
    )


plot_map(
    pval_std_svr,
    one_minus_pval_std_svr,
    zscore_threshold_no_clust,
    title="SVR parametric threshold",
)

if SVR_permutation_test_inference:
    plot_map(
        pval_corr_svr_perm_test,
        one_minus_pval_corr_svr_perm_test,
        zscore_threshold_corr,
        title="SVR permutation-test thresh.",
    )

plot_map(
    pval_corr_ridge_perm_test,
    one_minus_pval_corr_ridge_perm_test,
    zscore_threshold_corr,
    title="Ridge permutation-test thresh.",
)

plot_map(
    pval_ada_svr,
    one_minus_pval_ada_svr,
    zscore_threshold_no_clust,
    title="SVR adaptive perm. tresh.",
)

plot_map(pval_cdl, one_minus_pval_cdl, zscore_threshold_clust, "CluDL")

plot_map(pval_ecdl, one_minus_pval_ecdl, zscore_threshold_clust, "EnCluDL")

/opt/hostedtoolcache/Python/3.12.9/x64/lib/python3.12/site-packages/nilearn/plotting/displays/_slicers.py:313: UserWarning: empty mask
  ims = self._map_show(img, type="imshow", threshold=threshold, **kwargs)

Analysis of the results#

As advocated in introduction, the methods that do not reduce the original problem are not satisfying since they are too conservative. Among those methods, the only one that makes discoveries is the one that threshold the SVR decoder using a parametric approximation. However this method has no statistical guarantees and we can see that some isolated voxels are discovered, which seems quite spurious. The discriminative pattern derived from the clustered inference algorithm (CluDL) show that the method is less conservative. However, some reasonable paterns are also included in this solution. Finally, the solution provided by the ensemble clustered inference algorithm (EnCluDL) seems realistic as we recover the visual cortex and do not make spurious discoveries.

show()

Total running time of the script: (1 minutes 19.742 seconds)

Estimated memory usage: 2929 MB

Gallery generated by Sphinx-Gallery

Support recovery on fMRI data#

References#

Imports needed for this script#

Function to fetch and preprocess Haxby dataset#

Gathering and preprocessing Haxby dataset for a given subject#

Initializing FeatureAgglomeration object that performs the clustering#

Making the inference with several algorithms#

Plotting the results#

Analysis of the results#

This Page