hidimstat.dcrt_pvalue
- hidimstat.dcrt_pvalue(selection_features, X_res, sigma2, y_res, fdr=0.1, fdr_control='bhq', reshaping_function=None, scaled_statistics=False)
Calculate p-values and identify significant features using the dCRT test statistics.
This function processes the results from dCRT to identify statistically significant features while controlling the false discovery rate (FDR). It assumes that the test statistics follow a Gaussian distribution under the null hypothesis.
- Parameters:
- selection_features : ndarray of shape (n_features,)
Boolean mask indicating which features were selected for testing
- X_res : ndarray of shape (n_selected, n_samples)
Residuals from feature distillation
- sigma2 : ndarray of shape (n_selected,)
Estimated residual variances for each tested feature
- y_res : ndarray of shape (n_selected, n_samples)
Response residuals for each tested feature
- fdr : float, default=0.1
Target false discovery rate level (0 < fdr < 1)
- fdr_control : {'bhq', 'bhy', 'ebh'}, default='bhq'
Method for FDR control:
  - 'bhq': Benjamini-Hochberg procedure
  - 'bhy': Benjamini-Hochberg-Yekutieli procedure
  - 'ebh': e-BH procedure
- reshaping_function : callable, optional
Reshaping function for the 'bhy' method
- scaled_statistics : bool, default=False
Whether to standardize test statistics before computing p-values
- Returns:
- selected_variables : ndarray
Indices of features deemed significant
- pvals : ndarray of shape (n_features,)
P-values for all features (including unselected ones)
- ts : ndarray of shape (n_features,)
Test statistics for all features, following a standard normal distribution
Notes
The function computes test statistics as correlations between residuals, optionally scales them, and converts them to p-values using a Gaussian null. A multiple-testing correction is then applied to control the FDR at the specified level.
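The steps described in these notes can be sketched in plain Python. This is only an illustration under stated assumptions, not the hidimstat implementation: the helper names (`residual_statistic`, `gaussian_two_sided_pvalue`, `benjamini_hochberg`) are hypothetical, the normalization of the statistic is one plausible choice that may differ from the library's, the toy residuals are made up, and only the 'bhq' correction is shown.

```python
import math


def residual_statistic(x_res, y_res, sigma2):
    """Normalized inner product of two residual vectors.

    Assumption for illustration: dividing by sqrt(sigma2 * ||x_res||^2)
    gives a statistic that is roughly N(0, 1) under the null.
    """
    dot = sum(x * y for x, y in zip(x_res, y_res))
    norm_sq = sum(x * x for x in x_res)
    return dot / math.sqrt(sigma2 * norm_sq)


def gaussian_two_sided_pvalue(t):
    """Two-sided p-value of t under a standard normal null."""
    # P(|Z| >= |t|) = erfc(|t| / sqrt(2)) for Z ~ N(0, 1)
    return math.erfc(abs(t) / math.sqrt(2.0))


def benjamini_hochberg(pvals, fdr=0.1):
    """Indices rejected by the Benjamini-Hochberg step-up rule."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    # Find the largest 1-based rank k with p_(k) <= k * fdr / n
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * fdr / n:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max
    return sorted(order[:k_max])


# Toy residuals for three tested features, one row per feature
x_res = [
    [1.0, -1.0, 2.0, -2.0],
    [0.5, -0.5, 0.5, -0.5],
    [1.0, 1.0, -1.0, -1.0],
]
y_res = [
    [2.0, -2.0, 4.0, -4.0],
    [0.1, 0.2, -0.1, -0.2],
    [0.3, -0.3, 0.3, -0.3],
]
sigma2 = [1.0, 1.0, 1.0]

ts = [residual_statistic(x, y, s) for x, y, s in zip(x_res, y_res, sigma2)]
pvals = [gaussian_two_sided_pvalue(t) for t in ts]
selected = benjamini_hochberg(pvals, fdr=0.1)
print(selected)
```

In this toy input only the first feature has residuals that are strongly aligned, so it is the only one expected to survive the correction. The rows follow the same layout as `X_res` and `y_res` above: one row per tested feature, one column per sample.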
Examples using hidimstat.dcrt_pvalue

Distilled Conditional Randomization Test (dCRT) using Lasso vs Random Forest learners