hidimstat.knockoff_aggregation

hidimstat.knockoff_aggregation(X, y, centered=True, shrink=False, construct_method='equi', fdr=0.1, fdr_control='bhq', reshaping_function=None, offset=1, method='quantile', statistic='lasso_cv', cov_estimator='ledoit_wolf', joblib_verbose=0, n_bootstraps=25, n_jobs=1, adaptive_aggregation=False, gamma=0.5, n_grid_gamma=20, verbose=False, memory=None, random_state=None)

Aggregation of multiple knockoffs

This function implements the aggregation of multiple knockoffs introduced by Nguyen et al. [1].

Parameters:

X: {array-like, sparse matrix} of shape (n_samples, n_features): The input samples.
y: array-like of shape (n_samples,): The target values (class labels in classification, real numbers in regression).
centered: bool, default=True: Whether to standardize the data before running the inference procedure.
shrink: bool, default=False: Whether to shrink the empirical covariance matrix.
construct_method: str, default="equi": The knockoff construction method. The options include:
- "equi" for equi-correlated knockoffs
- "sdp" for the optimization scheme
fdr: float, default=0.1: The target level at which the FDR is controlled.
fdr_control: str, default="bhq": The control method for the False Discovery Rate (FDR). The options include:
- "bhq" for the standard Benjamini-Hochberg procedure
- "bhy" for the Benjamini-Hochberg-Yekutieli procedure
- "ebh" for the e-BH procedure
reshaping_function: function, default=None: The reshaping function defined in Benjamini and Yekutieli [2].
offset: int, 0 or 1, default=1: The offset used to compute the knockoff threshold; offset=1 is equivalent to knockoff+.
method: str, default="quantile": The method used to compute the statistical measures. The options include:
- "quantile" for p-values
- "e-values" for e-values
statistic: str, default="lasso_cv": The method used to compute the knockoff test scores.
cov_estimator: str, default="ledoit_wolf": The method used to estimate the empirical covariance matrix.
joblib_verbose: int, default=0: The verbosity level of joblib. If non-zero, progress messages are printed, and their frequency increases with the verbosity level. Above 10, all iterations are reported. Above 50, the output is sent to stdout.
n_bootstraps: int, default=25: The number of bootstrapping iterations.
n_jobs: int, default=1: The number of workers for parallel processing.
adaptive_aggregation: bool, default=False: Whether to apply the adaptive version of the quantile aggregation method as in Nicolai Meinshausen and Bühlmann [3].
gamma: float, default=0.5: The quantile level (between 0 and 1) used for aggregation. For non-adaptive aggregation, a single gamma value is used. For adaptive aggregation, this is the starting point of the grid search over gamma values.
n_grid_gamma: int, default=20: The number of gamma grid points for adaptive aggregation.
verbose: bool, default=False: Whether to return the corresponding p-values of the variables along with the list of selected variables.
memory: str or joblib.Memory object, default=None: Used to cache the output of the computation of the clustering and the inference. By default, no caching is done. If a string is given, it is the path to the caching directory.
random_state: int, default=None: The seed of the random number generator.
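
The effect of `offset` can be illustrated with a minimal sketch of the knockoff threshold for a single vector of knockoff statistics. This is plain Python written for illustration, not the library's implementation: the helper name `knockoff_threshold` and the toy scores are hypothetical. The threshold is taken as the smallest t among the absolute scores for which (offset + #{W_j <= -t}) / max(1, #{W_j >= t}) does not exceed `fdr`.

```python
def knockoff_threshold(scores, fdr=0.1, offset=1):
    """Return the smallest t whose estimated false discovery proportion
    is at most `fdr`, or infinity if no such t exists.

    Hypothetical helper for illustration only.
    """
    if offset not in (0, 1):
        raise ValueError("offset must be 0 or 1")
    # Candidate thresholds are the distinct non-zero absolute scores.
    candidates = sorted({abs(w) for w in scores if w != 0})
    for t in candidates:
        n_negative = sum(1 for w in scores if w <= -t)
        n_positive = sum(1 for w in scores if w >= t)
        if (offset + n_negative) / max(1, n_positive) <= fdr:
            return t
    return float("inf")


# Toy knockoff statistics: large positive values suggest relevant features.
scores = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0, 1.5, -1.0, 0.5, -0.2]

tau = knockoff_threshold(scores, fdr=0.2, offset=0)       # plain knockoff
tau_plus = knockoff_threshold(scores, fdr=0.2, offset=1)  # knockoff+

selected = [j for j, w in enumerate(scores) if w >= tau]
selected_plus = [j for j, w in enumerate(scores) if w >= tau_plus]
```

On these toy scores, `offset=1` yields a threshold at least as large as `offset=0`, so the knockoff+ selection is the more conservative of the two.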

Returns:

selected: 1D array, int: The vector of indices of the selected variables.
aggregated_pval: 1D array, float: The vector of aggregated p-values.
pvals: 1D array, float: The vector of the corresponding p-values.
aggregated_eval: 1D array, float: The vector of aggregated e-values.
evals: 1D array, float: The vector of the corresponding e-values.
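
How `aggregated_pval` and `selected` relate to the per-bootstrap `pvals` can be sketched for the non-adaptive case (`method="quantile"`, `fdr_control="bhq"`). The code below is a pure-Python illustration under stated assumptions, not the library's code: the helper names are hypothetical, the p-values are made up, and the gamma-quantile is taken as a simple lower empirical quantile, whereas the library may interpolate differently. Each aggregated p-value is min(1, q_gamma(p_b / gamma)) over the bootstraps b, and the Benjamini-Hochberg step-up rule is then applied to the aggregated values.

```python
import math


def quantile_aggregate(pvals, gamma=0.5):
    """Aggregate p-values across bootstraps, one feature at a time.

    `pvals` is a list of B lists, each holding one p-value per feature.
    Hypothetical helper; uses a lower empirical quantile (an assumption).
    """
    n_bootstraps = len(pvals)
    n_features = len(pvals[0])
    # Index of the gamma-quantile in the sorted per-feature p-values.
    k = max(math.ceil(gamma * n_bootstraps), 1) - 1
    aggregated = []
    for j in range(n_features):
        column = sorted(p[j] for p in pvals)
        aggregated.append(min(1.0, column[k] / gamma))
    return aggregated


def benjamini_hochberg(pvals, fdr=0.1):
    """Return the sorted indices selected by the BH step-up procedure."""
    n = len(pvals)
    order = sorted(range(n), key=lambda j: pvals[j])
    n_selected = 0
    for rank, j in enumerate(order, start=1):
        # Keep the largest rank whose p-value is below its BH threshold.
        if pvals[j] <= fdr * rank / n:
            n_selected = rank
    return sorted(order[:n_selected])


# Made-up p-values: 4 bootstraps (rows) by 4 features (columns).
pvals = [
    [0.001, 0.004, 0.30, 0.80],
    [0.002, 0.020, 0.50, 0.60],
    [0.003, 0.006, 0.10, 0.90],
    [0.001, 0.008, 0.70, 0.40],
]

aggregated_pval = quantile_aggregate(pvals, gamma=0.5)
selected = benjamini_hochberg(aggregated_pval, fdr=0.1)
```

Dividing by gamma before taking the quantile is what keeps the aggregated value a valid p-value; a smaller gamma looks at a lower quantile but pays a larger correction factor.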

References

Examples using `hidimstat.knockoff_aggregation`

Knockoff aggregation on simulated data
