Assess the importance of variables on a given KPCA component by computing the Crone-Crosby distance between the original sample positions and the sample positions obtained after a random permutation of the variables.
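As an illustration of the distance itself, the following base R sketch assumes the usual definition of the Crone-Crosby distance between two subspaces: the Frobenius norm of the difference between their orthogonal projectors, divided by sqrt(2). The helper name cc_distance is hypothetical and is not part of the package; the package's internal computation may differ.

```r
# Hypothetical helper (not a package function): Crone-Crosby distance between
# the subspaces spanned by the columns of A and B, assumed to be
# ||P_A - P_B||_F / sqrt(2), where P_A and P_B are orthogonal projectors.
cc_distance <- function(A, B) {
  # Orthonormal bases of the two column spaces (columns assumed independent)
  QA <- qr.Q(qr(as.matrix(A)))
  QB <- qr.Q(qr(as.matrix(B)))
  # tcrossprod(Q) is Q %*% t(Q), the projector onto the span of Q
  norm(tcrossprod(QA) - tcrossprod(QB), type = "F") / sqrt(2)
}

e1 <- c(1, 0, 0)
e2 <- c(0, 1, 0)

d_same       <- cc_distance(e1, e1)       # identical axes: no distance
d_orthogonal <- cc_distance(e1, e2)       # orthogonal axes: largest distance for one axis
d_rescaled   <- cc_distance(e1, -2 * e1)  # sign and scale do not change the subspace
```

Because the distance compares subspaces rather than vectors, it is insensitive to the sign and scaling of the axes, which is why it suits the comparison of KPCA axes before and after permutation.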

kernel.pca.permute(kpca.result, ncomp = 1, ..., directory = NULL)

Arguments

kpca.result

a kernel.pca object returned by the kernel.pca function.

ncomp

integer. Number of KPCA components used to compute the importance. Default: 1.

...

list of character vectors. Each argument must be named after the kernel whose variables are to be permuted, and its length has to be equal to the number of variables of the corresponding input dataset. A kernel is computed for each unique value of the vector, so variables sharing the same value are permuted together. Crone-Crosby distances are computed on each KPCA performed on the resulting kernels or meta-kernels and can be displayed using the plotVar.kernel.pca function.
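The permutation idea can be sketched in base R on simulated data. This is a simplified illustration under stated assumptions, not the package's implementation: it uses a centred linear kernel, permutes one variable at a time, keeps only the first axis, and relies on the hypothetical helpers first_axis and cc, neither of which exists in the package.

```r
set.seed(1)

# Simulated data: 40 samples, 3 variables; variable 1 carries most variance
X <- matrix(rnorm(40 * 3), ncol = 3)
X[, 1] <- 5 * X[, 1]

# Hypothetical helper: first KPCA axis (unit eigenvector) of a centred linear kernel
first_axis <- function(X) {
  K <- tcrossprod(scale(X, scale = FALSE))
  eigen(K, symmetric = TRUE)$vectors[, 1, drop = FALSE]
}

# Hypothetical helper: Crone-Crosby distance between two unit-norm axes,
# assumed to be the projector difference in Frobenius norm over sqrt(2)
cc <- function(a, b) {
  norm(tcrossprod(a) - tcrossprod(b), type = "F") / sqrt(2)
}

reference <- first_axis(X)

# Permute each variable in turn, recompute the axis, measure how far it moved
importance <- sapply(seq_len(ncol(X)), function(j) {
  Xp <- X
  Xp[, j] <- sample(Xp[, j])
  cc(reference, first_axis(Xp))
})
names(importance) <- paste0("V", seq_len(ncol(X)))
```

A variable whose permutation moves the axis far from the reference gets a large distance and is read as important for that axis. In this simulation the dominant variable V1 receives the largest value.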

directory

character. Path to a directory in which temporary computed kernels are stored and from which they are read back, in order to limit the computational burden. Default: NULL.

Value

kernel.pca.permute returns a copy of the input kpca.result and adds values to three entries: cc.distances, cc.variables and cc.blocks.

References

Mariette J. and Villa-Vialaneix N. (2018). Unsupervised multiple kernel learning for heterogeneous data integration. Bioinformatics, 34(6), 1009-1015.

Crone L. and Crosby D. (1995). Statistical applications of a metric on subspaces to satellite meteorology. Technometrics, 37(3), 324-328.

Author

Jerome Mariette <jerome.mariette@inrae.fr>

Nathalie Vialaneix <nathalie.vialaneix@inrae.fr>

Examples

data(TARAoceans)

# compute one kernel for the phychem dataset
phychem.kernel <- compute.kernel(TARAoceans$phychem, kernel.func = "linear")
# perform a KPCA
kernel.pca.result <- kernel.pca(phychem.kernel)

# compute importance for all variables in this kernel
kernel.pca.result <- kernel.pca.permute(kernel.pca.result, phychem = colnames(TARAoceans$phychem))