Compute a kernel from a given data matrix.

compute.kernel(X, kernel.func = "linear", ..., test.pos.semidef = FALSE)

Arguments

X

a numeric matrix (or data frame) used to compute the kernel. Missing values (NA) are not allowed.

kernel.func

the kernel function to use. This parameter can be set to any user-defined kernel function. Widely used kernel functions are pre-implemented and can be selected by setting kernel.func to one of the following strings: "kidentity", "abundance", "linear", "gaussian.radial.basis", "poisson" or "phylogenetic". Default: "linear".

...

the arguments passed to the kernel function. Valid arguments for the pre-implemented kernels are:

  • phylogenetic.tree ("phylogenetic"): an instance of phylo-class that contains a phylogenetic tree (required).

  • scale ("linear" or "gaussian.radial.basis"): logical. Should the variables be scaled to unit variance prior to the kernel computation? Default: TRUE.

  • sigma ("gaussian.radial.basis"): double. The inverse kernel width used by "gaussian.radial.basis".

  • method ("phylogenetic" or "abundance"): character. For "phylogenetic", either "unifrac" or "wunifrac". For "abundance", the dissimilarity to use: one of "bray", "euclidean", "canberra", "manhattan", "kulczynski", "jaccard", "gower", "altGower", "morisita", "horn", "mountford", "raup", "binomial", "chao" and "cao".

  • normalization ("poisson"): character. Can be "deseq" (more robust), "mle" (less robust) or "quantile".
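To make the roles of scale and sigma more concrete, the following base R sketch computes a linear and a Gaussian radial basis kernel by hand. It is an illustration only, not the package's implementation: the helpers linear_kernel and rbf_kernel are hypothetical names, the Gaussian form exp(-sigma * ||x - y||^2) is an assumption derived from the "inverse kernel width" description, and centering the variables along with scaling them is also an assumption.

```r
# Hypothetical helpers written in base R for illustration only;
# they are not the package's internal code.

# Linear kernel: inner products between the rows (observations) of X.
linear_kernel <- function(X, scale = TRUE) {
  X <- as.matrix(X)
  # Assumption: scaling also centers each variable (base::scale default).
  if (scale) X <- base::scale(X)
  tcrossprod(X)
}

# Gaussian radial basis kernel, assuming k(x, y) = exp(-sigma * ||x - y||^2),
# so that a larger sigma gives a narrower kernel.
rbf_kernel <- function(X, sigma = 1, scale = TRUE) {
  X <- as.matrix(X)
  if (scale) X <- base::scale(X)
  # Squared Euclidean distances between all pairs of rows.
  d2 <- as.matrix(dist(X))^2
  exp(-sigma * d2)
}

set.seed(1)
X <- matrix(rnorm(20), nrow = 5)   # 5 observations, 4 variables
K_lin <- linear_kernel(X)
K_rbf <- rbf_kernel(X, sigma = 0.5)
dim(K_lin)                         # one row and one column per observation
```

Both results are square, symmetric matrices with one row and one column per observation, which is the shape expected of a kernel matrix.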

test.pos.semidef

logical. If test.pos.semidef = TRUE, the resulting matrix is tested for positive semidefiniteness. Default: FALSE.
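As a rough idea of what such a test involves, a symmetric matrix is positive-semidefinite when none of its eigenvalues is negative. The base R sketch below checks this with a small numerical tolerance. The helper is_pos_semidef and its tol argument are hypothetical, and the test actually performed by compute.kernel may differ.

```r
# Hypothetical helper: checks positive semidefiniteness through the eigenvalues.
is_pos_semidef <- function(K, tol = 1e-8) {
  K <- as.matrix(K)
  # A kernel matrix must be symmetric in the first place.
  if (!isSymmetric(unname(K))) return(FALSE)
  ev <- eigen(K, symmetric = TRUE, only.values = TRUE)$values
  # Tolerate tiny negative eigenvalues caused by floating-point error.
  all(ev >= -tol * max(abs(ev)))
}

set.seed(42)
X <- matrix(rnorm(30), nrow = 6)
psd_gram <- is_pos_semidef(tcrossprod(X))            # Gram matrices are PSD
psd_bad <- is_pos_semidef(matrix(c(1, 2, 2, 1), 2))  # eigenvalues 3 and -1
```

The relative tolerance matters in practice: a rank-deficient Gram matrix has eigenvalues that are zero in theory but can come out slightly negative numerically.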

Value

compute.kernel returns an object of class "kernel", a list that contains the following components:

kernel

: the computed kernel matrix.

X

: the original dataset. If kernel.func is "kidentity", X is set to NULL.

kernel.func

: the kernel function used.

kernel.args

: the arguments used to compute the kernel.

References

Lozupone C. and Knight R. (2005). UniFrac: a new phylogenetic method for comparing microbial communities. Applied and Environmental Microbiology, 71(12), 8228-8235.

Lozupone C., Hamady M., Kelley S.T. and Knight R. (2007). Quantitative and qualitative beta diversity measures lead to different insights into factors that structure microbial communities. Applied and Environmental Microbiology, 73(5), 1576-1585.

Witten D. (2011). Classification and clustering of sequencing data using a Poisson model. Annals of Applied Statistics, 5(4), 2493-2518.

Author

Jerome Mariette <jerome.mariette@inrae.fr>

Nathalie Vialaneix <nathalie.vialaneix@inrae.fr>

Examples

# Load the TARAoceans example datasets
data(TARAoceans)
# Compute an "abundance" kernel from the pro.NOGs dataset
pro.NOGs.kernel <- compute.kernel(TARAoceans$pro.NOGs, kernel.func = "abundance")