compute.kernel.Rd
Compute a kernel from a given data matrix.
compute.kernel(X, kernel.func = "linear", ..., test.pos.semidef = FALSE)
X: a numeric matrix (or data frame) used to compute the kernel. NAs are not allowed.
kernel.func: the kernel function to use. This parameter can be set to any user-defined kernel function. Widely used kernel functions are pre-implemented and can be selected by setting kernel.func to one of the following strings: "kidentity", "abundance", "linear", "gaussian.radial.basis", "poisson" or "phylogenetic". Default: "linear".
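As an illustration of what a user-defined kernel function might look like, the base R sketch below evaluates a function of two observation vectors on every pair of rows to build a Gram matrix. The names `my.poly.kernel` and `gram.matrix` are hypothetical, and the calling convention (two rows of X as the first two arguments) is an assumption that should be checked against the package source before passing such a function as kernel.func.

```r
# Hypothetical user-defined kernel: a polynomial kernel between two
# numeric vectors. The (x, y) calling convention is an assumption.
my.poly.kernel <- function(x, y, degree = 2) {
  (sum(x * y) + 1)^degree
}

# Hypothetical helper: evaluate a two-vector kernel on every pair of rows.
gram.matrix <- function(X, kernel.func, ...) {
  X <- as.matrix(X)
  n <- nrow(X)
  K <- matrix(0, n, n, dimnames = list(rownames(X), rownames(X)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      K[i, j] <- kernel.func(X[i, ], X[j, ], ...)
    }
  }
  K
}

# Toy data: 5 observations, 4 variables.
set.seed(1)
X.toy <- matrix(rnorm(20), nrow = 5)
K.toy <- gram.matrix(X.toy, my.poly.kernel, degree = 2)
```

The result is a symmetric 5 x 5 matrix with one row and one column per observation, which is the shape a kernel matrix is expected to have.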
...: the kernel function arguments. Valid parameters for pre-implemented kernels are:
phylogenetic.tree ("phylogenetic"): an instance of phylo-class that contains a phylogenetic tree (required).
scale ("linear" or "gaussian.radial.basis"): logical. Should the variables be scaled to unit variance prior to the kernel computation? Default: TRUE.
sigma ("gaussian.radial.basis"): double. The inverse kernel width used by "gaussian.radial.basis".
method ("phylogenetic" or "abundance"): character. For "phylogenetic", can be "unifrac" or "wunifrac". For "abundance", the dissimilarity to use: one of "bray", "euclidean", "canberra", "manhattan", "kulczynski", "jaccard", "gower", "altGower", "morisita", "horn", "mountford", "raup", "binomial", "chao" and "cao".
normalization ("poisson"): character. Can be "deseq" (more robust), "mle" (less robust) or "quantile".
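The base R sketch below illustrates what the scale and sigma arguments control for the "linear" and "gaussian.radial.basis" kernels. It assumes the usual textbook definitions, K = X X^T for the linear kernel and K[i, j] = exp(-sigma * ||x_i - x_j||^2) for the Gaussian kernel, and it assumes the variables are centered first. The helper names are hypothetical and the package's internal implementation may differ in such details.

```r
# Hypothetical helper: linear kernel on centered (optionally scaled) data.
linear.kernel.sketch <- function(X, scale = TRUE) {
  X <- base::scale(as.matrix(X), center = TRUE, scale = scale)
  tcrossprod(X)  # inner products between all pairs of rows
}

# Hypothetical helper: Gaussian radial basis kernel, with sigma acting as
# an inverse width (larger sigma means similarity decays faster).
gaussian.kernel.sketch <- function(X, sigma, scale = TRUE) {
  X <- base::scale(as.matrix(X), center = TRUE, scale = scale)
  sq.dist <- as.matrix(dist(X))^2  # squared Euclidean distances
  exp(-sigma * sq.dist)
}

# Toy data: 6 observations, 5 variables.
set.seed(42)
X.toy <- matrix(rnorm(30), nrow = 6)
K.lin <- linear.kernel.sketch(X.toy, scale = TRUE)
K.rbf <- gaussian.kernel.sketch(X.toy, sigma = 0.5, scale = TRUE)
```

With scale = TRUE every variable contributes on a comparable footing, which matters when the columns of X are measured in different units.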
test.pos.semidef: boolean. If test.pos.semidef = TRUE, the resulting matrix is tested for positive semidefiniteness.
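One common way to test positive semidefiniteness is to check that the matrix is symmetric and that its eigenvalues are non-negative up to a numerical tolerance. The sketch below does this in base R. The helper name `is.pos.semidef.sketch` and the tolerance are assumptions for illustration, not the test the package actually runs.

```r
# Hypothetical helper: eigenvalue-based positive semidefiniteness check.
is.pos.semidef.sketch <- function(K, tol = 1e-8) {
  K <- as.matrix(K)
  if (!isSymmetric(unname(K))) return(FALSE)
  eig <- eigen(K, symmetric = TRUE, only.values = TRUE)$values
  # Allow tiny negative eigenvalues caused by floating-point error.
  all(eig >= -tol * max(abs(eig), 1))
}

K.ok  <- tcrossprod(matrix(1:6, nrow = 3))  # Gram matrix, PSD by construction
K.bad <- matrix(c(1, 2, 2, 1), nrow = 2)    # eigenvalues 3 and -1

is.pos.semidef.sketch(K.ok)
is.pos.semidef.sketch(K.bad)
```

A relative tolerance is used because a rank-deficient Gram matrix such as `K.ok` typically has eigenvalues that are zero only up to rounding error.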
compute.kernel returns an object of class "kernel", a list that contains the following components:
: the computed kernel matrix.
: the original dataset. If the kernel function is "kidentity", X is set to NULL.
: the kernel function used.
: the arguments used to compute the kernel.
Lozupone C. and Knight R. (2005). UniFrac: a new phylogenetic method for comparing microbial communities. Applied and Environmental Microbiology, 71(12), 8228-8235.
Lozupone C., Hamady M., Kelley S.T. and Knight R. (2007). Quantitative and qualitative beta diversity measures lead to different insights into factors that structure microbial communities. Applied and Environmental Microbiology, 73(5), 1576-1585.
Witten D. (2011). Classification and clustering of sequencing data using a Poisson model. Annals of Applied Statistics, 5(4), 2493-2518.
library(mixKernel)  # assumed to provide compute.kernel and the TARAoceans dataset
data(TARAoceans)
pro.NOGs.kernel <- compute.kernel(TARAoceans$pro.NOGs, kernel.func = "abundance")