Computes the Earth Mover's Distance (EMD) between all pairs of flowFrames belonging to a flowSet. This method supports three input modes:
the user directly provides a flowCore::flowSet loaded in memory (RAM).
the user directly provides a list of expression matrices loaded in RAM, whose column names are the channel/marker names.
the user provides (1.) a number of samples nSamples; (2.) an ad-hoc function that takes as input an index between 1 and nSamples, and codes the method to load the corresponding expression matrix in memory.
Optional row and column ranges can be provided to limit the calculation to a specific rectangle of the distance matrix. For example, they can be used to split the heavy calculation of a large distance matrix across several computation nodes.
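The third input mode relies on a user-written loader function. The following is a minimal sketch of such a loader, using base R only. The file layout (one .rds file per sample in a temporary directory), the function name loadFromDisk and its dir argument are assumptions made for illustration; only the requirement that the first argument be named exprMatrixIndex comes from this page.

```r
# Hypothetical setup: three small expression matrices saved as .rds files.
# The directory and file naming scheme are assumptions made for this sketch.
dataDir <- file.path(tempdir(), "exprMatrices")
dir.create(dataDir, showWarnings = FALSE)
nSamples <- 3
set.seed(42)
for (i in seq_len(nSamples)) {
    m <- matrix(rnorm(200), ncol = 2,
                dimnames = list(NULL, c("FSC-A", "SSC-A")))
    saveRDS(m, file.path(dataDir, sprintf("sample_%d.rds", i)))
}

# Loader: the index is the first argument and is named exprMatrixIndex.
# `dir` is an additional (hypothetical) argument.
loadFromDisk <- function(exprMatrixIndex, dir) {
    readRDS(file.path(dir, sprintf("sample_%d.rds", exprMatrixIndex)))
}

# Quick check: load the second matrix on its own
m2 <- loadFromDisk(2, dir = dataDir)
dim(m2)
colnames(m2)
```

Given the signature documented here, such a loader would presumably be passed as pairwiseEMDDist(x = nSamples, loadExprMatrixFUN = loadFromDisk, loadExprMatrixFUNArgs = list(dir = dataDir)), so that only the matrices needed for the current pair have to be held in memory.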
Usage
pairwiseEMDDist(
  x,
  rowRange = c(1, nSamples),
  colRange = c(min(rowRange), nSamples),
  loadExprMatrixFUN = NULL,
  loadExprMatrixFUNArgs = NULL,
  channels = NULL,
  verbose = FALSE,
  BPPARAM = BiocParallel::SerialParam(),
  BPOPTIONS = BiocParallel::bpoptions(packages = c("flowCore")),
  binSize = 0.05,
  minRange = -10,
  maxRange = 10
)
Arguments
- x
can be:
a flowCore::flowSet
a list of expression matrices (Double matrix with named columns)
the number of samples (integer >=1)
- rowRange
the range of rows of the distance matrix to be calculated
- colRange
the range of columns of the distance matrix to be calculated
- loadExprMatrixFUN
the function used to translate an integer index into an expression matrix. In other words, the function should code how to load the index-th expression matrix into memory. IMPORTANT: the expression matrix index should be the first function argument and should be named exprMatrixIndex.
- loadExprMatrixFUNArgs
(optional) a named list containing additional input parameters of loadExprMatrixFUN()
- channels
which channels to use (integer index(es) or character(s)):
if it is a character vector, it can refer to either the channel names or the marker names
if it is a numeric vector, it refers to the indexes of channels in fs
if NULL, all scatter and fluorescent channels of fs will be selected
- verbose
if TRUE, output a message after each single distance calculation
- BPPARAM
sets the BPPARAM back-end to be used for the computation. If not provided, will use BiocParallel::SerialParam() (no task parallelization)
- BPOPTIONS
sets the BPOPTIONS to be passed to the bplapply() function. Note that if you use a SnowParam back-end, you need to specify all the packages that need to be loaded for the different CytoProcessingStep to work properly (visibility of functions). As a minimum, the flowCore package needs to be loaded (hence the default BPOPTIONS = bpoptions(packages = c("flowCore"))).
- binSize
size of equal bins to approximate the marginal distributions.
- minRange
minimum value taken when approximating the marginal distributions
- maxRange
maximum value taken when approximating the marginal distributions
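To make the roles of binSize, minRange and maxRange more concrete, here is a sketch of one common way to approximate a one-dimensional EMD from binned marginal distributions, written in base R. It is an illustration under stated assumptions, not necessarily the computation performed by pairwiseEMDDist(): the helper names (binnedDensity, emd1D, emdMarginals), the handling of out-of-range values and the summation over channels are all assumptions.

```r
# Proportion of values falling in each bin of width binSize
# on the interval [minRange, maxRange].
binnedDensity <- function(v, binSize, minRange, maxRange) {
    breaks <- seq(minRange, maxRange, by = binSize)
    # all.inside = TRUE assigns out-of-range values to the first or last bin
    # (an assumption of this sketch)
    idx <- findInterval(v, breaks, all.inside = TRUE)
    tabulate(idx, nbins = length(breaks) - 1) / length(v)
}

# In one dimension, the EMD equals the area between the two cumulative
# distributions, approximated here on the binned grid.
emd1D <- function(a, b, binSize = 0.05, minRange = -10, maxRange = 10) {
    pa <- binnedDensity(a, binSize, minRange, maxRange)
    pb <- binnedDensity(b, binSize, minRange, maxRange)
    sum(abs(cumsum(pa - pb))) * binSize
}

# Assumption: channels are combined by summing the per-channel 1D distances.
emdMarginals <- function(m1, m2, ...) {
    sum(vapply(colnames(m1),
               function(ch) emd1D(m1[, ch], m2[, ch], ...),
               numeric(1)))
}

set.seed(1)
m1 <- matrix(rnorm(2000), ncol = 2,
             dimnames = list(NULL, c("FSC-A", "SSC-A")))
m2 <- m1
m2[, "FSC-A"] <- m2[, "FSC-A"] + 1  # shift one channel by one unit
d <- emdMarginals(m1, m2)           # expected to be close to 1
```

In this sketch a smaller binSize gives a finer approximation at the cost of more bins, while minRange and maxRange should enclose the (transformed) data, since values outside that interval are lumped into the boundary bins.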
Examples
library(CytoPipeline)
data(OMIP021Samples)
# estimate scale transformations
# and transform the whole OMIP021Samples
transList <- estimateScaleTransforms(
    ff = OMIP021Samples[[1]],
    fluoMethod = "estimateLogicle",
    scatterMethod = "linearQuantile",
    scatterRefMarker = "BV785 - CD3")
OMIP021Trans <- CytoPipeline::applyScaleTransforms(
    OMIP021Samples,
    transList)
# calculate pairwise distances using only FSC-A & SSC-A channels
pwDist <- pairwiseEMDDist(
    x = OMIP021Trans,
    channels = c("FSC-A", "SSC-A"))
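The rowRange and colRange arguments can be used to split a large distance matrix into rectangles, one per computation node. The sketch below, in base R, shows one possible way to derive such rectangles so that together they cover the upper triangle of the matrix. The helper name makeBlocks and the choice of contiguous row blocks are assumptions; the column range follows the default colRange = c(min(rowRange), nSamples) shown in the usage.

```r
# Split an nSamples x nSamples distance matrix into nBlocks row blocks.
# Each block only needs the columns from its first row onwards,
# because distances below the diagonal mirror those above it.
makeBlocks <- function(nSamples, nBlocks) {
    # Assign each row index to one of nBlocks contiguous groups
    groups <- split(seq_len(nSamples),
                    ceiling(seq_len(nSamples) * nBlocks / nSamples))
    lapply(groups, function(rows) {
        list(rowRange = range(rows),
             colRange = c(min(rows), nSamples))
    })
}

blocks <- makeBlocks(nSamples = 10, nBlocks = 3)
for (b in blocks) {
    message("rows ", b$rowRange[1], "-", b$rowRange[2],
            ", columns ", b$colRange[1], "-", b$colRange[2])
}
```

Each element could then be passed to a separate call, for example pairwiseEMDDist(x = OMIP021Trans, rowRange = b$rowRange, colRange = b$colRange). Note that contiguous row blocks carry unequal workloads (earlier blocks span more columns), and that how the partial results are assembled afterwards is not covered by this sketch.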