learnUMAP — learnUMAP • cyCONDOR

Projects new samples onto a UMAP previously calculated for a reference data set. The reference UMAP must have been calculated with the same parameters that are used here for the new samples. Before calling this function, run runUMAP with ret_model = TRUE on the reference data set.

Usage

learnUMAP(
  fcd,
  input_type,
  data_slot,
  nPC = ncol(fcd[[input_type]][[data_slot]]),
  markers = colnames(fcd$expr[[data_slot]]),
  discard = FALSE,
  fcd_model,
  nEpochs = 100,
  prefix = NULL,
  nThreads = 32,
  seed = 91
)
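The two-step workflow described above (reference UMAP first, then projection) can be sketched as a small wrapper. This is a minimal sketch, not an official example: project_onto_reference is a hypothetical helper, both inputs are assumed to be cyCONDOR fcd objects that already contain the expr/orig slot, and the runUMAP arguments fcd, input_type and data_slot are assumed to mirror those of learnUMAP. Only ret_model = TRUE is confirmed by this page.

```r
# Hypothetical wrapper (not part of cyCONDOR): builds the reference UMAP
# with its model retained, then projects the new samples onto it.
project_onto_reference <- function(fcd_new, fcd_ref,
                                   input_type = "expr",
                                   data_slot = "orig") {
  # Step 1: calculate the reference UMAP and keep the model.
  # Argument names other than ret_model are an assumption.
  fcd_ref <- cyCONDOR::runUMAP(
    fcd = fcd_ref,
    input_type = input_type,
    data_slot = data_slot,
    ret_model = TRUE
  )

  # Step 2: project the new samples using the same input_type and data_slot.
  fcd_new <- cyCONDOR::learnUMAP(
    fcd = fcd_new,
    input_type = input_type,
    data_slot = data_slot,
    fcd_model = fcd_ref,
    nEpochs = 100,
    nThreads = 4,
    seed = 91
  )

  list(reference = fcd_ref, projected = fcd_new)
}
```

Passing the same input_type and data_slot to both calls keeps the projection consistent with the reference. If the reference UMAP was built with specific markers or nPC, the same values would also need to be passed to learnUMAP.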

Arguments

fcd: Flow cytometry dataset for which the UMAP coordinates should be predicted.
input_type: Data to use for the calculation of the UMAP, e.g. expr or pca. This should be the same input type that was used to calculate the UMAP of the reference data set.
data_slot: Name of the input_type data slot to use, e.g. orig if no prefix was added. This should be the same data slot that was used to calculate the UMAP of the reference data set.
nPC: Number of PCs used for the UMAP projection. Default = all. The number of PCs should be the same as was used to calculate the UMAP of the reference data set. To verify, check the name of the UMAP in your reference data set, e.g. via fcd_model$umap$your_umap_name.
markers: Vector of marker names to include in or exclude from the UMAP projection, according to the discard parameter. The markers should be the same as were used to calculate the UMAP of the reference data set. Use the function used_markers to check which markers were used to calculate the UMAP of your fcd_model.
discard: Logical. If FALSE, the specified markers are included in the UMAP projection; if TRUE, they are excluded. Default = FALSE.
fcd_model: Flow cytometry reference data set containing data associated with an existing embedding in fcd_model$extras.
nEpochs: Number of epochs to use during the optimization of the embedded coordinates. A value between 30 and 100 is a reasonable trade-off between speed and thoroughness. Default = 100.
prefix: Prefix for the name of the dimensionality reduction.
nThreads: Number of threads to use (except during stochastic gradient descent). Default = 32.
seed: Seed set for reproducibility. Default = 91.
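The interplay between markers and discard can be illustrated with plain base R. This is a minimal sketch of the include/exclude semantics as described in the argument list, not the package's actual implementation. select_markers is a hypothetical helper and the marker names and values are made up for illustration.

```r
# Hypothetical helper (not part of cyCONDOR) mimicking the documented
# behaviour: discard = FALSE keeps the listed markers, discard = TRUE drops them.
select_markers <- function(expr, markers, discard = FALSE) {
  # All requested markers must exist as columns
  stopifnot(all(markers %in% colnames(expr)))
  keep <- if (discard) setdiff(colnames(expr), markers) else markers
  expr[, keep, drop = FALSE]
}

# Toy expression table: two cells, three markers (illustrative values only)
expr <- data.frame(CD3 = c(1.2, 0.4), CD4 = c(0.8, 2.1), CD8 = c(0.1, 1.9))

included <- select_markers(expr, markers = c("CD3", "CD4"), discard = FALSE)
excluded <- select_markers(expr, markers = c("CD3", "CD4"), discard = TRUE)

colnames(included)
colnames(excluded)
```

Whichever mode is used, the resulting set of columns should match the set used for the reference UMAP, since the projection relies on the same feature space.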

Value

learnUMAP() returns an fcd with the predicted UMAP coordinates saved in fcd$umap$expr_orig, if no prefix was set.

Details

learnUMAP

learnUMAP() uses umap_transform to project the new samples contained in fcd onto the embedding previously calculated with runUMAP for a reference data set, fcd_model.
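The underlying mechanism can be sketched directly with the uwot package, which provides umap_transform. This is an illustration under stated assumptions rather than the cyCONDOR code path: the matrices are synthetic stand-ins for reference and new expression data, and the parameter values (n_neighbors, n_threads) are chosen only to keep the example small.

```r
# Synthetic stand-ins for reference and new data (cells x markers);
# real input would come from the fcd objects.
set.seed(91)
marker_names <- paste0("marker", 1:5)
ref_data <- matrix(rnorm(500 * 5), ncol = 5, dimnames = list(NULL, marker_names))
new_data <- matrix(rnorm(100 * 5), ncol = 5, dimnames = list(NULL, marker_names))

# Build the reference embedding and keep the model (ret_model = TRUE),
# which is what makes a later projection possible.
ref_model <- uwot::umap(ref_data, n_neighbors = 15, ret_model = TRUE, n_threads = 1)

# Project the new cells onto the existing embedding.
new_embedding <- uwot::umap_transform(new_data, ref_model, n_epochs = 100, n_threads = 1)

dim(new_embedding)
```

The projected coordinates have one row per new cell and live in the coordinate system of the reference embedding, so both sets of cells can be plotted together.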