How to load data

cyCONDOR provides an integrated function to prepare a condor object (flow cytometry dataset) from input files in either .fcs or .csv format. All files should be saved in a single directory, whose path is given in data_path. The number of cells to process from each file is set with max_cell. If the input files are in .csv format, useCSV should be set to TRUE. Keep in mind that currently all files in the data_path folder are loaded, regardless of whether they are listed in the annotation table. This can induce slight differences in the auto-logicle transformation; to avoid this, keep only the files you plan to analyse in data_path.
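Because every file in the folder is loaded, it can help to compare the folder contents with the annotation table before building the object. The following is a minimal base R sketch of such a check; check_input_files() is a hypothetical helper and not part of cyCONDOR, and the demo files are empty placeholders created in a temporary directory.

```r
# Hypothetical helper (not a cyCONDOR function): compare the files found in
# data_path with the file names listed in the annotation table.
check_input_files <- function(data_path, anno_table, filename_col, useCSV = FALSE) {
  pattern <- if (useCSV) "\\.csv$" else "\\.fcs$"
  on_disk <- list.files(data_path, pattern = pattern, ignore.case = TRUE)
  anno <- read.csv(anno_table, stringsAsFactors = FALSE)
  annotated <- anno[[filename_col]]
  list(
    not_annotated = setdiff(on_disk, annotated),  # in folder, not in table
    not_found     = setdiff(annotated, on_disk)   # in table, not in folder
  )
}

# Self-contained demo: empty placeholder files in a temporary directory.
demo_dir <- file.path(tempdir(), "fcs_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("ID1.fcs", "ID2.fcs", "extra.fcs"))))

demo_anno <- file.path(tempdir(), "metadata_demo.csv")
write.csv(data.frame(filename = c("ID1.fcs", "ID2.fcs", "ID3.fcs")),
          demo_anno, row.names = FALSE)

res <- check_input_files(demo_dir, demo_anno, filename_col = "filename")
print(res)
```

Any file reported under not_annotated would still be read from the folder, so moving it elsewhere before loading avoids the effect described above.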

For data transformation cyCONDOR provides different options:

  • auto_logi: Recommended for HDFC and spectral flow data (auto-logicle transformation). This transformation also gives good results with cyTOF data, especially if you are experiencing a lot of noise with arcsinh due to negative values. The auto-logicle transformation is inherited from the Cytofkit package (Chen et al. 2016).
  • clr: Recommended for CITE-seq data (centered log ratio transformation).
  • arcsinh: arcsinh transformation with co-factor 5, a common transformation for cyTOF data.
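To give a feel for what two of these options compute, here is a small base R sketch. It is an illustration under assumptions, not the code cyCONDOR runs internally: the function names are hypothetical, the input values are made up, and the CLR shown is one common variant (a pseudocount of 1, centred within a single vector). The auto-logicle transformation is omitted because its parameters are estimated from the data.

```r
# Arcsinh transformation with a co-factor (5 is the value named in the list above).
arcsinh_transform <- function(x, cofactor = 5) {
  asinh(x / cofactor)
}

# One common centered log ratio variant: log1p of each value minus the mean
# log1p value of the vector. The pseudocount of 1 tolerates zero counts.
# Illustration only; the exact definition used by a given package may differ.
clr_transform <- function(x) {
  lx <- log1p(x)
  lx - mean(lx)
}

raw_intensity <- c(-10, 0, 5, 50, 500, 5000)  # made-up signal values
print(round(arcsinh_transform(raw_intensity), 3))

antibody_counts <- c(0, 3, 12, 150)           # made-up count values
print(round(clr_transform(antibody_counts), 3))
```

Arcsinh is close to linear near zero and close to logarithmic for large values, which is why it remains defined for the negative values that compensation can produce.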

The last important piece needed to build a condor object is the annotation table. It should contain all metadata needed for the analysis as well as a column with the names of the input files, and should be supplied as a .csv file. The name of the column containing the file names is given in filename_col. An example metadata table is shown below.

read.csv("../.test_files/metadata.csv")
##    filename sample_ID group batch
## 1   ID1.fcs       ID1  ctrl  Day1
## 2   ID2.fcs       ID2   pat  Day1
## 3   ID3.fcs       ID3  ctrl  Day2
## 4   ID4.fcs       ID4   pat  Day2
## 5   ID5.fcs       ID5  ctrl  Day2
## 6   ID6.fcs       ID6   pat  Day2
## 7   ID7.fcs       ID7  ctrl  Day3
## 8   ID8.fcs       ID8   pat  Day3
## 9   ID9.fcs       ID9  ctrl  Day3
## 10 ID10.fcs      ID10   pat  Day3
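If no annotation file exists yet, a table like the one above can be assembled in R and written to disk. This is a minimal sketch using base R only; the output location in a temporary directory is an assumption made for the example.

```r
# Rebuild the example annotation table shown above.
ids <- paste0("ID", 1:10)
metadata <- data.frame(
  filename  = paste0(ids, ".fcs"),
  sample_ID = ids,
  group     = rep(c("ctrl", "pat"), times = 5),
  batch     = rep(c("Day1", "Day2", "Day3"), times = c(2, 4, 4))
)

# Hypothetical output location; row.names = FALSE avoids an extra index column.
out_file <- file.path(tempdir(), "metadata.csv")
write.csv(metadata, out_file, row.names = FALSE)

# Read the file back to confirm the layout.
reloaded <- read.csv(out_file)
print(reloaded)
```

Writing without row names keeps filename as the first column, so the file matches the layout printed above.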

Parameters that are not needed for the downstream analysis (e.g. Time) can be removed by listing them in remove_param. prep_fcd also takes a seed for reproducibility, since the subsetting to max_cell cells is otherwise random.

condor <- prep_fcd(data_path = "../.test_files/fcs/", 
                   max_cell = 1000, 
                   useCSV = FALSE, 
                   transformation = "auto_logi", 
                   remove_param = c("FSC-H", "SSC-H", "FSC-W", "SSC-W", "Time"), 
                   anno_table = "../.test_files/metadata.csv", 
                   filename_col = "filename",
                   seed = 91)
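The role of the seed can be illustrated with a small base R sketch. This is not the internal code of prep_fcd: subsample_events() is a hypothetical function, and the simulated matrix with made-up marker names stands in for the events of one file.

```r
# Hypothetical stand-in for the events of one file: 5000 cells, 3 markers.
events <- matrix(rnorm(5000 * 3), ncol = 3,
                 dimnames = list(NULL, c("CD3", "CD4", "CD8")))

# Draw at most max_cell rows; fixing the seed makes the draw repeatable.
subsample_events <- function(x, max_cell, seed) {
  set.seed(seed)
  if (nrow(x) <= max_cell) {
    return(x)
  }
  x[sample(nrow(x), max_cell), , drop = FALSE]
}

first_run  <- subsample_events(events, max_cell = 1000, seed = 91)
second_run <- subsample_events(events, max_cell = 1000, seed = 91)

# The same seed selects the same cells in both runs.
print(identical(first_run, second_run))
```

Without a fixed seed, repeated runs would generally select different cells, and downstream results could vary slightly between runs.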

Session Info

info <- sessionInfo()

info
## R version 4.3.1 (2023-06-16)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] cyCONDOR_0.2.0
## 
## loaded via a namespace (and not attached):
##   [1] IRanges_2.34.1              Rmisc_1.5.1                
##   [3] urlchecker_1.0.1            nnet_7.3-19                
##   [5] CytoNorm_2.0.1              TH.data_1.1-2              
##   [7] vctrs_0.6.4                 digest_0.6.33              
##   [9] png_0.1-8                   shape_1.4.6                
##  [11] proxy_0.4-27                slingshot_2.8.0            
##  [13] ggrepel_0.9.4               parallelly_1.36.0          
##  [15] MASS_7.3-60                 pkgdown_2.0.7              
##  [17] reshape2_1.4.4              httpuv_1.6.12              
##  [19] foreach_1.5.2               BiocGenerics_0.46.0        
##  [21] withr_2.5.1                 ggrastr_1.0.2              
##  [23] xfun_0.40                   ggpubr_0.6.0               
##  [25] ellipsis_0.3.2              survival_3.5-7             
##  [27] memoise_2.0.1               hexbin_1.28.3              
##  [29] ggbeeswarm_0.7.2            RProtoBufLib_2.12.1        
##  [31] princurve_2.1.6             profvis_0.3.8              
##  [33] ggsci_3.0.0                 systemfonts_1.0.5          
##  [35] ragg_1.2.6                  zoo_1.8-12                 
##  [37] GlobalOptions_0.1.2         DEoptimR_1.1-3             
##  [39] Formula_1.2-5               prettyunits_1.2.0          
##  [41] promises_1.2.1              scatterplot3d_0.3-44       
##  [43] rstatix_0.7.2               globals_0.16.2             
##  [45] ps_1.7.5                    rstudioapi_0.15.0          
##  [47] miniUI_0.1.1.1              generics_0.1.3             
##  [49] ggcyto_1.28.1               base64enc_0.1-3            
##  [51] processx_3.8.2              curl_5.1.0                 
##  [53] S4Vectors_0.38.2            zlibbioc_1.46.0            
##  [55] flowWorkspace_4.12.2        polyclip_1.10-6            
##  [57] randomForest_4.7-1.1        GenomeInfoDbData_1.2.10    
##  [59] RBGL_1.76.0                 ncdfFlow_2.46.0            
##  [61] RcppEigen_0.3.3.9.4         xtable_1.8-4               
##  [63] stringr_1.5.0               desc_1.4.2                 
##  [65] doParallel_1.0.17           evaluate_0.22              
##  [67] S4Arrays_1.0.6              hms_1.1.3                  
##  [69] glmnet_4.1-8                GenomicRanges_1.52.1       
##  [71] irlba_2.3.5.1               colorspace_2.1-0           
##  [73] harmony_1.1.0               reticulate_1.34.0          
##  [75] readxl_1.4.3                magrittr_2.0.3             
##  [77] lmtest_0.9-40               readr_2.1.4                
##  [79] Rgraphviz_2.44.0            later_1.3.1                
##  [81] lattice_0.22-5              future.apply_1.11.0        
##  [83] robustbase_0.99-0           XML_3.99-0.15              
##  [85] cowplot_1.1.1               matrixStats_1.1.0          
##  [87] xts_0.13.1                  class_7.3-22               
##  [89] Hmisc_5.1-1                 pillar_1.9.0               
##  [91] nlme_3.1-163                iterators_1.0.14           
##  [93] compiler_4.3.1              RSpectra_0.16-1            
##  [95] stringi_1.7.12              gower_1.0.1                
##  [97] minqa_1.2.6                 SummarizedExperiment_1.30.2
##  [99] lubridate_1.9.3             devtools_2.4.5             
## [101] CytoML_2.12.0               plyr_1.8.9                 
## [103] crayon_1.5.2                abind_1.4-5                
## [105] locfit_1.5-9.8              sp_2.1-1                   
## [107] sandwich_3.0-2              pcaMethods_1.92.0          
## [109] dplyr_1.1.3                 codetools_0.2-19           
## [111] multcomp_1.4-25             textshaping_0.3.7          
## [113] recipes_1.0.8               openssl_2.1.1              
## [115] Rphenograph_0.99.1          TTR_0.24.3                 
## [117] bslib_0.5.1                 e1071_1.7-13               
## [119] destiny_3.14.0              GetoptLong_1.0.5           
## [121] ggplot.multistats_1.0.0     mime_0.12                  
## [123] splines_4.3.1               circlize_0.4.15            
## [125] Rcpp_1.0.11                 sparseMatrixStats_1.12.2   
## [127] cellranger_1.1.0            knitr_1.44                 
## [129] utf8_1.2.4                  clue_0.3-65                
## [131] lme4_1.1-35.1               fs_1.6.3                   
## [133] listenv_0.9.0               checkmate_2.3.0            
## [135] DelayedMatrixStats_1.22.6   pkgbuild_1.4.2             
## [137] ggsignif_0.6.4              tibble_3.2.1               
## [139] Matrix_1.6-1.1              rpart.plot_3.1.1           
## [141] callr_3.7.3                 tzdb_0.4.0                 
## [143] tweenr_2.0.2                pkgconfig_2.0.3            
## [145] pheatmap_1.0.12             tools_4.3.1                
## [147] cachem_1.0.8                smoother_1.1               
## [149] fastmap_1.1.1               rmarkdown_2.25             
## [151] scales_1.2.1                grid_4.3.1                 
## [153] usethis_2.2.2               broom_1.0.5                
## [155] sass_0.4.7                  graph_1.78.0               
## [157] carData_3.0-5               RANN_2.6.1                 
## [159] rpart_4.1.21                farver_2.1.1               
## [161] yaml_2.3.7                  MatrixGenerics_1.12.3      
## [163] foreign_0.8-85              ggthemes_4.2.4             
## [165] cli_3.6.1                   purrr_1.0.2                
## [167] stats4_4.3.1                lifecycle_1.0.3            
## [169] uwot_0.1.16                 askpass_1.2.0              
## [171] caret_6.0-94                Biobase_2.60.0             
## [173] mvtnorm_1.2-3               lava_1.7.3                 
## [175] sessioninfo_1.2.2           backports_1.4.1            
## [177] cytolib_2.12.1              timechange_0.2.0           
## [179] gtable_0.3.4                rjson_0.2.21               
## [181] umap_0.2.10.0               ggridges_0.5.4             
## [183] parallel_4.3.1              pROC_1.18.5                
## [185] limma_3.56.2                jsonlite_1.8.7             
## [187] edgeR_3.42.4                RcppHNSW_0.5.0             
## [189] bitops_1.0-7                ggplot2_3.4.4              
## [191] Rtsne_0.16                  FlowSOM_2.8.0              
## [193] ranger_0.16.0               flowCore_2.12.2            
## [195] jquerylib_0.1.4             timeDate_4022.108          
## [197] shiny_1.7.5.1               ConsensusClusterPlus_1.64.0
## [199] htmltools_0.5.6.1           diffcyt_1.20.0             
## [201] glue_1.6.2                  XVector_0.40.0             
## [203] VIM_6.2.2                   RCurl_1.98-1.13            
## [205] rprojroot_2.0.3             gridExtra_2.3              
## [207] boot_1.3-28.1               TrajectoryUtils_1.8.0      
## [209] igraph_1.5.1                R6_2.5.1                   
## [211] tidyr_1.3.0                 SingleCellExperiment_1.22.0
## [213] vcd_1.4-11                  cluster_2.1.4              
## [215] pkgload_1.3.3               GenomeInfoDb_1.36.4        
## [217] ipred_0.9-14                nloptr_2.0.3               
## [219] DelayedArray_0.26.7         tidyselect_1.2.0           
## [221] vipor_0.4.5                 htmlTable_2.4.2            
## [223] ggforce_0.4.1               CytoDx_1.20.0              
## [225] car_3.1-2                   future_1.33.0              
## [227] ModelMetrics_1.2.2.2        munsell_0.5.0              
## [229] laeken_0.5.2                data.table_1.14.8          
## [231] htmlwidgets_1.6.2           ComplexHeatmap_2.16.0      
## [233] RColorBrewer_1.1-3          rlang_1.1.1                
## [235] remotes_2.4.2.1             colorRamps_2.3.1           
## [237] ggnewscale_0.4.9            fansi_1.0.5                
## [239] hardhat_1.3.0               beeswarm_0.4.0             
## [241] prodlim_2023.08.28