demuxEM as a command line tool
If you have data generated by cell-hashing or nucleus-hashing, you can use
demuxEM as a command line tool to demultiplex your data. Type:
    demuxEM -h
to see the usage information:
Usage:
    demuxEM [options] <input_raw_gene_bc_matrices_h5> <input_hto_csv_file> <output_name>
    demuxEM -h | --help
    demuxEM -v | --version
- input_raw_gene_bc_matrices_h5
Input raw RNA expression matrix in 10x hdf5 format.
- input_hto_csv_file
Input HTO (antibody tag) count matrix in CSV format.
- output_name
Output name. All outputs will use it as the prefix.
- -p <number>, --threads <number>
Number of threads. [default: 1]
- --genome <genome>
Reference genome name. If not provided, it is inferred from the expression matrix file.
- --alpha-on-samples <alpha>
The Dirichlet prior concentration parameter (alpha) on samples. An alpha value < 1.0 will make the prior sparse. [default: 0.0]
- --min-num-genes <number>
Only demultiplex cells/nuclei with at least <number> expressed genes. [default: 100]
- --min-num-umis <number>
Only demultiplex cells/nuclei with at least <number> UMIs. [default: 100]
- --min-signal-hashtag <count>
Any cell/nucleus with fewer than <count> hashtags from the signal will be marked as unknown. [default: 10.0]
- --random-state <seed>
The random seed used in the KMeans algorithm to separate empty ADT droplets from others. [default: 0]
- --generate-diagnostic-plots
Generate a series of diagnostic plots, including the background/signal separation of HTO counts, estimated background probabilities, and HTO distributions of cells and non-cells.
- --generate-gender-plot <genes>
Generate violin plots using gender-specific genes (e.g. Xist). <genes> is a comma-separated list of gene names.
- -h, --help
Print out help information.
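Several of the options above can be combined in one call. The following is a minimal sketch, not a recommended configuration: the threshold values and file names are placeholders chosen for illustration, and only the flag names come from the option list. It prints the assembled command and runs it only when demuxEM is installed and the input files exist.

```shell
# Placeholder thresholds and file names; only the flag names come from the option list.
set -- -p 8 \
  --min-num-genes 200 \
  --min-num-umis 200 \
  --min-signal-hashtag 10 \
  --random-state 0 \
  --generate-gender-plot Xist \
  sample_raw_gene_bc_matrices.h5 sample_hto.csv sample_output

# Show the full command line before running anything.
cmd="demuxEM $*"
echo "$cmd"

# Run only when demuxEM is on PATH and both input files are present.
if command -v demuxEM >/dev/null 2>&1 \
  && [ -f sample_raw_gene_bc_matrices.h5 ] && [ -f sample_hto.csv ]; then
  demuxEM "$@"
fi
```

Printing the command first makes it easy to check the option spelling and argument order against the usage line before committing to a long run.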
RNA expression matrix with demultiplexed sample identities in Zarr format.
DemuxEM-calculated results in Zarr format, containing two datasets, one for HTO and one for RNA.
Optional output. A histogram plot depicting hashtag distributions of empty droplets and non-empty droplets.
Optional output. A bar plot visualizing the estimated hashtag background probability distribution.
Optional output. A histogram plot depicting hashtag distributions of not-real-cells and real-cells as defined by total number of expressed genes in the RNA assay.
Optional output. This figure consists of two plots. The first one is a horizontal bar plot depicting the percentage of RNA barcodes with at least one HTO count. The second plot is a histogram plot depicting RNA UMI distribution for singlets, doublets and unknown cells.
Optional outputs. Violin plots depicting gender-specific gene expression across samples. Multiple plots are produced if a gene list is provided in the --generate-gender-plot option.
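Because every output shares the output name as its prefix, the files from one run can be collected with a simple glob. This is a rough sketch: `list_outputs` is a hypothetical helper, and it makes no assumption about the exact file names or extensions demuxEM writes.

```shell
# Hypothetical helper: print every existing path that starts with the given prefix.
list_outputs() {
  prefix=$1
  for f in "$prefix"*; do
    # An unmatched glob stays literal, so skip paths that do not exist.
    [ -e "$f" ] && echo "$f"
  done
  return 0
}

# Placeholder prefix; use the <output_name> passed to demuxEM.
list_outputs sample_output
```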
demuxEM -p 8 --generate-diagnostic-plots sample_raw_gene_bc_matrices.h5 sample_hto.csv sample_output
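When several hashed samples need demultiplexing, the same command can be wrapped in a loop. The sketch below assumes a hypothetical naming scheme (`<sample>_raw_gene_bc_matrices.h5` and `<sample>_hto.csv`) and hypothetical sample names; `run_demux` and the `DRY_RUN` variable are illustrative helpers, not part of demuxEM. By default it only prints the commands it would run.

```shell
# Hypothetical helper: demultiplex one sample, assuming the naming scheme
# <sample>_raw_gene_bc_matrices.h5 and <sample>_hto.csv.
run_demux() {
  sample=$1
  rna="${sample}_raw_gene_bc_matrices.h5"
  hto="${sample}_hto.csv"
  if [ ! -f "$rna" ] || [ ! -f "$hto" ]; then
    echo "skip $sample: missing input file" >&2
    return 1
  fi
  if [ "${DRY_RUN:-1}" = 1 ]; then
    # Dry run (the default): print the command instead of executing it.
    echo "demuxEM -p 8 --generate-diagnostic-plots $rna $hto ${sample}_output"
  else
    demuxEM -p 8 --generate-diagnostic-plots "$rna" "$hto" "${sample}_output"
  fi
}

# Placeholder sample names; samples with missing inputs are skipped.
for s in sampleA sampleB; do
  run_demux "$s" || true
done
```

Set `DRY_RUN=0` in the environment to execute the commands once the printed lines look right.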