Command-Line Interface (CLI)¶
The kompot CLI provides command-line access to differential expression (DE) and differential abundance (DA) analysis for pipeline integration and workflow automation.
Installation¶
The CLI is installed automatically with kompot:
pip install kompot
# or
mamba install -c bioconda kompot
Verify installation:
kompot --version
kompot --help
Overview¶
The CLI provides three main commands:
kompot dm - Compute diffusion maps (preprocessing with Palantir)
kompot de - Differential expression analysis
kompot da - Differential abundance analysis
All commands support:
Direct CLI arguments for common parameters
YAML/JSON config files for advanced parameters
Reading/writing .h5ad and .zarr AnnData formats
Quick Start¶
Complete Workflow¶
# 1. Compute diffusion maps (preprocessing)
kompot dm input.h5ad -o input_with_dm.h5ad \
    --pca-key X_pca \
    --n-components 10
# 2. Run differential expression
kompot de input_with_dm.h5ad -o de_results.h5ad \
    --groupby condition \
    --condition1 control \
    --condition2 treatment \
    --obsm-key DM_EigenVectors
# 3. Run differential abundance
kompot da input_with_dm.h5ad -o da_results.h5ad \
    --groupby condition \
    --condition1 control \
    --condition2 treatment \
    --obsm-key DM_EigenVectors
Diffusion Maps (Preprocessing)¶
kompot dm input.h5ad -o output.h5ad \
    --pca-key X_pca \
    --n-components 10 \
    --knn 30
Differential Expression (Basic)¶
kompot de input.h5ad -o output.h5ad \
    --groupby condition \
    --condition1 control \
    --condition2 treatment \
    --obsm-key X_pca \
    --layer logged_counts
Differential Abundance (Basic)¶
kompot da input.h5ad -o output.h5ad \
    --groupby condition \
    --condition1 control \
    --condition2 treatment \
    --obsm-key X_pca
Using Config Files¶
For complex analyses with many parameters, use config files:
# Get template (copy from installed package)
python -c "from pathlib import Path; import shutil; \
import kompot; \
src = Path(kompot.__file__).parent / 'cli' / 'templates' / 'de_config_minimal.yaml'; \
shutil.copy(src, 'my_de_config.yaml')"
# Edit config file
nano my_de_config.yaml
# Run analysis
kompot de input.h5ad -o output.h5ad -c my_de_config.yaml
CLI arguments override config file values:
kompot de input.h5ad -o output.h5ad \
    -c my_config.yaml \
    --batch-size 50  # Overrides batch_size in config
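The precedence rule can be pictured as a dictionary merge. The sketch below is only an illustration, not kompot's actual implementation: `merge_settings` is a hypothetical helper, and it assumes that options not given on the command line are represented as `None`.

```python
def merge_settings(config, cli_args):
    """Return config values, overridden by every CLI value that was set."""
    merged = dict(config)  # start from the config file values
    for key, value in cli_args.items():
        if value is not None:  # None means "flag not given" (assumption)
            merged[key] = value
    return merged


# Values as they might be loaded from my_config.yaml
config = {"groupby": "condition", "batch_size": 100, "n_landmarks": 5000}
# Values as they might be parsed from the command line
cli_args = {"batch_size": 50, "n_landmarks": None}

settings = merge_settings(config, cli_args)
print(settings)
```

Only `batch_size` changes here, because it is the only option set on the command line; the config dictionary itself is left unmodified.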
Diffusion Maps Command¶
The dm command computes diffusion maps using Palantir, which provides a continuous representation of cell states needed for differential analysis.
Basic Usage¶
kompot dm INPUT -o OUTPUT [OPTIONS]
Prerequisites¶
Requires Palantir: pip install palantir or pip install kompot[recommended]
Input AnnData must contain PCA coordinates in adata.obsm
Common Options¶
--pca-key KEY # PCA coordinates in adata.obsm (default: X_pca)
--n-components N # Number of diffusion components (default: 10)
--knn N # Number of nearest neighbors (default: 30)
--alpha FLOAT # Diffusion alpha parameter (default: 0)
Output¶
Results are stored in:
adata.obsm['DM_EigenVectors'] - Diffusion map coordinates (n_cells × n_components)
adata.uns['DM_EigenValues'] - Eigenvalues of the diffusion operator
Example: Complete Preprocessing¶
# Starting with raw AnnData (assuming PCA already computed)
kompot dm bone_marrow.h5ad -o bone_marrow_dm.h5ad \
    --pca-key X_pca \
    --n-components 10 \
    --knn 30
# Then run differential analysis
kompot de bone_marrow_dm.h5ad -o results.h5ad \
    --groupby Age \
    --condition1 Young \
    --condition2 Old \
    --obsm-key DM_EigenVectors
Why Diffusion Maps?¶
Diffusion maps capture continuous cell state transitions better than PCA alone:
Preserves the geometry of differentiation trajectories
Reduces noise while maintaining biological structure
Euclidean distance in this representation better reflects biological similarity
Kompot's covariance kernel is computed from distances in this cell-state representation
See the Palantir documentation for details.
Differential Expression Command¶
Basic Usage¶
kompot de INPUT -o OUTPUT [OPTIONS]
kompot de INPUT -t TABLE_OUTPUT [OPTIONS]
At least one output must be specified: -o/--output for full AnnData or -t/--table-output for CSV/TSV table.
Required Parameters¶
Either via CLI or config file:
--groupby COLUMN - Column in adata.obs with condition labels
--condition1 LABEL - Reference condition label
--condition2 LABEL - Comparison condition label
Output Options¶
-o, --output FILE # Output AnnData file (.h5ad or .zarr)
-t, --table-output FILE # Output DE results as table (.csv or .tsv)
The --table-output option exports only the kompot-produced columns from adata.var (gene-level statistics such as Mahalanobis distance, log fold change, and FDR). This is useful for downstream analysis or integration with other tools.
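An exported table can be consumed with Python's standard `csv` module. The following is a minimal sketch under stated assumptions: the sample data is invented, the first column is assumed to hold gene names, the boolean flag is assumed to be written as `True`/`False`, and the column names follow the pattern listed under Output Format for `condition1=control` and `condition2=treatment`. `significant_genes` is a hypothetical helper, not part of kompot.

```python
import csv
import io

# Invented excerpt of a DE table; the real file layout may differ.
SAMPLE_TABLE = """gene,kompot_de_control_to_treatment_mean_lfc,kompot_de_control_to_treatment_is_de
GeneA,1.8,True
GeneB,-0.1,False
GeneC,-2.3,True
"""


def significant_genes(handle, prefix="kompot_de_control_to_treatment"):
    """Return (gene, mean_lfc) pairs for rows flagged as differentially expressed."""
    reader = csv.DictReader(handle)
    gene_col = reader.fieldnames[0]  # assumption: first column holds gene names
    hits = []
    for row in reader:
        if row[f"{prefix}_is_de"] == "True":  # assumption: booleans stored as text
            hits.append((row[gene_col], float(row[f"{prefix}_mean_lfc"])))
    return hits


hits = significant_genes(io.StringIO(SAMPLE_TABLE))
print(hits)
```

For a real export, pass an open file instead of the in-memory sample, for example `open("de_results.csv", newline="")`, and check the header row first to confirm the column names.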
Common Options¶
--obsm-key KEY # Cell state representation (default: DM_EigenVectors)
--layer LAYER # Expression data layer (default: None, use X)
--result-key KEY # Storage key (default: kompot_de)
--n-landmarks N # Number of landmarks (default: 5000)
--sample-col COLUMN # Sample ID column for replicates
--batch-size N # Cells per batch (default: 100)
--fdr-threshold FLOAT # FDR threshold (default: 0.05)
--null-genes N # Null genes for FDR (default: 2000)
Boolean Flags¶
--no-progress # Disable progress bars
--store-landmarks # Store landmarks for reuse
--store-additional-stats # Store extra statistics
--overwrite # Overwrite without warning
Compute Options¶
--use-gpu # Use GPU acceleration (requires CUDA-enabled JAX)
--threads N # Number of threads for JAX/NumPy/Dask (default: all cores)
Advanced Options¶
For advanced parameters (gene filtering, cell filtering, GP kernel parameters, memory management, etc.), see the configuration file templates:
kompot/cli/templates/de_config_template.yaml - Complete template with all parameters
kompot/cli/templates/de_config_minimal.yaml - Minimal template with common parameters
Example: Complete Analysis¶
kompot de bone_marrow.h5ad -o results.h5ad \
    --groupby Age \
    --condition1 Young \
    --condition2 Old \
    --obsm-key DM_EigenVectors \
    --layer logged_counts \
    --sample-col Sample \
    --n-landmarks 5000 \
    --batch-size 100 \
    --fdr-threshold 0.05 \
    --null-genes 2000 \
    --store-additional-stats
Differential Abundance Command¶
Basic Usage¶
kompot da INPUT -o OUTPUT [OPTIONS]
kompot da INPUT -t TABLE_OUTPUT [OPTIONS]
At least one output must be specified: -o/--output for full AnnData or -t/--table-output for CSV/TSV table.
Required Parameters¶
Either via CLI or config file:
--groupby COLUMN - Column in adata.obs with condition labels
--condition1 LABEL - Reference condition label
--condition2 LABEL - Comparison condition label
Output Options¶
-o, --output FILE # Output AnnData file (.h5ad or .zarr)
-t, --table-output FILE # Output DA results as table (.csv or .tsv)
The --table-output option exports only the kompot-produced columns from adata.obs (cell-level statistics such as log fold change, z-scores, and PTP values). This is useful for downstream analysis or integration with other tools.
Common Options¶
--obsm-key KEY # Cell state representation (default: X_pca)
--result-key KEY # Storage key (default: kompot_da)
--n-landmarks N # Number of landmarks (default: None, all points)
--sample-col COLUMN # Sample ID column for replicates
--batch-size N # Cells per batch (default: None)
--log-fold-change-threshold FLOAT # LFC threshold (default: 1.0)
--ptp-threshold FLOAT # PTP threshold (default: 0.05)
--ls-factor FLOAT # Length scale factor (default: 10.0)
Boolean Flags¶
--store-landmarks # Store landmarks for reuse
--overwrite # Overwrite without warning
Compute Options¶
--use-gpu # Use GPU acceleration (requires CUDA-enabled JAX)
--threads N # Number of threads for JAX/NumPy/Dask (default: all cores)
Example: Complete Analysis¶
kompot da bone_marrow.h5ad -o results.h5ad \
    --groupby Age \
    --condition1 Young \
    --condition2 Old \
    --obsm-key DM_EigenVectors \
    --sample-col Sample \
    --n-landmarks 3000 \
    --log-fold-change-threshold 1.0 \
    --ptp-threshold 0.05
Configuration Files¶
YAML Format¶
Config files use standard YAML syntax:
# Required parameters
groupby: "condition"
condition1: "control"
condition2: "treatment"
# Common parameters
obsm_key: "X_pca"
layer: "logged_counts"
result_key: "kompot_de"
# Sample variance
sample_col: "sample_id"
# Performance
batch_size: 100
n_landmarks: 5000
# Significance
fdr_threshold: 0.05
null_genes: 2000
# Advanced parameters
genes: ["Gene1", "Gene2", "Gene3"] # Analyze specific genes
cell_filter: {batch: "batch1"} # Exclude batch1 cells
# GP parameters
sigma: 1.0
ls_factor: 10.0
JSON Format¶
JSON is also supported:
{
  "groupby": "condition",
  "condition1": "control",
  "condition2": "treatment",
  "obsm_key": "X_pca",
  "batch_size": 100,
  "fdr_threshold": 0.05
}
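Because JSON maps directly onto Python dictionaries, a config file can be generated from a script with the standard `json` module. This is a minimal sketch: `write_config` is a hypothetical helper and the file name `de_config.json` is arbitrary; the keys are the ones shown in the example above.

```python
import json
import tempfile
from pathlib import Path

settings = {
    "groupby": "condition",
    "condition1": "control",
    "condition2": "treatment",
    "obsm_key": "X_pca",
    "batch_size": 100,
    "fdr_threshold": 0.05,
}


def write_config(settings, path):
    """Serialize settings as indented JSON and return the path written."""
    path = Path(path)
    path.write_text(json.dumps(settings, indent=2) + "\n")
    return path


# Write into a temporary directory so the example leaves no files behind in the project.
config_path = write_config(settings, Path(tempfile.mkdtemp()) / "de_config.json")
loaded = json.loads(config_path.read_text())
print(config_path.name, loaded["groupby"])
```

The resulting file can then be passed to the CLI with `-c`, in the same way as a YAML config.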
Config Templates¶
Kompot provides ready-to-use templates:
Minimal templates (commonly used parameters only):
kompot/cli/templates/dm_config_minimal.yaml
kompot/cli/templates/de_config_minimal.yaml
kompot/cli/templates/da_config_minimal.yaml
Complete templates (all available parameters with documentation):
kompot/cli/templates/dm_config_template.yaml
kompot/cli/templates/de_config_template.yaml
kompot/cli/templates/da_config_template.yaml
Pipeline Integration¶
Nextflow Example¶
process KOMPOT_DE {
    input:
    path adata
    path config
    output:
    path "results.h5ad"
    script:
    """
    kompot de ${adata} -o results.h5ad -c ${config}
    """
}
Snakemake Example¶
rule kompot_de:
    input:
        adata = "data/{sample}.h5ad",
        config = "configs/de_config.yaml"
    output:
        results = "results/{sample}_de.h5ad"
    shell:
        "kompot de {input.adata} -o {output.results} -c {input.config}"
Shell Script Example¶
#!/bin/bash
# Process multiple samples with complete workflow
for sample in sample1 sample2 sample3; do
    echo "Processing ${sample}..."
    # Step 1: Compute diffusion maps
    kompot dm \
        "data/${sample}.h5ad" \
        -o "temp/${sample}_dm.h5ad" \
        --pca-key X_pca \
        --n-components 10
    # Step 2: Differential expression
    kompot de \
        "temp/${sample}_dm.h5ad" \
        -o "results/${sample}_de.h5ad" \
        --groupby condition \
        --condition1 control \
        --condition2 treatment \
        --obsm-key DM_EigenVectors \
        --batch-size 100
    # $? holds the exit status of the kompot de call directly above
    if [ $? -eq 0 ]; then
        echo "${sample} completed successfully"
        rm "temp/${sample}_dm.h5ad"  # Cleanup intermediate file
    else
        echo "${sample} failed" >&2
        exit 1
    fi
done
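The same batch pattern can be driven from Python by building each command as an argument list. The sketch below is one possible approach, not an official kompot utility: `build_de_command` is a hypothetical helper that uses only flags documented on this page, and it prints the commands rather than running them.

```python
import shlex


def build_de_command(input_path, output_path, groupby, condition1, condition2,
                     obsm_key="DM_EigenVectors", batch_size=None):
    """Build the argument list for one `kompot de` invocation."""
    cmd = [
        "kompot", "de", str(input_path), "-o", str(output_path),
        "--groupby", groupby,
        "--condition1", condition1,
        "--condition2", condition2,
        "--obsm-key", obsm_key,
    ]
    if batch_size is not None:
        cmd += ["--batch-size", str(batch_size)]
    return cmd


commands = [
    build_de_command(f"temp/{sample}_dm.h5ad", f"results/{sample}_de.h5ad",
                     "condition", "control", "treatment", batch_size=100)
    for sample in ("sample1", "sample2", "sample3")
]

for cmd in commands:
    # Print a shell-quoted preview; to execute, use subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Passing a list to `subprocess.run` avoids shell quoting problems with unusual file names, and `check=True` raises an exception on a non-zero exit code.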
Output Format¶
Differential Expression Output¶
Results stored in:
adata.var["kompot_de_{cond1}_to_{cond2}_mahalanobis"] - Significance scores
adata.var["kompot_de_{cond1}_to_{cond2}_mean_lfc"] - Mean log fold change
adata.var["kompot_de_{cond1}_to_{cond2}_is_de"] - Boolean significance flag
adata.var["kompot_de_{cond1}_to_{cond2}_mahalanobis_local_fdr"] - Local FDR
adata.uns["kompot_de"] - Run metadata and parameters
Differential Abundance Output¶
Results stored in:
adata.obs["kompot_da_{cond1}_to_{cond2}_lfc"] - Log fold change per cell
adata.obs["kompot_da_{cond1}_to_{cond2}_lfc_zscore"] - Z-scores
adata.obs["kompot_da_{cond1}_to_{cond2}_neg_log10_lfc_ptp"] - -log10 p-values
adata.obs["kompot_da_{cond1}_to_{cond2}_lfc_direction"] - Direction (up/down/neutral)
adata.uns["kompot_da"] - Run metadata and parameters
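The column names follow a fixed pattern, so downstream scripts can derive them from the condition labels. This is a minimal sketch: `result_columns` is a hypothetical helper, and it assumes the prefix is the result key (`kompot_de` or `kompot_da` by default) followed by `{cond1}_to_{cond2}`, as in the lists above. How a custom `--result-key` changes the names is not covered here.

```python
# Field suffixes taken from the output listings above
DE_FIELDS = ("mahalanobis", "mean_lfc", "is_de", "mahalanobis_local_fdr")
DA_FIELDS = ("lfc", "lfc_zscore", "neg_log10_lfc_ptp", "lfc_direction")


def result_columns(result_key, cond1, cond2, fields):
    """Return the expected column names for one comparison."""
    prefix = f"{result_key}_{cond1}_to_{cond2}"
    return [f"{prefix}_{field}" for field in fields]


de_cols = result_columns("kompot_de", "Young", "Old", DE_FIELDS)
da_cols = result_columns("kompot_da", "Young", "Old", DA_FIELDS)

for name in de_cols + da_cols:
    print(name)
```

With an AnnData object loaded, these names could be used to select columns, for example `adata.var[de_cols]` for gene-level results and `adata.obs[da_cols]` for cell-level results.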
Logging and Verbosity¶
Control logging output:
# Standard logging (INFO level)
kompot de input.h5ad -o output.h5ad --groupby condition ...
# Verbose logging (DEBUG level)
kompot -v de input.h5ad -o output.h5ad --groupby condition ...
# Redirect logs
kompot de input.h5ad -o output.h5ad ... 2> analysis.log
Error Handling¶
The CLI uses the following exit codes:
0 - Success
1 - General error (missing files, invalid parameters, analysis failure)
130 - Interrupted by user (Ctrl+C)
Check exit codes in scripts:
kompot de input.h5ad -o output.h5ad ...
if [ $? -ne 0 ]; then
echo "Analysis failed" >&2
exit 1
fi
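When the CLI is launched from Python instead of a shell script, the same exit codes are available through `subprocess`. The sketch below is illustrative: `describe_exit` and `run_checked` are hypothetical helpers, and the demonstration runs the Python interpreter as a stand-in command so that it works without kompot installed.

```python
import subprocess
import sys

# Exit codes as documented above
EXIT_MEANINGS = {
    0: "success",
    1: "general error",
    130: "interrupted by user (Ctrl+C)",
}


def describe_exit(code):
    """Translate an exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_checked(cmd):
    """Run cmd, report a failure on stderr, and return True on success."""
    result = subprocess.run(cmd)
    if result.returncode != 0:
        print(f"{cmd[0]} failed: {describe_exit(result.returncode)}", file=sys.stderr)
    return result.returncode == 0


# Stand-in for a failing kompot call: a Python process that exits with code 1
ok = run_checked([sys.executable, "-c", "raise SystemExit(1)"])
print("ok" if ok else "failed")
```

In a real pipeline, the command list would be something like `["kompot", "de", "input.h5ad", "-o", "output.h5ad", "-c", "config.yaml"]`.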
Performance Tips¶
Memory Management¶
For large datasets:
# Reduce batch size
kompot de input.h5ad -o output.h5ad ... --batch-size 50
# Use fewer landmarks
kompot de input.h5ad -o output.h5ad ... --n-landmarks 3000
# Enable disk storage (requires config file)
# In config.yaml:
# store_arrays_on_disk: true
# disk_storage_dir: "/tmp/kompot_cache"
Speed Optimization¶
# Reduce null genes for faster FDR estimation
kompot de input.h5ad -o output.h5ad ... --null-genes 1000
# Use fewer landmarks
kompot da input.h5ad -o output.h5ad ... --n-landmarks 2000
# Disable progress bars in scripts
kompot de input.h5ad -o output.h5ad ... --no-progress
Troubleshooting¶
Common Issues¶
Missing required parameters:
Error: Missing required parameters: groupby, condition1, condition2
Solution: Provide via CLI args or config file
File not found:
Error: AnnData file not found: input.h5ad
Solution: Check file path and ensure it exists
Invalid condition label:
Error: Condition 'X' not found in column 'condition'
Solution: Check condition labels in your data
Memory error:
MemoryError or JAX out of memory
Solution: Reduce --batch-size and --n-landmarks
Getting Help¶
# General help
kompot --help
# Command-specific help
kompot de --help
kompot da --help
kompot dm --help
# Check version
kompot --version
Comparison with Python API¶
| Feature | CLI | Python API |
|---|---|---|
| Basic analysis | ✅ Simple | ✅ Simple |
| Advanced parameters | ⚠️ Requires config file | ✅ Direct access |
| Pipeline integration | ✅ Easy | ⚠️ Requires scripting |
| Interactive exploration | ❌ Not suitable | ✅ Excellent |
| Visualization | ❌ Requires separate step | ✅ Integrated |
| Debugging | ⚠️ Limited | ✅ Full access |
| Documentation | ✅ Built-in help | ✅ Comprehensive |
Recommendation:
Use CLI for: automated pipelines, batch processing, workflow integration
Use Python API for: interactive analysis, visualization, parameter exploration, custom workflows