Plotting

Volcano Plots

kompot.plot.volcano_de(adata: AnnData, lfc_key: str = None, score_key: str = None, condition1: str | None = None, condition2: str | None = None, n_top_genes: int | None = None, highlight_genes: List[str] | Dict[str, str] | List[Dict[str, Any]] | None = None, color: str | None = None, background_cmap: str | Colormap = None, color_discrete_map: Dict[str, str] | None = None, vmin: float | str | None = None, vmax: float | str | None = None, vcenter: float | None = None, gene_labels: bool | int | List[str] | Dict[str, str] = 10, figsize: Tuple[float, float] = (10, 8), title: str | None = None, xlabel: str | None = 'Log Fold Change', ylabel: str | None = None, n_x_ticks: int = 3, n_y_ticks: int = 3, color_up: str = '#d73027', color_down: str = '#4575b4', color_background: str = '#c0c0c0', alpha_background: float = 1.0, point_size: float = 5, font_size: float = 9, text_offset: Tuple[float, float] = (2, 2), text_kwargs: Dict[str, Any] | None = None, grid: bool = True, grid_kwargs: Dict[str, Any] | None = None, ax: Axes | None = None, legend_loc: str = 'best', legend_fontsize: float | None = None, legend_title_fontsize: float | None = None, show_legend: bool = True, sort_key: str | None = None, return_fig: bool = False, save: str | None = None, run_id: int = -1, legend_ncol: int | None = None, group: str | None = None, y_axis_type: str = 'mahalanobis', significance_threshold: float | Dict[str, float] | None = None, update_de_classification: bool = False, direction_column: str | None = None, show_thresholds: bool = True, **kwargs) → Figure | None

Create a volcano plot from Kompot differential expression results.

Parameters:
  • adata (AnnData) – AnnData object containing differential expression results in .var

  • lfc_key (str, optional) – Key in adata.var for log fold change values. If None, will try to infer from kompot_de_ keys.

  • score_key (str, optional) – Key in adata.var for significance scores. Default is "kompot_de_mahalanobis"

  • condition1 (str, optional) – Name of condition 1 (negative log fold change)

  • condition2 (str, optional) – Name of condition 2 (positive log fold change)

  • n_top_genes (int, optional) – If specified, highlight this number of top genes by score instead of using DE classification. Cannot be used together with significance_threshold. If not specified (None), will use DE classification from is_de column when available. Ignored if highlight_genes is provided.

  • highlight_genes (list of str, dict of {str: str}, or list of dict, optional) – Genes to highlight. Can be a list of gene names, a dict mapping gene names to colors, or a list of dicts with keys 'genes' (required), 'name' (optional), and 'color' (optional). If provided, overrides n_top_genes.

  • color (str, optional) – Key in adata.var to use for coloring background genes. Can be continuous or categorical.

  • background_cmap (str or Colormap, optional) – Colormap to use for background coloring. The default for continuous values is ‘Spectral_r’.

  • color_discrete_map (dict, optional) – Mapping of category values to colors for categorical color. If not provided, colors will be selected from the colormap.

  • vmin (float or str, optional) – Minimum value for colormap normalization. If a string starting with ‘p’ followed by a number, uses that percentile (e.g., ‘p5’ for 5th percentile).

  • vmax (float or str, optional) – Maximum value for colormap normalization. If a string starting with ‘p’ followed by a number, uses that percentile (e.g., ‘p95’ for 95th percentile).

  • vcenter (float, optional) – Center value for diverging colormaps. If provided with vmin/vmax, ensures proper ordering.

  • gene_labels (bool, int, list of str, or dict, optional) – Controls which genes get labeled with their names:
      - True: label all highlighted genes
      - False: label no genes
      - int: label top N genes by score (default: 10)
      - list of str: label specific genes by name
      - dict: label genes with custom labels (gene_name -> custom_label)

  • figsize (tuple, optional) – Figure size as (width, height) in inches

  • title (str, optional) – Plot title. If None and conditions provided, uses “{condition2} vs {condition1}”

  • xlabel (str, optional) – Label for x-axis

  • ylabel (str, optional) – Label for y-axis

  • n_x_ticks (int, optional) – Number of ticks to display on the x-axis (default: 3)

  • n_y_ticks (int, optional) – Number of ticks to display on the y-axis (default: 3)

  • color_up (str, optional) – Color for up-regulated genes

  • color_down (str, optional) – Color for down-regulated genes

  • color_background (str, optional) – Color for background genes when not using color

  • alpha_background (float, optional) – Alpha value for background genes (default: 1.0)

  • point_size (float, optional) – Size of points for background genes

  • font_size (float, optional) – Font size for gene labels

  • text_offset (tuple, optional) – Offset (x, y) in points for gene labels from their points

  • text_kwargs (dict, optional) – Additional parameters for text labels

  • grid (bool, optional) – Whether to show grid lines

  • grid_kwargs (dict, optional) – Additional parameters for grid

  • ax (matplotlib.axes.Axes, optional) – Axes to plot on. If None, creates new figure

  • legend_loc (str, optional) – Location for the legend (‘best’, ‘upper right’, ‘lower left’, etc., or ‘none’ to hide)

  • legend_fontsize (float, optional) – Font size for the legend text. If None, uses matplotlib defaults.

  • legend_title_fontsize (float, optional) – Font size for the legend title. If None, uses matplotlib defaults.

  • show_legend (bool, optional) – Whether to show the legend (default: True)

  • legend_ncol (int, optional) – Number of columns in the legend. If None, automatically determined.

  • sort_key (str, optional) – Key to sort genes by. If None, sorts by score_key

  • return_fig (bool, optional) – If True, returns the figure and axes

  • save (str, optional) – Path to save figure. If None, figure is not saved

  • run_id (int, optional) – Specific run ID to use for fetching field names from run history. Negative indices count from the end (-1 is the latest run). If None, uses the latest run information.

  • group (str, optional) – If provided, use data for a specific group/subset analyzed with the ‘groups’ parameter in compute_differential_expression. Mahalanobis distances and mean fold changes are then read from adata.varm instead of adata.var.

  • y_axis_type (str, optional) – Type of values to use for the y-axis: “mahalanobis” (default), “local_fdr”, “tail_fdr”, “ptp”, or a custom column name from adata.var. When using FDR or ptp values, they are -log10 transformed for display.

  • significance_threshold (float or dict, optional) – Significance threshold for the y-axis values. A float sets a single threshold shown as a horizontal line. A dict maps y-axis types to thresholds (e.g., {"local_fdr": 0.05, "ptp": 0.01}); genes must pass all thresholds, and no threshold line is drawn. For "mahalanobis" this is a minimum distance; for "local_fdr", "tail_fdr", and "ptp" it is a maximum value.

  • update_de_classification (bool, optional) – Whether to update the differential expression classification column based on the new significance threshold. Applicable for FDR and ptp y_axis_types (default: False).

  • direction_column (str, optional) – Name of the differential expression boolean column to update if update_de_classification=True. If None, tries to infer from the score_key.

  • show_thresholds (bool, optional) – Whether to show threshold lines on the plot (default: True).

  • **kwargs – Additional parameters passed to plt.scatter
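The percentile strings accepted by vmin and vmax (such as ‘p5’ or ‘p95’) can be illustrated with a small standard-library sketch. The helper name resolve_limit and the linear-interpolation percentile rule are assumptions made for illustration; Kompot's own resolution logic may differ.

```python
import math


def resolve_limit(value, data):
    """Turn a float or a 'p<number>' string into a numeric colormap limit.

    Hypothetical helper: it only mirrors the documented 'p5' / 'p95' syntax.
    """
    if value is None:
        return None
    if isinstance(value, str):
        if not value.startswith("p"):
            raise ValueError(f"expected a 'p<number>' string, got {value!r}")
        pct = float(value[1:])
        if not 0 <= pct <= 100:
            raise ValueError("percentile must lie between 0 and 100")
        ordered = sorted(data)
        if not ordered:
            raise ValueError("cannot take a percentile of empty data")
        # Linear interpolation between the two closest ranks (an assumption).
        rank = (len(ordered) - 1) * pct / 100.0
        lower, upper = math.floor(rank), math.ceil(rank)
        return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)
    return float(value)


values = [0.0, 1.0, 2.0, 3.0, 4.0]
vmin = resolve_limit("p5", values)    # 5th percentile of the data
vmax = resolve_limit("p95", values)   # 95th percentile of the data
```

Plain numbers pass through unchanged, so the same call site can handle both forms of the argument.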

Return type:

If return_fig is True, returns (fig, ax)
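The threshold semantics described for y_axis_type and significance_threshold can be sketched as follows. This is an illustration, not Kompot's code: the function names, the gene names, and the choice of inclusive comparisons are assumptions.

```python
import math

# Per the parameter descriptions: a "mahalanobis" threshold is a minimum,
# while "local_fdr", "tail_fdr" and "ptp" thresholds are maximums.
MAX_TYPES = {"local_fdr", "tail_fdr", "ptp"}


def passes(gene_stats, thresholds):
    """Return True if a gene satisfies every threshold in the dict."""
    for y_type, cutoff in thresholds.items():
        value = gene_stats[y_type]
        if y_type in MAX_TYPES:
            if value > cutoff:      # inclusive comparison is an assumption
                return False
        elif value < cutoff:
            return False
    return True


def display_value(y_type, value):
    """-log10 transform FDR/ptp values; leave other scores unchanged."""
    return -math.log10(value) if y_type in MAX_TYPES else value


# Hypothetical per-gene statistics, for illustration only.
genes = {
    "GeneA": {"mahalanobis": 4.2, "local_fdr": 0.01, "ptp": 0.002},
    "GeneB": {"mahalanobis": 5.0, "local_fdr": 0.20, "ptp": 0.005},
}
thresholds = {"local_fdr": 0.05, "ptp": 0.01}
significant = [name for name, stats in genes.items() if passes(stats, thresholds)]
```

With a dict of thresholds a gene must clear every entry, which is why GeneB is excluded here despite its larger Mahalanobis distance.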

kompot.plot.volcano_da(adata: AnnData, lfc_key: str | None = None, ptp_key: str | None = None, group_key: str | None = None, log_transform_ptp: bool = True, lfc_threshold: float | None = None, ptp_threshold: float | None = None, color: str | List[str] | None = None, alpha_background: float = 1.0, highlight_subset: ndarray | List[bool] | None = None, highlight_color: str = '#d73027', figsize: Tuple[float, float] = (10, 8), title: str | None = 'Differential Abundance Volcano Plot', xlabel: str | None = 'Log Fold Change', ylabel: str | None = '-log10(ptp)', n_x_ticks: int = 3, n_y_ticks: int = 3, legend_loc: str = 'best', legend_fontsize: float | None = None, legend_title_fontsize: float | None = None, show_legend: bool = True, grid: bool = True, grid_kwargs: Dict[str, Any] | None = None, ax: Axes | None = None, palette: str | List[str] | Dict[str, str] | None = None, save: str | None = None, return_fig: bool = False, run_id: int = -1, legend_ncol: int | None = None, update_direction: bool = False, direction_column: str | None = None, show_thresholds: bool = True, show_colorbar: bool = True, cmap: str | Colormap | None = None, vcenter: float | None = None, vmin: float | None = None, vmax: float | None = None, **kwargs) → Figure | None

Create a volcano plot for differential abundance results.

This function visualizes cells in a 2D volcano plot with log fold change on the x-axis and significance on the y-axis, expressed as -log10 of the Posterior Tail Probability (PTP). Cells can be colored by any column in adata.obs.

Parameters:
  • adata (AnnData) – AnnData object containing differential abundance results

  • lfc_key (str, optional) – Key in adata.obs for log fold change values. If None, will try to infer from kompot_da_ keys.

  • ptp_key (str, optional) – Key in adata.obs for PTPs (Posterior Tail Probabilities). The Posterior Tail Probability is a significance measure similar to a p-value. If None, will try to infer from kompot_da_ keys.

  • group_key (str, optional) – Key in adata.obs to group cells by (for coloring)

  • log_transform_ptp (bool, optional) – Whether to -log10 transform PTPs (Posterior Tail Probabilities) for the y-axis

  • lfc_threshold (float, optional) – Log fold change threshold for significance (for drawing threshold lines)

  • ptp_threshold (float, optional) – PTP (Posterior Tail Probability) threshold for significance (for drawing threshold lines)

  • color (str or list of str, optional) – Keys in adata.obs for coloring cells. Requires scanpy.

  • alpha_background (float, optional) – Alpha value for background cells (below threshold). Default is 1.0 (no transparency)

  • highlight_subset (array or list, optional) – Boolean mask to highlight specific cells

  • highlight_color (str, optional) – Color for highlighted cells

  • figsize (tuple, optional) – Figure size as (width, height) in inches

  • title (str, optional) – Plot title

  • xlabel (str, optional) – Label for x-axis

  • ylabel (str, optional) – Label for y-axis

  • n_x_ticks (int, optional) – Number of ticks to display on the x-axis (default: 3)

  • n_y_ticks (int, optional) – Number of ticks to display on the y-axis (default: 3)

  • legend_loc (str, optional) – Location for the legend (‘best’, ‘upper right’, ‘lower left’, etc., or ‘none’ to hide)

  • legend_fontsize (float, optional) – Font size for the legend text. If None, uses matplotlib defaults.

  • legend_title_fontsize (float, optional) – Font size for the legend title. If None, uses matplotlib defaults.

  • show_legend (bool, optional) – Whether to show the legend (default: True)

  • grid (bool, optional) – Whether to show grid lines

  • grid_kwargs (dict, optional) – Additional parameters for grid

  • ax (matplotlib.axes.Axes, optional) – Axes to plot on. If None, creates new figure

  • palette (str, list, or dict, optional) – Color palette to use for categorical coloring

  • legend_ncol (int, optional) – Number of columns in the legend. If None, automatically determined based on the number of categories.

  • save (str, optional) – Path to save figure. If None, figure is not saved

  • show (bool, optional) – Whether to show the plot

  • return_fig (bool, optional) – If True, returns the figure and axes

  • run_id (int, optional) – Specific run ID to use for fetching field names from run history. Negative indices count from the end (-1 is the latest run). If None, uses the latest run information.

  • update_direction (bool, optional) – Whether to update the direction column based on the provided thresholds before plotting (default: False)

  • direction_column (str, optional) – Direction column to update if update_direction=True. If None, infers from run_id.

  • show_thresholds (bool, optional) – Whether to display horizontal and vertical threshold lines (default: True). Set to False to hide threshold lines.

  • show_colorbar (bool, optional) – Whether to display colorbar for numeric color columns (default: True). Set to False to hide colorbar.

  • condition1 (str, optional) – Name of condition 1 (denominator in fold change)

  • condition2 (str, optional) – Name of condition 2 (numerator in fold change)

  • **kwargs – Additional parameters passed to plt.scatter

Return type:

If return_fig is True, returns (fig, ax)
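How lfc_threshold, ptp_threshold, and log_transform_ptp relate to positions on the plot can be sketched with the standard library. The function names and the "up" / "down" / "neutral" labels are assumptions for illustration; the labels Kompot writes to the direction column are not stated here.

```python
import math


def volcano_coordinates(lfc, ptp, log_transform_ptp=True):
    """Map one cell's (log fold change, PTP) to plot coordinates.

    Assumes ptp > 0, since -log10(0) is undefined.
    """
    y = -math.log10(ptp) if log_transform_ptp else ptp
    return lfc, y


def threshold_lines(lfc_threshold, ptp_threshold, log_transform_ptp=True):
    """Positions of the vertical and horizontal threshold lines."""
    horizontal = -math.log10(ptp_threshold) if log_transform_ptp else ptp_threshold
    return {"vertical": (-lfc_threshold, lfc_threshold), "horizontal": horizontal}


def classify(lfc, ptp, lfc_threshold, ptp_threshold):
    """Label a cell by direction; the label strings are hypothetical."""
    if ptp < ptp_threshold and lfc > lfc_threshold:
        return "up"
    if ptp < ptp_threshold and lfc < -lfc_threshold:
        return "down"
    return "neutral"


lines = threshold_lines(lfc_threshold=1.0, ptp_threshold=0.01)
point = volcano_coordinates(lfc=0.5, ptp=0.001)
label = classify(lfc=1.5, ptp=0.001, lfc_threshold=1.0, ptp_threshold=0.01)
```

Because smaller PTPs are more significant, the -log10 transform places significant cells above the horizontal line rather than below it.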

kompot.plot.multi_volcano_da(adata: AnnData, groupby: str, lfc_key: str | None = None, ptp_key: str | None = None, log_transform_ptp: bool = True, lfc_threshold: float | None = None, ptp_threshold: float | None = None, color: str | List[str] | None = None, alpha_background: float = 1.0, highlight_subset: ndarray | List[bool] | None = None, highlight_color: str = '#d73027', figsize: Tuple[float, float] | None = None, title: str | None = 'Differential Abundance Volcano Plot', xlabel: str | None = None, ylabel: str | None = '-log10(PTP (Posterior Tail Probability))', n_x_ticks: int = 3, n_y_ticks: int = 0, legend_loc: str = 'bottom', legend_fontsize: float | None = None, legend_title_fontsize: float | None = None, show_legend: bool | None = None, grid: bool = True, grid_kwargs: Dict[str, Any] | None = None, palette: str | List[str] | Dict[str, str] | None = None, show_thresholds: bool = False, plot_width_factor: float = 10.0, share_y: bool = True, layout_config: Dict[str, float] | None = None, background_plot: Literal['kde', 'violin'] | None = None, background_alpha: float = 0.5, background_color: str = '#E6E6E6', background_edgecolor: str = '#808080', background_height_factor: float = 0.6, background_kwargs: Dict[str, Any] | None = None, save: str | None = None, return_fig: bool = False, run_id: int = -1, update_direction: bool = False, direction_column: str | None = None, cmap: str | Colormap | None = None, vcenter: float | None = None, vmin: float | None = None, vmax: float | None = None, **kwargs) → Figure | None

Create multiple volcano plots for differential abundance results, one per group.

This function creates a panel of volcano plots, one for each unique value in the groupby column. Each plot is wider than tall (by default 10x wider than tall) and is aligned with other plots. Only the bottom plot shows x-axis labels and ticks, only the middle plot shows the y-axis label, and y-axis ticks are hidden for all plots. Group labels are placed to the right of each plot, aligned with the plot edge. Each plot has a box outline by default, and points are drawn with full opacity (no transparency). If the color and groupby columns are identical, the legend is hidden. Vertical lines (both threshold and center line at 0) are hidden by default but can be enabled with show_thresholds=True.

Parameters:
  • adata (AnnData) – AnnData object containing differential abundance results

  • groupby (str) – Column in adata.obs to group cells by (for separating into multiple plots)

  • lfc_key (str, optional) – Key in adata.obs for log fold change values. If None, will try to infer from kompot_da_ keys.

  • ptp_key (str, optional) – Key in adata.obs for PTPs (Posterior Tail Probabilities). The Posterior Tail Probability is a significance measure similar to a p-value. If None, will try to infer from kompot_da_ keys.

  • log_transform_ptp (bool, optional) – Whether to -log10 transform PTPs (Posterior Tail Probabilities) for the y-axis

  • lfc_threshold (float, optional) – Log fold change threshold for significance (for drawing threshold lines)

  • ptp_threshold (float, optional) – PTP (Posterior Tail Probability) threshold for significance (for drawing threshold lines)

  • color (str or list of str, optional) – Keys in adata.obs for coloring cells. Requires scanpy. If identical to groupby, the legend will be hidden.

  • alpha_background (float, optional) – Alpha value for background cells (below threshold). Default is 1.0 (no transparency)

  • highlight_subset (array or list, optional) – Boolean mask to highlight specific cells

  • highlight_color (str, optional) – Color for highlighted cells

  • figsize (tuple, optional) – Figure size as (width, height) in inches. If None, it will be calculated automatically based on the number of groups and layout parameters.

  • title (str, optional) – Plot title

  • xlabel (str, optional) – Label for x-axis (only shown on bottom plot). If None, it will be automatically generated based on condition names extracted from lfc_key if available.

  • ylabel (str, optional) – Label for y-axis (only shown on middle plot)

  • n_x_ticks (int, optional) – Number of ticks to display on the x-axis (default: 3)

  • n_y_ticks (int, optional) – Number of ticks to display on the y-axis (default: 0, no y-ticks)

  • legend_loc (str, optional) – Location for the legend (‘bottom’, ‘right’, ‘best’, ‘upper right’, etc.)

  • legend_fontsize (float, optional) – Font size for the legend text

  • legend_title_fontsize (float, optional) – Font size for the legend title

  • show_legend (bool, optional) – Whether to show the legend. If None (default), legend will be shown except when color column is identical to groupby column. If explicitly set to True or False, this setting will override the automatic behavior.

  • grid (bool, optional) – Whether to show grid lines

  • grid_kwargs (dict, optional) – Additional parameters for grid

  • palette (str, list, or dict, optional) – Color palette to use for categorical coloring

  • show_thresholds (bool, optional) – Whether to display threshold lines on the plots (default: False)

  • show_colorbar (bool, optional) – Whether to display colorbars in individual volcano plots (default: False in multi_volcano_da)

  • plot_width_factor (float, optional) – Width factor for each volcano plot. Higher values make plots wider relative to their height. Default is 10.0 (plots are 10x wider than tall). This is maintained regardless of the number of groups.

  • share_y (bool, optional) – Whether to use the same y-axis limits for all plots (default: True)

  • layout_config (dict, optional) – Configuration for controlling plot layout spacing. Keys include:
      - ‘unit_size’: Base unit size in inches (default: 0.15)
      - ‘title_height’: Height for title area in units (default: 2)
      - ‘legend_bottom_margin’: Distance from bottom of figure to legend/colorbar in units (default: 3)
      - ‘legend_plot_gap’: Gap between last plot and legend/colorbar in units (default: 3)
      - ‘legend_height’: Minimum height for legend/colorbar area in units (default: 3)
      - ‘plot_height’: Height for each plot in units (default: 4)
      - ‘plot_width’: Width for each plot in units (default: plot_width_factor * plot_height)
      - ‘label_width’: Width for group labels in units (default: 4)
      - ‘top_margin’: Top margin in units (default: 1)
      - ‘plot_spacing’: Spacing between plots in units (default: 0.2)
      - ‘y_label_width’: Width for y-axis label in units (default: 2)
      - ‘y_label_offset’: Offset of y-axis label from plots in units (default: 0.5)

  • background_plot (str, optional) – Type of background density plot to display. Options are “kde” or “violin”. If None (default), no background density plot is shown.

  • background_alpha (float, optional) – Alpha (transparency) value for the background density plot (default: 0.5)

  • background_color (str, optional) – Color for the background density plot (default: “#E6E6E6”, light gray)

  • background_edgecolor (str, optional) – Color for the outline of the background density plot (default: “#808080”, medium gray)

  • background_height_factor (float, optional) – Controls the height of the background plot as a fraction of the y-axis range (default: 0.6). Higher values make the KDE/violin taller, lower values make it shorter.

  • background_kwargs (dict, optional) – Additional parameters for the background density plot. For KDE: "bw_method", "show_2d_kde", "contour_levels", "contour_cmap", "contour_alpha". For violin: "showmeans", "showmedians", "showextrema".

  • save (str, optional) – Path to save figure. If None, figure is not saved

  • show (bool, optional) – Whether to show the plot

  • return_fig (bool, optional) – If True, returns the figure and axes

  • run_id (int, optional) – Specific run ID to use for fetching field names from run history

  • update_direction (bool, optional) – Whether to update the direction column based on the provided thresholds before plotting (default: False). This is only applied once to the full dataset, not to individual group subsets.

  • direction_column (str, optional) – Direction column to update if update_direction=True. If None, infers from run_id.

  • cmap (str or matplotlib.cm.Colormap, optional) – Colormap to use for numeric color values. If not provided, automatically selects ‘RdBu_r’ with vcenter=0 for columns containing ‘log_fold_change’ or ‘lfc’, otherwise defaults to “Spectral_r”.

  • vcenter (float, optional) – Value to center the colormap at. Only applies to diverging colormaps. If not specified but a column containing ‘log_fold_change’ or ‘lfc’ is used for coloring, defaults to 0.

  • vmin (float, optional) – Minimum value for the colormap. If not provided, uses the minimum value in the data.

  • vmax (float, optional) – Maximum value for the colormap. If not provided, uses the maximum value in the data.

  • **kwargs – Additional parameters passed to plt.scatter

Return type:

If return_fig is True, returns (fig, axes_list)
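The unit-based layout_config can be illustrated with a sketch that merges user overrides into the documented defaults and then estimates a figure size. The default values are taken from the parameter description above; the helper names and, in particular, the way the units are summed in estimate_figsize are assumptions, so the numbers need not match what multi_volcano_da computes.

```python
import math

# Defaults as listed in the layout_config description.
DEFAULT_LAYOUT = {
    "unit_size": 0.15,
    "title_height": 2,
    "legend_bottom_margin": 3,
    "legend_plot_gap": 3,
    "legend_height": 3,
    "plot_height": 4,
    "label_width": 4,
    "top_margin": 1,
    "plot_spacing": 0.2,
    "y_label_width": 2,
    "y_label_offset": 0.5,
}


def resolve_layout(layout_config=None, plot_width_factor=10.0):
    """Merge overrides into the defaults and derive plot_width if absent."""
    layout = dict(DEFAULT_LAYOUT)
    layout.update(layout_config or {})
    layout.setdefault("plot_width", plot_width_factor * layout["plot_height"])
    return layout


def estimate_figsize(n_groups, layout):
    """Rough (width, height) in inches; the summation is an assumption."""
    width_units = (layout["y_label_width"] + layout["y_label_offset"]
                   + layout["plot_width"] + layout["label_width"])
    height_units = (layout["top_margin"] + layout["title_height"]
                    + n_groups * layout["plot_height"]
                    + (n_groups - 1) * layout["plot_spacing"]
                    + layout["legend_plot_gap"] + layout["legend_height"]
                    + layout["legend_bottom_margin"])
    unit = layout["unit_size"]
    return width_units * unit, height_units * unit


taller = resolve_layout({"plot_height": 6})          # override a single key
width, height = estimate_figsize(3, resolve_layout())
```

Deriving plot_width after the merge keeps the documented width-to-height ratio intact when only plot_height is overridden.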

Expression Plots

kompot.plot.plot_gene_expression(adata: AnnData, gene: str, lfc_key: str | None = None, score_key: str | None = None, condition1: str | None = None, condition2: str | None = None, basis: str | None = 'X_umap', figsize: Tuple[float, float] = (12, 12), cmap_expression: str = 'Spectral_r', cmap_fold_change: str = 'RdBu_r', title: str | None = None, run_id: int = -1, layer: str | None = None, save: str | None = None, return_fig: bool = False, **kwargs) → Figure | None

Visualize expression patterns for a specific gene across conditions.

Creates a figure with multiple panels showing original expression, smoothed expression for each condition, and fold change.

Parameters:
  • adata (AnnData) – AnnData object containing differential expression results

  • gene (str) – Name of the gene to visualize

  • lfc_key (str, optional) – Key in adata.var for log fold change values. If None, will try to infer from kompot_de_ keys.

  • score_key (str, optional) – Key in adata.var for significance scores. If None, will try to infer from kompot_de_ keys.

  • condition1 (str, optional) – Name of condition 1 (denominator in fold change)

  • condition2 (str, optional) – Name of condition 2 (numerator in fold change)

  • basis (str or None, optional) – Key in adata.obsm for the embedding coordinates (default: “X_umap”). If None, will use cell index for x-axis instead of embeddings.

  • figsize (tuple, optional) – Figure size as (width, height) in inches

  • cmap_expression (str, optional) – Colormap for expression plots

  • cmap_fold_change (str, optional) – Colormap for fold change plot

  • title (str, optional) – Overall figure title. If None, uses gene name.

  • run_id (int, optional) – Run ID to use. Default is -1 (latest run).

  • layer (str, optional) – Layer in AnnData to use for expression values. If None, uses adata.X or infers from run information.

  • save (str, optional) – Path to save figure. If None, figure is not saved

  • return_fig (bool, optional) – If True, returns the figure and axes

  • **kwargs – Additional parameters passed to scatter plot functions

Return type:

If return_fig is True, returns (fig, axes)
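A call to plot_gene_expression might be assembled as in the sketch below. The keyword names come from the signature above; the wrapper functions, the gene name "GENE1", and the output file name are hypothetical, and the kompot import is deferred so the snippet can be loaded without the package installed.

```python
from typing import Any, Dict, Optional


def gene_plot_kwargs(gene: str, basis: Optional[str] = "X_umap",
                     save: Optional[str] = None) -> Dict[str, Any]:
    """Collect keyword arguments documented for plot_gene_expression."""
    kwargs: Dict[str, Any] = {
        "gene": gene,
        "basis": basis,      # key in adata.obsm, or None to use the cell index
        "run_id": -1,        # -1 selects the latest run
        "return_fig": True,  # ask for (fig, axes) back
    }
    if save is not None:
        kwargs["save"] = save
    return kwargs


def plot_gene(adata, gene: str, **overrides):
    """Hypothetical wrapper; requires kompot and an AnnData with DE results."""
    from kompot.plot import plot_gene_expression  # deferred import

    kwargs = gene_plot_kwargs(gene)
    kwargs.update(overrides)
    return plot_gene_expression(adata, **kwargs)


# "GENE1" and the file name are placeholders, not values from the source.
example = gene_plot_kwargs("GENE1", save="gene1_expression.png")
```

Passing the arguments as a dict makes it easy to reuse the same settings across several genes.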

Heatmaps

kompot.plot.heatmap(adata: AnnData, var_names: List[str] | Sequence[str] | None = None, groupby: str = None, n_top_genes: int = 20, genes: List[str] | Sequence[str] | None = None, score_key: str | None = None, layer: str | None = None, standard_scale: str | int | None = 'var', cmap: str | Colormap | None = None, dendrogram: bool = False, cluster_rows: bool = True, cluster_cols: bool = True, dendrogram_color: str = 'black', figsize: Tuple[float, float] | None = None, tile_aspect_ratio: float = 1.0, tile_size: float = 0.3, show_gene_labels: bool = True, show_group_labels: bool = True, gene_labels_size: int = 12, group_labels_size: int = 12, colorbar_title: str | None = None, colorbar_kwargs: Dict[str, Any] | None = None, n_colorbar_ticks: int | None = 3, layout_config: Dict[str, float] | None = None, title: str | None = None, sort_genes: bool = True, vcenter: float | str | None = None, vmin: float | str | None = None, vmax: float | str | None = None, ax: Axes | None = None, draw_values: bool = False, return_fig: bool = False, return_data: bool = False, save: str | None = None, run_id: int = -1, condition_column: str | None = None, observed: bool = True, condition1: str | None = None, condition2: str | None = None, condition1_name: str | None = None, condition2_name: str | None = None, exclude_groups: str | List[str] | None = None, fold_change_mode: bool = False, split_dot_mode: bool = False, max_cell_count: int | None = None, **kwargs)

Create a heatmap visualizing gene expression data for two conditions.

By default, the heatmap displays expression values with diagonally split cells, where the lower-left triangle shows values for the first condition and the upper-right triangle shows values for the second condition. This creates a compact visualization that highlights differences between conditions.

When fold_change_mode=True, each cell is a single square colored by the fold change (difference between means) between the two conditions, providing a simpler visualization focused on the differential expression.

When split_dot_mode=True, the heatmap displays dots split in half vertically, where the left half shows values for the first condition and the right half shows values for the second condition. The size of each half-dot is determined by the number of cells in that condition for that group, creating a visualization that highlights both expression differences and relative group sizes simultaneously.

Genes are shown on the y-axis and groups (cell types, clusters, etc.) are shown on the x-axis, with a legend and colorbar positioned to the right of the plot.

Parameters:
  • adata (AnnData) – AnnData object containing expression data

  • var_names (list, optional) – List of genes to include in the heatmap. If None, will use top genes based on score_key.

  • groupby (str, optional) – Key in adata.obs for grouping cells

  • n_top_genes (int, optional) – Number of top genes to include if var_names is None

  • genes (list, optional) – Alternative parameter name for specifying genes to include. Takes precedence over var_names if provided.

  • score_key (str, optional) – Key in adata.var for significance scores. If None, will try to infer from run information.

  • layer (str, optional) – Layer in AnnData to use for expression values. If None, uses .X

  • standard_scale (str or int, optional) – Whether to scale the expression values (‘var’, ‘group’ or 0, 1). Default is ‘var’ for gene-wise z-scoring. When any z-scoring is applied, the colormap is automatically centered at 0 (vcenter=0), uses symmetric limits (equal positive and negative ranges), and uses a divergent colormap unless vcenter, vmin, vmax, or cmap is explicitly specified.

  • cmap (str or colormap, optional) – Colormap to use for the heatmap. If None, defaults to “coolwarm” (divergent) when z-scoring is applied, “Reds” in split dot mode, and “viridis” (sequential) otherwise.

  • dendrogram (bool, optional) – Whether to show dendrograms for hierarchical clustering

  • cluster_rows (bool, optional) – Whether to cluster rows (genes)

  • cluster_cols (bool, optional) – Whether to cluster columns (groups)

  • dendrogram_color (str, optional) – Color for dendrograms

  • figsize (tuple, optional) – Figure size as (width, height) in inches. If None, will be calculated based on data dimensions, tile_size, and tile_aspect_ratio.

  • tile_aspect_ratio (float, optional) – Aspect ratio of individual tiles (width/height). Default is 1.0 (square tiles). Values > 1 create wider tiles, values < 1 create taller tiles.

  • tile_size (float, optional) – Base size in inches for each tile when automatically calculating figure size. Default is 0.3 inches, as given in the signature. For square tiles (tile_aspect_ratio=1), this is the width and height. For non-square tiles, this is the width if tile_aspect_ratio > 1, or the height if tile_aspect_ratio < 1.

  • show_gene_labels (bool, optional) – Whether to show gene labels

  • show_group_labels (bool, optional) – Whether to show group labels

  • gene_labels_size (int, optional) – Font size for gene labels

  • group_labels_size (int, optional) – Font size for group labels

  • colorbar_title (str, optional) – Title for the colorbar. If None, will default to “Z-score” when any z-scoring is applied (standard_scale=‘var’, ‘group’, 0, or 1), and “Expression” otherwise.

  • colorbar_kwargs (dict, optional) – Additional parameters for colorbar customization. Supported keys include:
      - ‘label_kwargs’: dict with parameters for colorbar label (e.g. fontsize, color)
      - ‘locator’: A matplotlib Locator instance for tick positions
      - ‘formatter’: A matplotlib Formatter instance for tick labels
      - Any attribute of matplotlib colorbar instance

  • n_colorbar_ticks (int, optional) – Number of ticks to display in the colorbar. Default is 3. This parameter provides a simple way to control tick density, while the colorbar_kwargs[‘locator’] option provides more fine-grained control if needed.

  • layout_config (dict, optional) – Configuration for controlling plot layout spacing. Keys include:
      - ‘gene_label_space’: Space for gene labels (y-axis), default 3.5
      - ‘group_label_space’: Space for group labels (x-axis), default 2.0
      - ‘title_space’: Space for title, default 3.0
      - ‘base_legend_space’: Base space for legend, default 4.0
      - ‘legend_name_factor’: Factor to adjust legend space based on condition name length, default 0.15
      - ‘colorbar_space’: Space for colorbar, default 3.0
      - ‘row_dendrogram_space’: Space for row dendrogram, default 2.5
      - ‘col_dendrogram_space’: Space for column dendrogram, default 2.5
      - ‘legend_fontsize’: Base font size for legend, default 12
      - ‘legend_fontsize_factor’: Factor to reduce font size for long condition names, default 0.25
      - ‘colorbar_height’: Height proportion of sidebar for colorbar, default 0.5
      - ‘colorbar_width’: Width proportion for colorbar, default 0.25

  • title (str, optional) – Title for the heatmap

  • sort_genes (bool, optional) – Whether to sort genes by score

  • vcenter (float or str, optional) – Value to center the colormap at. If None and any z-scoring is applied (standard_scale=’var’, ‘group’, 0, or 1), the colormap will be centered at 0. If None and no z-scoring is applied, a standard (non-centered) colormap will be used. Can be specified as a percentile using ‘p<number>’ format (e.g., ‘p50’ for median).

  • vmin (float or str, optional) – Minimum value for colormap. If None and z-scoring is applied, will use a symmetric limit based on the maximum absolute value of the data. Can be specified as a percentile using ‘p<number>’ format (e.g., ‘p5’ for 5th percentile).

  • vmax (float or str, optional) – Maximum value for colormap. If None and z-scoring is applied, will use a symmetric limit based on the maximum absolute value of the data. Can be specified as a percentile using ‘p<number>’ format (e.g., ‘p95’ for 95th percentile).

  • ax (matplotlib.axes.Axes, optional) – Axes to plot on. If None, creates new figure

  • draw_values (bool) – Whether to draw the values in the heatmap cells. Default is False.

  • return_fig (bool, optional) – If True, returns the figure and axes

  • return_data (bool, optional) – If True, returns the expression means and fold-changes used for the heatmap

  • save (str, optional) – Path to save figure. If None, figure is not saved

  • run_id (int, optional) – Specific run ID to use for fetching field names from run history. -1 (default) is the latest run.

  • condition_column (str, optional) – Column in adata.obs containing condition information. If None, tries to infer from run_info.

  • observed (bool, optional) – Whether to use only observed combinations in groupby operations.

  • condition1 (str, optional) – Name of the first condition to compare. If None, tries to infer from run_info. Must match a value in the condition_column in adata.obs.

  • condition2 (str, optional) – Name of the second condition to compare. If None, tries to infer from run_info. Must match a value in the condition_column in adata.obs.

  • condition1_name (str, optional) – Display name for condition1 in the plot legend and title. If None, defaults to the value of condition1.

  • condition2_name (str, optional) – Display name for condition2 in the plot legend and title. If None, defaults to the value of condition2.

  • exclude_groups (str or list, optional) – Group name(s) to exclude from the heatmap.

  • fold_change_mode (bool, optional) – Whether to use fold change coloring instead of split tiles

  • split_dot_mode (bool, optional) – Whether to use split dots instead of split tiles. When True, the size of each half-dot represents the number of cells in that condition for that group

  • max_cell_count (int, optional) – Upper limit for cell count used for dot sizing. If provided, all dots will be scaled relative to this maximum value, even if actual cell counts exceed it. This helps maintain readable visualization when some groups have much larger cell counts than others.

  • **kwargs – Additional keyword arguments passed to matplotlib

Returns:

  • If return_fig is True and dendrogram is False, returns (fig, ax)

  • If return_fig is True and dendrogram is True, returns (fig, ax, dendrogram_axes)
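
The ‘p<number>’ percentile strings accepted by vmin, vmax, and vcenter can be pictured with a small stand-alone helper. This is only a sketch: the name resolve_limit and the linear-interpolation rule are assumptions made for illustration, and Kompot's internal handling may differ.

```python
from typing import Optional, Sequence, Union


def resolve_limit(spec: Union[float, str, None], values: Sequence[float]) -> Optional[float]:
    """Turn a vmin/vmax/vcenter-style spec into a number (hypothetical helper)."""
    if spec is None:
        return None  # leave the choice to the plotting defaults
    if isinstance(spec, str):
        if not spec.startswith("p"):
            raise ValueError(f"expected 'p<number>', got {spec!r}")
        q = float(spec[1:])  # 'p95' -> 95.0
        if not 0.0 <= q <= 100.0:
            raise ValueError("percentile must lie in [0, 100]")
        ordered = sorted(values)
        if not ordered:
            raise ValueError("cannot take a percentile of empty data")
        # Linear interpolation between the two nearest ranks (assumed rule).
        pos = (len(ordered) - 1) * q / 100.0
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)
    return float(spec)  # plain numbers pass through unchanged


data = [0.0, 1.0, 2.0, 3.0, 4.0]  # toy values, not real expression data
print(resolve_limit("p50", data))
print(resolve_limit("p5", data), resolve_limit("p95", data))
print(resolve_limit(1.5, data))
```

Percentile limits are useful when a few extreme values would otherwise stretch the colormap; passing ‘p5’ and ‘p95’ clips the color range to the bulk of the data.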

Direction Plots

kompot.plot.direction_barplot(adata: AnnData, category_column: str, direction_column: str | None = None, condition1: str | None = None, condition2: str | None = None, normalize: Literal['index', 'columns', None] = 'index', figsize: Tuple[float, float] = (12, 6), title: str | None = None, xlabel: str | None = None, ylabel: str | None = None, colors: Dict[str, str] | None = None, rotation: float = 90, legend_title: str = 'Direction', legend_loc: str = 'best', stacked: bool = True, sort_by: str | None = None, ascending: bool = False, category_order: List[str] | None = None, ax: Axes | None = None, return_fig: bool = False, save: str | None = None, run_id: int = -1, **kwargs) → Figure | None

Create a barplot showing the direction of change distribution across categories.

This function creates a stacked or grouped barplot showing the distribution of up/down/neutral changes across different categories (like cell types).

Parameters:
  • adata (AnnData) – AnnData object containing differential abundance results

  • category_column (str) – Column in adata.obs to use for grouping (e.g., “cell_type”)

  • direction_column (str, optional) – Column in adata.obs containing direction information. If None, will try to infer from the run specified by run_id.

  • condition1 (str, optional) – Name of condition 1 (denominator in fold change). If None, will try to infer from the run_id.

  • condition2 (str, optional) – Name of condition 2 (numerator in fold change). If None, will try to infer from the run_id.

  • normalize (str or None, optional) – How to normalize the data. Options: “index” (normalize rows), “columns” (normalize columns), or None (raw counts).

  • figsize (tuple, optional) – Figure size as (width, height) in inches

  • title (str, optional) – Plot title. If None and conditions provided, uses “Direction of Change by {category_column}\n{condition1} to {condition2}”

  • xlabel (str, optional) – Label for x-axis. If None, uses the category_column

  • ylabel (str, optional) – Label for y-axis. Defaults to “Percentage (%)” when normalize=“index”, otherwise “Count”

  • colors (dict, optional) – Dictionary mapping direction values to colors.

  • rotation (float, optional) – Rotation angle for x-tick labels

  • legend_title (str, optional) – Title for the legend

  • legend_loc (str, optional) – Location for the legend

  • stacked (bool, optional) – Whether to create a stacked (True) or grouped (False) bar plot

  • sort_by (str, optional) – Direction category to sort by (e.g., “up”, “down”). If None, uses the order in the data

  • ascending (bool, optional) – Whether to sort in ascending order.

  • category_order (list of str, optional) – Specific categories and their order to display. Defaults to data order.

  • ax (matplotlib.axes.Axes, optional) – Axes to plot on. If None, creates new figure

  • return_fig (bool, optional) – If True, returns the figure and axes

  • save (str, optional) – Path to save figure. If None, figure is not saved

  • run_id (int, optional) – Specific run ID to use for fetching data from run history. Negative indices count from the end (-1 is the latest run).

Returns:

If return_fig is True, returns (fig, ax)

Return type:

tuple or None
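
The three normalize options can be illustrated without any plotting. The sketch below tabulates (category, direction) pairs in pure Python; the helper name direction_table, the toy data, and the scaling to percentages are assumptions for illustration and are not taken from Kompot's source.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def direction_table(
    pairs: Iterable[Tuple[str, str]], normalize: Optional[str] = "index"
) -> Dict[str, Dict[str, float]]:
    """Tabulate (category, direction) pairs; hypothetical helper."""
    counts = Counter(pairs)
    categories = sorted({c for c, _ in counts})
    directions = sorted({d for _, d in counts})
    table = {c: {d: float(counts[(c, d)]) for d in directions} for c in categories}
    if normalize == "index":  # each category (row) sums to 100
        for row in table.values():
            total = sum(row.values())
            for d in row:
                row[d] = 100.0 * row[d] / total
    elif normalize == "columns":  # each direction (column) sums to 100
        for d in directions:
            total = sum(table[c][d] for c in categories)
            for c in categories:
                table[c][d] = 100.0 * table[c][d] / total
    elif normalize is not None:
        raise ValueError("normalize must be 'index', 'columns', or None")
    return table


# Toy (cell type, direction) labels, one pair per cell.
cells = [("B", "up"), ("B", "up"), ("B", "down"), ("T", "neutral"), ("T", "up")]
for name, row in direction_table(cells, normalize="index").items():
    print(name, row)
```

With normalize="index" every bar has the same total height, which makes categories of very different sizes comparable; normalize=None keeps raw counts.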

Smoothing Plots

kompot.plot.plot_smoothing(adata, genes: List[str] | None = None, n_top_genes: int = 6, basis: str = 'X_umap', result_key: str = 'kompot_smooth', condition: str | None = None, layer: str | None = None, show_obs_variance: bool = True, cmap: str = 'Spectral_r', cmap_std: str = 'magma', figsize_per_panel: tuple = (3.5, 3.0), title: str | None = None, save: str | None = None, return_fig: bool = False, **kwargs) → Figure | None

Plot raw vs. GP-smoothed expression with uncertainty.

Shows a grid with the following rows:

1. Raw expression
2. GP-smoothed expression
3. Epistemic std (GP posterior, shared across genes), or total std if no obs_variance is available
4. Aleatoric std (sqrt of obs_variance, per gene), shown only if available

Uses scanpy.pl.embedding internally when available, falling back to manual scatter plots otherwise.

Parameters:
  • adata (AnnData) – AnnData object with smoothing results.

  • genes (list of str, optional) – Genes to plot. If None, selects top genes by mean smoothed value.

  • n_top_genes (int) – Number of genes when genes is None. Max 8.

  • basis (str) – Key in adata.obsm for 2-D coordinates.

  • result_key (str) – Base key used in smooth_expression().

  • condition (str, optional) – Condition label in the layer names. If None, auto-detected from available layers.

  • layer (str, optional) – Layer with raw expression. If None, uses adata.X.

  • show_obs_variance (bool) – Show obs_variance row if available.

  • cmap (str) – Colormap for expression values.

  • cmap_std (str) – Colormap for uncertainty panels.

  • figsize_per_panel (tuple) – Size of each subplot (width, height).

  • title (str, optional) – Overall figure title.

  • return_fig (bool) – If True, return the Figure instead of calling plt.show().

  • **kwargs – Extra keyword arguments forwarded to scanpy.pl.embedding (e.g. s, size, vmin, vmax).

Return type:

Figure or None
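
To make the row logic concrete, the sketch below works out which rows would appear and how large the figure would be. It assumes one column per gene and an overall figure size equal to the panel size times the grid shape; both are guesses made for illustration, and smoothing_grid is a hypothetical helper, not part of Kompot.

```python
from typing import List, Tuple


def smoothing_grid(
    n_genes: int,
    has_obs_variance: bool,
    show_obs_variance: bool = True,
    figsize_per_panel: Tuple[float, float] = (3.5, 3.0),
) -> Tuple[List[str], Tuple[float, float]]:
    """Return the row labels and an assumed overall figure size."""
    rows = ["raw expression", "GP-smoothed expression"]
    if has_obs_variance:
        rows.append("epistemic std")
        if show_obs_variance:
            rows.append("aleatoric std")  # sqrt of obs_variance, per gene
    else:
        rows.append("total std")
    width, height = figsize_per_panel
    # Assumption: genes are laid out as columns, quantities as rows.
    return rows, (n_genes * width, len(rows) * height)


rows, size = smoothing_grid(n_genes=6, has_obs_variance=True)
print(rows)
print(size)
```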

Embedding Plots

kompot.plot.embedding(adata: AnnData, basis: str, groups: Dict[str, str | List[str]] | str | List[str] | None = None, background_color: str | None = 'lightgrey', matplotlib_scatter_kwargs: Dict[str, Any] | None = None, mgroups: List[Dict[str, str | List[str]]] | Dict[str, Dict[str, str | List[str]]] | None = None, ncols: int | None = None, save: str | None = None, return_fig: bool = False, **kwargs) → Figure | None

Plot embeddings with group filtering capabilities.

This function wraps scanpy’s plotting.embedding function but adds the ability to filter cells based on observation column values. Selected cells are plotted normally using scanpy, while non-selected cells can be displayed in a different color in the background.

Parameters:
  • adata (AnnData) – AnnData object containing the embedding coordinates.

  • basis (str) – Key for the embedding coordinates. Same as scanpy’s basis parameter.

  • groups (Dict[str, Union[str, List[str]]] or str or List[str], optional) – If a dictionary: keys are column names in adata.obs and values are lists or individual allowed values. Only cells matching ALL conditions will be highlighted. If a string: Same as scanpy’s groups parameter for categorical groupby. If None: all cells are shown normally.

  • background_color (str, optional) – Color for non-selected cells. If None, background cells are not shown. Default is “lightgrey”.

  • matplotlib_scatter_kwargs (Dict[str, Any], optional) – Additional keyword arguments to pass to matplotlib’s scatter function when plotting background cells. Common options include ‘alpha’, ‘s’ (size), ‘edgecolors’, and ‘zorder’. Defaults match scanpy’s styling with {‘zorder’: 0, ‘edgecolors’: ‘none’, ‘linewidths’: 0, ‘alpha’: 0.7}.

  • mgroups (List[Dict[str, Union[str, List[str]]]] or Dict[str, Dict[str, Union[str, List[str]]]], optional) – List or dictionary of groups dictionaries to create multiple panels. Each element is treated as a separate groups argument in its own subplot. Cannot be used with multiple colors. If provided as a list, title argument should align with the number of groups in mgroups. If provided as a dictionary, the keys will be used as title names unless titles is explicitly provided. If titles is provided but too short, a warning will be issued and the dictionary keys will be used for the remaining panels. Cannot be used when layer is a list.

  • ncols (int, optional) – Number of columns for the panel layout when using mgroups, or when layer or color is a list. Default is 4, or fewer if there are fewer panels.

  • **kwargs

    All other parameters are passed directly to scanpy.pl.embedding. See scanpy.pl.embedding documentation for details on available parameters.

    When layer is a list, each layer is plotted in a separate panel (only when color is not a list and mgroups is not used).

Returns:

  • Whatever scanpy.pl.embedding returns based on your kwargs.

  • If return_fig=True, returns the figure or (figure, axes) depending on scanpy version.

  • Otherwise returns None.

Notes

This function requires scanpy. If scanpy is not available, it will raise a warning. See scanpy.pl.embedding documentation for full details of base plotting parameters.
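
The dictionary form of groups selects cells that satisfy every condition at once. A minimal pure-Python sketch of that AND logic follows; select_cells and the toy obs table are hypothetical and only mirror the documented behavior, not Kompot's implementation.

```python
from typing import Dict, List, Sequence, Union

GroupSpec = Dict[str, Union[str, List[str]]]


def select_cells(obs: Dict[str, Sequence[str]], groups: GroupSpec) -> List[bool]:
    """Return one flag per cell: True when the cell satisfies every condition."""
    n_cells = len(next(iter(obs.values())))
    mask = [True] * n_cells
    for column, allowed in groups.items():
        # A single string is treated as a one-element list of allowed values.
        allowed_set = {allowed} if isinstance(allowed, str) else set(allowed)
        mask = [keep and value in allowed_set for keep, value in zip(mask, obs[column])]
    return mask


# Toy stand-in for adata.obs, stored column by column.
obs = {
    "cell_type": ["B", "T", "T", "NK"],
    "condition": ["ctrl", "ctrl", "stim", "stim"],
}
groups = {"cell_type": ["T", "NK"], "condition": "stim"}
print(select_cells(obs, groups))
```

Cells flagged False correspond to the background cells, which are drawn in background_color or omitted when background_color is None.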

StringDB Integration

The kompot.plot.StringDBReport class provides tools to generate gene set reports with the StringDB network and resource links.

class kompot.plot.StringDBReport(genes: List[str], species_id: int = 9606, include_stringdb: bool = True, include_resources: bool = True, include_enrichment: bool = False)

Generate rich gene set reports with StringDB integration.

This class provides tools to generate rich HTML reports for gene sets, including StringDB network visualization, resource links, and other gene information. It’s designed to work well in Jupyter notebooks but can also be used programmatically.

Parameters:
  • genes (List[str]) – List of gene symbols to include in the report

  • species_id (int, optional) – NCBI taxonomy ID for species (default: 9606 for Homo sapiens)

  • include_stringdb (bool, optional) – Include StringDB network image and links (default: True)

  • include_resources (bool, optional) – Include external resource links for genes (default: True)

  • include_enrichment (bool, optional) – Include functional enrichment analysis (default: False)

genes

List of gene symbols included in the report

Type:

List[str]

species_id

NCBI taxonomy ID for the species

Type:

int

string_db_base_url

Base URL for StringDB API and web interface

Type:

str

Notes

Supported species IDs and their names:

Species ID    Species Name
9606          Homo sapiens
10090         Mus musculus
10116         Rattus norvegicus
7227          Drosophila melanogaster
6239          Caenorhabditis elegans
4932          Saccharomyces cerevisiae
3702          Arabidopsis thaliana

Additional species IDs can be used but won’t have mapped names in the report. For the full list of available species, see the StringDB website.

display(additional_genes: List[str] | None = None) → None

Display the report in a Jupyter notebook.

Parameters:

additional_genes (List[str], optional) – Additional genes to include in the StringDB visualizations

fetch_stringdb_image(additional_genes: List[str] | None = None) → bytes | None

Fetch StringDB network image as bytes.

Parameters:

additional_genes (List[str], optional) – Additional genes to include in the StringDB image

Returns:

Image bytes or None if fetch failed

Return type:

Optional[bytes]

get_functional_enrichment(category: str = 'Process', fdr_threshold: float = 0.05) → DataFrame | None

Get functional enrichment analysis for the gene set.

This method fetches functional enrichment results through StringDB’s enrichment API.

Parameters:
  • category (str, optional) – Category for enrichment analysis (default: “Process”). Valid options:

      - Process: Gene Ontology biological processes
      - Component: Gene Ontology cellular components
      - Function: Gene Ontology molecular functions
      - KEGG: KEGG pathways
      - Pfam: Protein domain annotations from Pfam
      - InterPro: Protein domain annotations from InterPro
      - SMART: Protein domain annotations from SMART
      - Keywords: UniProt keyword annotations
      - Reactome: Reactome pathway annotations
      - WikiPathways: WikiPathways annotations

  • fdr_threshold (float, optional) – FDR threshold for significance (default: 0.05)

Returns:

DataFrame with enrichment results or None if request failed

Return type:

Optional[pd.DataFrame]

Notes

The enrichment results include various columns depending on the category:

  - term: Identifier for the enriched term (e.g., GO:0006281)
  - description: Human-readable description of the term
  - signal: Balanced metric combining enrichment magnitude and significance (higher is better)
  - strength: Log10(observed/expected) indicating enrichment effect size
  - fdr: False discovery rate (adjusted p-value)
  - number_of_genes: Number of genes from the input that match this term
  - inputGenes: List of input genes that match this term

Results are sorted by signal (descending) following StringDB’s default behavior. Different categories have different levels of annotation coverage. For example, GO Process usually provides the most annotations, while specific pathway databases may have more limited coverage.
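
The filtering and ordering described here can be sketched on plain dictionaries. The rows below are made-up placeholders rather than real enrichment output, filter_enrichment is a hypothetical helper, and treating the threshold as inclusive (fdr <= fdr_threshold) is an assumption.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, float, int]]


def filter_enrichment(rows: List[Row], fdr_threshold: float = 0.05) -> List[Row]:
    """Keep rows passing the FDR cutoff, ordered by signal (highest first)."""
    kept = [row for row in rows if row["fdr"] <= fdr_threshold]
    return sorted(kept, key=lambda row: row["signal"], reverse=True)


# Placeholder rows with invented numbers, shaped like the columns listed above.
toy_rows = [
    {"term": "TERM_A", "signal": 0.4, "strength": 0.9, "fdr": 0.010},
    {"term": "TERM_B", "signal": 1.2, "strength": 1.5, "fdr": 0.001},
    {"term": "TERM_C", "signal": 2.0, "strength": 0.3, "fdr": 0.200},
]
for row in filter_enrichment(toy_rows):
    print(row["term"], row["signal"], row["fdr"])
```

Note that TERM_C has the highest signal in the toy data but is dropped because its FDR exceeds the threshold; the cutoff is applied before ranking.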

get_json(additional_genes: List[str] | None = None) → Dict[str, Any]

Generate a JSON representation of the gene report.

Parameters:

additional_genes (List[str], optional) – Additional genes to include in the StringDB visualizations

Returns:

JSON-serializable dictionary with report data

Return type:

Dict[str, Any]

get_resource_links(gene: str) → Dict[str, str]

Generate external resource links for a gene.

Parameters:

gene (str) – Gene symbol to generate links for

Returns:

Dictionary mapping resource names to URLs

Return type:

Dict[str, str]

get_species_name() → str

Get human-readable species name from species ID.

get_stringdb_image_url(additional_genes: List[str] | None = None) → str

Generate URL for StringDB network image.

Parameters:

additional_genes (List[str], optional) – Additional genes to include in the StringDB image

Returns:

URL for StringDB network image

Return type:

str
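
For orientation, a network image URL for the public STRING API can be assembled as in the sketch below. It follows the documented STRING pattern of an /api/image/network endpoint with carriage-return-separated identifiers, but the exact URL that StringDBReport builds, and the value of string_db_base_url, are assumptions here; network_image_url is a hypothetical stand-alone function and makes no network request.

```python
from typing import List, Optional
from urllib.parse import urlencode

STRING_BASE_URL = "https://string-db.org"  # assumed value of string_db_base_url


def network_image_url(
    genes: List[str],
    species_id: int = 9606,
    additional_genes: Optional[List[str]] = None,
) -> str:
    """Build a STRING network image URL (illustrative, not Kompot's code)."""
    identifiers = list(genes) + list(additional_genes or [])
    # STRING expects multiple identifiers separated by a carriage return,
    # which urlencode percent-encodes as %0D.
    query = urlencode({"identifiers": "\r".join(identifiers), "species": species_id})
    return f"{STRING_BASE_URL}/api/image/network?{query}"


url = network_image_url(["TP53", "BRCA1"], additional_genes=["ATM"])
print(url)
```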

get_stringdb_url(additional_genes: List[str] | None = None) → str

Generate URL for StringDB network visualization.

Parameters:

additional_genes (List[str], optional) – Additional genes to include in the StringDB query

Returns:

URL for StringDB network visualization

Return type:

str

save_html(filename: str, additional_genes: List[str] | None = None) → None

Save the report as an HTML file.

Parameters:
  • filename (str) – Path to save the HTML file

  • additional_genes (List[str], optional) – Additional genes to include in the StringDB visualizations

to_dataframe() → DataFrame

Convert gene resource links to a pandas DataFrame.

Returns:

DataFrame with genes as index and resource links as columns

Return type:

pd.DataFrame