boltzkit.evaluation.sample_based.histogram_comparison

Functions

get_histogram_fwd_kullback_leibler(hist_p, ...)

Compute the forward Kullback-Leibler divergence between two histograms.

get_histogram_jensen_shannon_divergence(...)

Compute the Jensen-Shannon divergence between two histograms.

get_histogram_metrics(hist_metrics, true, ...)

Evaluate multiple histogram metrics on paired histograms.

get_histogram_total_variation_distance(...)

Compute the total variation distance between two histograms.

Classes

HistogramMetric

Protocol for histogram distance or divergence metrics.

class boltzkit.evaluation.sample_based.histogram_comparison.HistogramMetric[source]

Bases: Protocol

Protocol for histogram distance or divergence metrics.

A histogram metric is a callable object that compares two histograms with identical binning and returns a scalar score.

Implementations must also define an id property used for logging and aggregation.

property id: str
__init__(*args, **kwargs)
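Since `HistogramMetric` is a structural `Protocol`, any callable class exposing an `id` property conforms to it. The sketch below is illustrative only: the `FakeHistogram` stand-in (a density array plus a uniform bin width) is an assumption for this example and is not the actual histogram class used by boltzkit.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class FakeHistogram:
    """Stand-in histogram: normalized densities plus a uniform bin width.

    The real Histogram1D/Histogram2D classes live elsewhere in boltzkit;
    this minimal version exists only to illustrate the protocol.
    """
    density: np.ndarray
    bin_width: float


class TotalVariationMetric:
    """Example metric satisfying the HistogramMetric protocol."""

    @property
    def id(self) -> str:
        # Used for logging and aggregation keys.
        return "tvd"

    def __call__(self, hist_p: FakeHistogram, hist_q: FakeHistogram) -> float:
        # Discrete approximation of 0.5 * integral |P(x) - Q(x)| dx.
        diff = np.abs(hist_p.density - hist_q.density)
        return float(0.5 * np.sum(diff) * hist_p.bin_width)


p = FakeHistogram(np.array([0.5, 0.5]), bin_width=1.0)
metric = TotalVariationMetric()
print(metric.id, metric(p, p))  # identical histograms give distance 0.0
```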
boltzkit.evaluation.sample_based.histogram_comparison.get_histogram_fwd_kullback_leibler(hist_p: T, hist_q: T)[source]

Compute the forward Kullback-Leibler divergence between two histograms.

The divergence is defined as:

\[KL(P \parallel Q) = \int P(x) \log \frac{P(x)}{Q(x)} dx\]

and approximated using a discrete sum over histogram bins.

A small constant is added for numerical stability.

Parameters:
  • hist_p (Histogram1D or Histogram2D) – Histogram of the reference distribution \(P\).

  • hist_q (Histogram1D or Histogram2D) – Histogram of the approximating distribution \(Q\), with the same binning as hist_p.
Returns:

Estimated KL divergence \(KL(P \parallel Q)\).

Return type:

float

Notes

  • Histograms must have identical bin structure.

  • The bin area is used to approximate the continuous integral.
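The discrete approximation can be sketched as follows. This is not the boltzkit implementation: the function name `forward_kl`, the raw-array representation of the histograms, and the epsilon value are assumptions for illustration.

```python
import numpy as np


def forward_kl(p_density, q_density, bin_area, eps=1e-12):
    """Discrete approximation of KL(P || Q) over shared histogram bins.

    A small epsilon guards against log(0) and division by zero, matching
    the numerical-stability note above.
    """
    p = np.asarray(p_density, dtype=float) + eps
    q = np.asarray(q_density, dtype=float) + eps
    # Sum p * log(p / q) over bins, scaled by bin area to approximate
    # the continuous integral.
    return float(np.sum(p * np.log(p / q)) * bin_area)


p = np.array([0.25, 0.25, 0.25, 0.25])
print(forward_kl(p, p, bin_area=1.0))  # 0.0 for identical histograms
```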

boltzkit.evaluation.sample_based.histogram_comparison.get_histogram_total_variation_distance(hist_p: T, hist_q: T)[source]

Compute the total variation distance between two histograms.

Defined as:

\[TV(P, Q) = \frac{1}{2} \int |P(x) - Q(x)| dx\]

Approximated via summation over histogram bins.

Parameters:
  • hist_p (Histogram1D or Histogram2D) – First histogram (\(P\)).

  • hist_q (Histogram1D or Histogram2D) – Second histogram (\(Q\)), with the same binning as hist_p.
Returns:

Total variation distance in \([0, 1]\).

Return type:

float

Notes

  • Requires identical binning.

  • Uses bin area for continuous approximation.
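A minimal sketch of the bin-wise approximation, showing that disjoint supports attain the upper bound of 1. The function name and array-based inputs are assumptions for illustration, not the library's API.

```python
import numpy as np


def total_variation(p_density, q_density, bin_area):
    """Discrete approximation of TV(P, Q) = 0.5 * integral |P - Q| dx."""
    diff = np.abs(np.asarray(p_density, float) - np.asarray(q_density, float))
    return float(0.5 * np.sum(diff) * bin_area)


# Histograms with disjoint supports attain the maximal distance of 1.
p = np.array([1.0, 0.0])  # all mass in the first bin
q = np.array([0.0, 1.0])  # all mass in the second bin
print(total_variation(p, q, bin_area=1.0))  # 1.0
```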

boltzkit.evaluation.sample_based.histogram_comparison.get_histogram_jensen_shannon_divergence(hist_p: T, hist_q: T)[source]

Compute the Jensen-Shannon divergence between two histograms.

Defined as:

\[JSD(P, Q) = \frac{1}{2} KL(P \parallel M) + \frac{1}{2} KL(Q \parallel M)\]

where:

\[M = \frac{1}{2}(P + Q)\]
Parameters:
  • hist_p (Histogram1D or Histogram2D) – First histogram (\(P\)).

  • hist_q (Histogram1D or Histogram2D) – Second histogram (\(Q\)), with the same binning as hist_p.
Returns:

Jensen-Shannon divergence (non-negative, symmetric).

Return type:

float

Notes

  • Histograms must have identical binning.

  • A small epsilon is used for numerical stability.
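The definition above reduces to two forward-KL evaluations against the mixture \(M\), which can be sketched as below. The helper names and array representation are assumptions for illustration; the actual implementation may differ.

```python
import numpy as np


def forward_kl(p_density, q_density, bin_area, eps=1e-12):
    """Discrete KL(P || Q); epsilon avoids log(0)."""
    p = np.asarray(p_density, dtype=float) + eps
    q = np.asarray(q_density, dtype=float) + eps
    return float(np.sum(p * np.log(p / q)) * bin_area)


def jensen_shannon(p_density, q_density, bin_area):
    """JSD(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), M = (P + Q) / 2."""
    m = 0.5 * (np.asarray(p_density, float) + np.asarray(q_density, float))
    return (0.5 * forward_kl(p_density, m, bin_area)
            + 0.5 * forward_kl(q_density, m, bin_area))


p = np.array([0.8, 0.2])
q = np.array([0.2, 0.8])
# Symmetric by construction: JSD(P, Q) == JSD(Q, P).
print(jensen_shannon(p, q, 1.0), jensen_shannon(q, p, 1.0))
```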

boltzkit.evaluation.sample_based.histogram_comparison.get_histogram_metrics(hist_metrics: list[HistogramMetric], true: list[T], pred: list[T], *, group: str, h_type: str | None = None, include_individual: bool = True, include_aggregated: bool = True)[source]

Evaluate multiple histogram metrics on paired histograms.

Each metric is applied to corresponding pairs from true and pred. Results are returned as a flat dictionary with structured keys.

Keys follow the format:

{group}/{h_type}_{metric_id}_{i}
{group}/{metric_id}_{i}          (if h_type is None)

Aggregated statistics (mean over histogram pairs) are optionally included:

{group}/{h_type}_{metric_id}_mean
{group}/{metric_id}_mean
Parameters:
  • hist_metrics (list[HistogramMetric]) – Metrics to evaluate.

  • true (list[Histogram1D or Histogram2D]) – Ground-truth histograms.

  • pred (list[Histogram1D or Histogram2D]) – Predicted histograms (same length as true).

  • group (str) – Metric group name (e.g., "torsion").

  • h_type (str, optional) – Histogram type identifier (e.g., "phi_psi").

  • include_individual (bool, default=True) – Whether to include per-histogram metric values.

  • include_aggregated (bool, default=True) – Whether to include mean values across histogram pairs.

Returns:

Dictionary of computed metrics.

Return type:

dict[str, float]

Notes

  • true and pred must have equal length.

  • Metrics are evaluated pairwise using zip.
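The key-naming and aggregation scheme described above can be sketched as follows. This is an illustrative reimplementation, not the boltzkit source: here metrics are plain `(metric_id, callable)` pairs standing in for real `HistogramMetric` objects, and histograms are bare numpy arrays.

```python
import numpy as np


def evaluate_pairs(metrics, true, pred, *, group, h_type=None,
                   include_individual=True, include_aggregated=True):
    """Sketch of the pairwise evaluation and flat-key construction."""
    prefix = f"{group}/{h_type}_" if h_type is not None else f"{group}/"
    results = {}
    for metric_id, fn in metrics:
        # Metrics are evaluated pairwise over zipped (true, pred) histograms.
        values = [fn(t, p) for t, p in zip(true, pred)]
        if include_individual:
            for i, v in enumerate(values):
                results[f"{prefix}{metric_id}_{i}"] = v
        if include_aggregated:
            # Aggregation is the mean over histogram pairs.
            results[f"{prefix}{metric_id}_mean"] = float(np.mean(values))
    return results


# Toy total-variation metric on raw density arrays (bin area of 1).
tvd = lambda t, p: 0.5 * float(np.sum(np.abs(t - p)))

out = evaluate_pairs(
    [("tvd", tvd)],
    true=[np.array([1.0, 0.0]), np.array([0.5, 0.5])],
    pred=[np.array([0.0, 1.0]), np.array([0.5, 0.5])],
    group="torsion", h_type="phi_psi",
)
print(sorted(out))
# keys: torsion/phi_psi_tvd_0, torsion/phi_psi_tvd_1, torsion/phi_psi_tvd_mean
```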