Grid-Based Inference

BESTA includes a direct model-grid inference workflow that does not require CosmoSIS samplers. This mode is useful when you already have a finite model library and want fast posterior evaluation over that library.

Why use the grid module instead of CosmoSIS sampling?

Use the grid module when:

  • Your model space is already discretized (for example: precomputed SED/SSP libraries).

  • You want robust, repeatable inference with no MCMC tuning.

  • You need high throughput over many objects using candidate selection and parallel workers.

  • You want direct control over priors/likelihoods in Python.

Prefer the CosmoSIS pipeline (see Pipeline Manager) when:

  • Your parameter space is continuous and not naturally represented by a fixed grid.

  • You need sampler diagnostics and chain-level convergence checks.

  • You rely on existing module wiring in the CosmoSIS runtime.

Core Concepts

Minimal Workflow

  1. Build or load a ModelGrid.

  2. Create a GridFitter with a likelihood and prior.

  3. Evaluate posterior summaries for one object, or run fit_batch() for many.

Example (single-object posterior on one target):

import numpy as np
from besta.grid import ModelGrid, GridFitter
from besta.grid.prob import GaussianProductLikelihood, FlatPrior

# obs_models and target_models are NumPy arrays you supply:
# N models, P observables, Q targets
grid = ModelGrid(
    observables=obs_models,                # shape (N, P)
    targets=target_models,                 # shape (N, Q)
    observable_names=["mag_g", "mag_r", "mag_i"],
    target_names=["logM", "age", "Z", "z"],
)

fitter = GridFitter(
    grid=grid,
    likelihood=GaussianProductLikelihood(),
    prior=FlatPrior(),
    use_standardised=True,
)

x = np.array([22.1, 21.5, 21.2])          # observed data (P,)
sx = np.array([0.03, 0.03, 0.04])         # observational errors (P,)
bins = np.linspace(7.0, 12.0, 101)        # bins for logM

post_logM, centers = fitter.posterior_over_target(
    x_native=x,
    sigma_native=sx,
    target_col="logM",
    bins=bins,
)
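To make explicit what a posterior over a finite model set amounts to, the following NumPy-only sketch weights every model by an independent Gaussian likelihood with a flat prior, normalises over the grid, and histograms the weights along one target column. It illustrates the general technique and is not BESTA's implementation: the helper name grid_posterior_over_target and the toy grid are hypothetical, and options such as use_standardised are ignored.

```python
import numpy as np

def grid_posterior_over_target(obs_models, target_values, x, sigma, bins):
    """Hypothetical helper: posterior mass per bin of one target quantity."""
    # Chi-square of each of the N models against the data (independent Gaussians).
    chi2 = np.sum(((obs_models - x) / sigma) ** 2, axis=1)
    log_w = -0.5 * chi2              # a flat prior only adds a constant
    log_w -= log_w.max()             # shift before exponentiating to avoid underflow
    w = np.exp(log_w)
    w /= w.sum()                     # normalise over the finite grid
    post, edges = np.histogram(target_values, bins=bins, weights=w)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return post, centers

# Toy grid (assumption): magnitudes that depend linearly on logM.
rng = np.random.default_rng(0)
toy_logM = rng.uniform(7.0, 12.0, size=5000)                              # shape (N,)
toy_obs = 40.0 - 2.0 * toy_logM[:, None] + np.array([0.6, 0.0, -0.3])     # shape (N, 3)

x = np.array([22.1, 21.5, 21.2])
sx = np.array([0.03, 0.03, 0.04])
post, centers = grid_posterior_over_target(
    toy_obs, toy_logM, x, sx, np.linspace(7.0, 12.0, 101)
)
```

Because the weights are normalised over the grid, post sums to one whenever every target value falls inside the bin range.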

Batch Inference (many objects)

For catalogs, use fit_batch(). It supports:

  • optional candidate selection (binner=),

  • thread/process parallelism (n_jobs, backend),

  • posterior truncation controls (posterior_keep_mass, etc.),

  • optional per-target summary statistics (stats_for, stats_bins),

  • optional HDF5 output (output_hdf5_path).

from besta.grid.binning import KDTreeBinner

binner = KDTreeBinner(dims=[0, 1, 2]).fit(grid)  # select candidates in observable space

results = fitter.fit_batch(
    X_native=X_catalog,                    # shape (M, P)
    SIG_native=SIG_catalog,                # shape (M, P)
    binner=binner,
    n_jobs=8,
    backend="thread",
    stats_for=["logM", "age"],
    stats_bins=[np.linspace(7, 12, 120), np.linspace(0, 14, 120)],
    output_hdf5_path="grid_fit_results.h5",
    output_hdf5_group="/run1",
    return_mode="iter",
)

for r in results:
    # r contains: m, candidates, post_models, truncation, and optional stats
    pass
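Two of the ideas listed above, candidate selection and posterior truncation, can be sketched in plain NumPy. This is a conceptual illustration under stated assumptions, not the algorithm used by KDTreeBinner or by posterior_keep_mass: the helpers box_candidates and truncate_posterior, the n_sigma cut, and the toy grid are all hypothetical.

```python
import numpy as np

def box_candidates(obs_models, x, sigma, n_sigma=5.0):
    """Hypothetical preselection: models within n_sigma of the data in every observable."""
    inside = np.all(np.abs(obs_models - x) <= n_sigma * sigma, axis=1)
    return np.flatnonzero(inside)

def truncate_posterior(weights, keep_mass=0.999):
    """Hypothetical truncation: smallest set of models holding keep_mass of the total weight."""
    order = np.argsort(weights)[::-1]                 # most probable first
    cum = np.cumsum(weights[order])
    n_keep = int(np.searchsorted(cum, keep_mass * cum[-1])) + 1
    return order[:min(n_keep, weights.size)]

# Toy grid (assumption): 20000 models, 3 observables.
rng = np.random.default_rng(1)
obs_models = rng.normal(21.5, 1.0, size=(20000, 3))
x = np.array([22.1, 21.5, 21.2])
sigma = np.array([0.3, 0.3, 0.4])

cand = box_candidates(obs_models, x, sigma)           # indices into the full grid
chi2 = np.sum(((obs_models[cand] - x) / sigma) ** 2, axis=1)
w = np.exp(-0.5 * (chi2 - chi2.min()))
w /= w.sum()                                          # posterior over candidates only
kept = cand[truncate_posterior(w, keep_mass=0.99)]    # grid indices that survive truncation
```

Preselecting candidates reduces the number of likelihood evaluations per object, and truncation reduces what has to be stored per object; both trade a small, controlled loss of posterior mass for speed and memory.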

Priors and Likelihoods

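The grid module is fully Bayesian; you choose the prior and likelihood from besta.grid.prob, for example the FlatPrior and GaussianProductLikelihood classes used in the single-object example.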

This gives a similar statistical structure to sampling methods, but evaluated directly on a finite model set instead of drawing chains.
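Written out, the posterior over a finite model set is a normalised product of prior and likelihood. The notation below is introduced here for illustration only: m_i is the i-th of N grid models, \pi_i its prior weight, \mu_{ik} its predicted value for observable k, and x_k, \sigma_k the observed value and its error. The Gaussian form assumes that GaussianProductLikelihood is a product of independent Gaussians, which its name suggests but the text does not state.

```latex
p(m_i \mid x) = \frac{\pi_i \, \mathcal{L}_i}{\sum_{j=1}^{N} \pi_j \, \mathcal{L}_j},
\qquad
\mathcal{L}_i \propto \exp\!\left[ -\frac{1}{2} \sum_{k=1}^{P}
  \frac{(x_k - \mu_{ik})^2}{\sigma_k^2} \right]
```

With a flat prior, \pi_i is constant and cancels, so the posterior weights depend on the likelihood alone.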

Input/Output and Reproducibility

ModelGrid supports multiple formats:

Practical Tips

  • Start with FlatPrior + GaussianProductLikelihood as a baseline.

  • If the grid is large, use a binner to cut candidate counts before posterior evaluation.

  • Use return_mode="iter" for low-memory streaming over large catalogs.

  • Write batch outputs to HDF5 for reproducible downstream post-processing.