API Reference

Package

class besta.MainPipeline(pipeline_configuration_list, n_cores_list=None, ini_files=None, ini_values_files=None)[source]

BESTA Pipeline manager.

pipelines_config

List of dictionaries containing the configuration parameters for each: subpipeline.

Type:: list

n_cores_list

List containing the number of cores to be used on each run. If None, every subpipeline will use one single core during runtime.

Type:: list, optional, default=None

ini_files

List of .ini filenames

Type:: list

ini_values_files

List of files containing the priors associated to ini_files.

Type:: list

run_command(command)[source]: Execute a shell command and return its process exit code.

execute_pipeline(config, n_cores, ini_filename=None, ini_values_filename=None)[source]

Execute a sub-pipeline.

Parameters:

config (dict) – Dictionary containing the configuration parameters for setting up the subpipeline.
n_cores (int) – Number of cores to use during runtime.
ini_filename (str, optional, default=None) – If provided, this file is used to run cosmosis.
ini_values_filename (str, optional, default=None) – If provided, use this file to set the prior values.

execute_all(plot_result=False)[source]: Execute all sub-pipelines.

class besta.Reader(ini_file=None, results_file=None)[source]

CosmoSIS run results reader.

This class is meant to improve the accessibility to the results from a CosmoSIS run.

The Reader can be initialised either from a .ini conguration file or from the file containing the results of the run (which implicitly includes the configuration information in the header).

Currently, there are two factory methods to initialise the Reader.

>>> from besta.io import Reader
>>> reader = Reader.from_ini_file(path_to_ini_config_file)
>>> reader = Reader.from_results_file(path_to_results_file)

From here, users may want to load the results of the run into a table:

>>> reader.load_results()
>>> reader.results_table.info

The Reader also provides tools for extracting some solutions among the full run. For instance, users may be interested on using the solution that maximizes the likelihood (i.e., the best-fit solution):

>>> best_fit = reader.get_maxlike_solution(log_prob="post", as_datablock=True)

The first argument tells the function to use the row with the highest value of "post" (the column naming convention depends on the CosmoSIS sampler used).

Alternatively, users might be interested in selecting a set of solutions based on their posterior probability or likelihood. For that purpose, the Reader includes two methods that allow to pull a subset of the solutions:

Users can retrieve a fraction of the solutions with the highest posterior by calling:
```
>>> reader.get_top_frac_solutions(frac=1)
```

which will return the first top percent of all the solutions.

Users can retrieve a fraction of the solutions that accounts for a given fraction of the cumulative posterior.
```
>>> reader.get_pct_solutions(pct=99)
```

which will return the solutions that account for 99% of the total posterior.

Besides selecting solutions, the Reader also provides access to the modules used during the sampling, allowing to recompute the observable quantities or to evaluate the fit:

>>> module = reader.last_module
>>> flux_model = module.make_observable(best_fit, parse=True)

property values_file: str: Path to the CosmoSIS (prior) values configuration file.

property modules: list: List of modules used in the pipeline as specified in the ini file.

property module_names: list: List of module names used in the pipeline.

get_module(module_name)[source]

Get a pipeline module by name.

Parameters:: module_name (str) – Name of the module to retrieve.
Returns:: module (instance) – An instance of the requested pipeline module.

property last_module: An instance of the last pipeline module used in the run.

property results_table: Table: Table containing the CosmoSIS run results.

property ini_file: str: Path to the CosmoSIS configuration file.

property results_file: str: Path to the file containing the CosmoSIS run results.

property ini: dict: Cosmosis .ini configuration file

property ini_values: dict: Cosmosis configuration values.

property config: dict: Pipeline module configuration.

load_results()[source]: Load the cosmosis run results associated to the ini file.

get_maxlike_solution(log_prob='post', as_datablock=False, **kwargs)[source]

Get the maximum likelihood solution.

Obtain the maximum likelihood solution from the results_table

Parameters:

log_prob (str, optional) – Column name to use for computing the maximum likelihood. Default is post.
as_datablock (bool, optional) – If True, the solution is converted to a DataBlock.
**kwargs – Additional arguments for converting the solution into a DataBlock.

Returns:

solution (dict) – Dictionary containing the solution with the maximum likelihood.

See also

solution_to_datablock()

get_top_frac_solutions(frac=1, log_prob='post', as_datablock=False, **kwargs)[source]

Return a fraction of the solutions sorted by their posterior probability.

This method provides a given fraction of the solutions with the highest posterior probability.

Parameters:

frac (float, optional) – Percentage of the solutions to return, e.g. frac=10 will return the top 10 per cent with the highest probability.
log_prob (str, optional) – Column name to use for computing the maximum likelihood. Default is post.
as_datablock (bool, optional) – If True, the solution is converted to a DataBlock.
**kwargs – Additional arguments for converting the solution into a DataBlock.

Returns:

all_solutions (list) – A list containing the solutions.

See also

solution_to_datablock()

get_pct_solutions(pct=99, log_prob='post', as_datablock=False, **kwargs)[source]

Return the top percentile solutions.

This method sorts all solutions based on their posterior probablity and returns a given fraction.

Parameters:

pct (float, optional) – Fraction of the solutions to return, e.g. pct=99 will return the top 1 per cent with the highest probability.
log_prob (str, optional) – Column name to use for computing the maximum likelihood. Default is post.
as_datablock (bool, optional) – If True, the solution is converted to a DataBlock.
**kwargs – Additional arguments for converting the solution into a DataBlock.

Returns:

all_solutions (list) – A list containing the solutions.

See also

solution_to_datablock()

solution_to_dict(sample, *, add_fixed=True, extra_params=None)[source]: TODO

solution_to_datablock(solution: dict, add_fixed=True, extra_params=None)[source]

Convert a solution into a DataBlock.

Parameters:

solution (dict-like) – A dictionary-like containing the parameter values.
add_fixed (bool, optional) – If True, add the default values of the fixed parameters from the ini values config file. Default is True.
extra_params (iterable, optional) – An iterable containing pairs (section, name) of additional parameters contained in solution to be included in the datablock.

Returns:

datablock (DataBlock) – The DataBlock containing the input solution.

classmethod read_ini_file(path)[source]

Read the cosmosis configuration .ini file.

Parameters:: path (str) – Path to the file.
Returns:: ini (dict) – Dictionary containing the information from the ini file.

classmethod read_ini_file_from_results(path)[source]

Read the embedded CosmoSIS ini block from a results file.

Parameters:: path (str) – Path to the results file containing the ini block.
Returns:: dict – Parsed ini configuration stored in the results header.

classmethod from_ini_file(path_to_ini)[source]: Create a reader from a CosmoSIS ini file.

classmethod from_results_file(path_to_results)[source]: Create a reader from a CosmoSIS results file.

CosmoSIS-based Inference

Core modules

Global BESTA configuration loaded from YAML.

The configuration file path is resolved from the besta_config environment variable, defaulting to besta-config.yml shipped with the package.

This module exposes configured objects/dictionaries at import time: - cosmology - kinematics - extinction - memory

Central logging configuration for BESTA.

besta.logging.setup_logging(level: str = 'INFO', log_file: str | None = None, console: bool = True, overwrite: bool = False)[source]

Configure the root BESTA logger.

Parameters:

level (str) – Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL).
log_file (str, optional) – If provided, logs are written to this file.
overwrite (bool) – If True, overwrite log_file instead of appending.

besta.logging.get_logger(module_name: str) → Logger[source]: Return a child logger under the besta namespace.

Module dedicated to read and manipulate the output products of BESTA.

besta.io._split_top_level(s: str, *, split_on_whitespace=True) → list[str][source]: Split a string into top-level tokens, respecting quotes and nested parentheses/brackets.

besta.io._split_tokens(s: str) → list[str][source]: Split a string into tokens, respecting quotes and nested parentheses/brackets.

besta.io._is_wrapped_group(token: str) → bool[source]: Check if a token is a wrapped group like (1, 2, 3) or [1, 2, 3].

besta.io._sequence_to_numpy(values, *, force_sequence=False)[source]: Coerce a list of values into a numpy array if all elements are numeric, otherwise keep as list.

besta.io.string_to_func_args(text: str)[source]

Parse a string of the form ‘arg1, arg2, key=value’ into separate args and kwargs.

Parameters:

text (str) – Input string containing positional and keyword arguments.

Returns:

args (list) – List of positional arguments.
kwargs (dict) – Dictionary of keyword arguments.

besta.io.make_ini_file(filename, config, ignore_sec='values')[source]

Create a .ini file from an input configuration.

Parameters:

filename (str) – Output file name.
config (dict) – Dictionary containing the configuration parameters.

besta.io.make_values_file(config, overwrite=True, values_sec='values')[source]

Make a values.ini file from the configuration.

Parameters:: config (dict) – Configuration parameters

besta.io.read_results_file(path)[source]

Read the results produced during a CosmoSIS run.

Parameters:: path (str) – Path to the file containing the cosmosis results
Returns:: table (astropy.table.Table) – Table containing the results.

besta.io.load_class_from_path(file_path, class_name)[source]: Load a class object from a Python file path at runtime.

class besta.io.Reader(ini_file=None, results_file=None)[source]

Bases: object

CosmoSIS run results reader.

This class is meant to improve the accessibility to the results from a CosmoSIS run.

The Reader can be initialised either from a .ini conguration file or from the file containing the results of the run (which implicitly includes the configuration information in the header).

Currently, there are two factory methods to initialise the Reader.

>>> from besta.io import Reader
>>> reader = Reader.from_ini_file(path_to_ini_config_file)
>>> reader = Reader.from_results_file(path_to_results_file)

From here, users may want to load the results of the run into a table:

>>> reader.load_results()
>>> reader.results_table.info

The Reader also provides tools for extracting some solutions among the full run. For instance, users may be interested on using the solution that maximizes the likelihood (i.e., the best-fit solution):

>>> best_fit = reader.get_maxlike_solution(log_prob="post", as_datablock=True)

The first argument tells the function to use the row with the highest value of "post" (the column naming convention depends on the CosmoSIS sampler used).

Alternatively, users might be interested in selecting a set of solutions based on their posterior probability or likelihood. For that purpose, the Reader includes two methods that allow to pull a subset of the solutions:

Users can retrieve a fraction of the solutions with the highest posterior by calling:
```
>>> reader.get_top_frac_solutions(frac=1)
```

which will return the first top percent of all the solutions.

Users can retrieve a fraction of the solutions that accounts for a given fraction of the cumulative posterior.
```
>>> reader.get_pct_solutions(pct=99)
```

which will return the solutions that account for 99% of the total posterior.

Besides selecting solutions, the Reader also provides access to the modules used during the sampling, allowing to recompute the observable quantities or to evaluate the fit:

>>> module = reader.last_module
>>> flux_model = module.make_observable(best_fit, parse=True)

property values_file: str: Path to the CosmoSIS (prior) values configuration file.

property modules: list: List of modules used in the pipeline as specified in the ini file.

property module_names: list: List of module names used in the pipeline.

get_module(module_name)[source]

Get a pipeline module by name.

Parameters:: module_name (str) – Name of the module to retrieve.
Returns:: module (instance) – An instance of the requested pipeline module.

property last_module: An instance of the last pipeline module used in the run.

property results_table: Table: Table containing the CosmoSIS run results.

property ini_file: str: Path to the CosmoSIS configuration file.

property results_file: str: Path to the file containing the CosmoSIS run results.

property ini: dict: Cosmosis .ini configuration file

property ini_values: dict: Cosmosis configuration values.

property config: dict: Pipeline module configuration.

load_results()[source]: Load the cosmosis run results associated to the ini file.

get_maxlike_solution(log_prob='post', as_datablock=False, **kwargs)[source]

Get the maximum likelihood solution.

Obtain the maximum likelihood solution from the results_table

Parameters:

log_prob (str, optional) – Column name to use for computing the maximum likelihood. Default is post.
as_datablock (bool, optional) – If True, the solution is converted to a DataBlock.
**kwargs – Additional arguments for converting the solution into a DataBlock.

Returns:

solution (dict) – Dictionary containing the solution with the maximum likelihood.

See also

solution_to_datablock()

get_top_frac_solutions(frac=1, log_prob='post', as_datablock=False, **kwargs)[source]

Return a fraction of the solutions sorted by their posterior probability.

This method provides a given fraction of the solutions with the highest posterior probability.

Parameters:

frac (float, optional) – Percentage of the solutions to return, e.g. frac=10 will return the top 10 per cent with the highest probability.
log_prob (str, optional) – Column name to use for computing the maximum likelihood. Default is post.
as_datablock (bool, optional) – If True, the solution is converted to a DataBlock.
**kwargs – Additional arguments for converting the solution into a DataBlock.

Returns:

all_solutions (list) – A list containing the solutions.

See also

solution_to_datablock()

get_pct_solutions(pct=99, log_prob='post', as_datablock=False, **kwargs)[source]

Return the top percentile solutions.

This method sorts all solutions based on their posterior probablity and returns a given fraction.

Parameters:

pct (float, optional) – Fraction of the solutions to return, e.g. pct=99 will return the top 1 per cent with the highest probability.
log_prob (str, optional) – Column name to use for computing the maximum likelihood. Default is post.
as_datablock (bool, optional) – If True, the solution is converted to a DataBlock.
**kwargs – Additional arguments for converting the solution into a DataBlock.

Returns:

all_solutions (list) – A list containing the solutions.

See also

solution_to_datablock()

solution_to_dict(sample, *, add_fixed=True, extra_params=None)[source]: TODO

solution_to_datablock(solution: dict, add_fixed=True, extra_params=None)[source]

Convert a solution into a DataBlock.

Parameters:

solution (dict-like) – A dictionary-like containing the parameter values.
add_fixed (bool, optional) – If True, add the default values of the fixed parameters from the ini values config file. Default is True.
extra_params (iterable, optional) – An iterable containing pairs (section, name) of additional parameters contained in solution to be included in the datablock.

Returns:

datablock (DataBlock) – The DataBlock containing the input solution.

classmethod read_ini_file(path)[source]

Read the cosmosis configuration .ini file.

Parameters:: path (str) – Path to the file.
Returns:: ini (dict) – Dictionary containing the information from the ini file.

classmethod read_ini_file_from_results(path)[source]

Read the embedded CosmoSIS ini block from a results file.

Parameters:: path (str) – Path to the results file containing the ini block.
Returns:: dict – Parsed ini configuration stored in the results header.

classmethod from_ini_file(path_to_ini)[source]: Create a reader from a CosmoSIS ini file.

classmethod from_results_file(path_to_results)[source]: Create a reader from a CosmoSIS results file.

This module contains the pipeline manager to concatenate multiple modules.

class besta.pipeline.MainPipeline(pipeline_configuration_list, n_cores_list=None, ini_files=None, ini_values_files=None)[source]

Bases: object

BESTA Pipeline manager.

pipelines_config

List of dictionaries containing the configuration parameters for each: subpipeline.

Type:: list

n_cores_list

List containing the number of cores to be used on each run. If None, every subpipeline will use one single core during runtime.

Type:: list, optional, default=None

ini_files

List of .ini filenames

Type:: list

ini_values_files

List of files containing the priors associated to ini_files.

Type:: list

run_command(command)[source]: Execute a shell command and return its process exit code.

execute_pipeline(config, n_cores, ini_filename=None, ini_values_filename=None)[source]

Execute a sub-pipeline.

Parameters:

config (dict) – Dictionary containing the configuration parameters for setting up the subpipeline.
n_cores (int) – Number of cores to use during runtime.
ini_filename (str, optional, default=None) – If provided, this file is used to run cosmosis.
ini_values_filename (str, optional, default=None) – If provided, use this file to set the prior values.

execute_all(plot_result=False)[source]: Execute all sub-pipelines.

Pipeline Modules

Pipeline Modules API

Kinematics

This module contains the tools for modelling kinematic effects on spectra.

class besta.kinematics.GaussHermite(order, *args, **kwargs)[source]

Bases: Fittable1DModel

Gauss-Hermite model.

property param_names: Tuple of Gaussian and Hermite coefficient parameter names.

evaluate(x, *params)[source]

Evaluate the Gauss-Hermite profile.

Parameters:

x (array_like) – Coordinate values where the profile is evaluated.
*params – Gaussian parameters followed by Hermite coefficients.

Returns:

array_like – Profile values at x.

besta.kinematics.losvd(vel_pixel, sigma_pixel, h3=0, h4=0)[source]: Evaluate a Gauss-Hermite line-of-sight velocity distribution kernel.

besta.kinematics.get_losvd_kernel(kernel_model, x_size)[source]

Create a Model1DKernel from an input Model.

Parameters:

kernel_model (astropy.models.FittableModel) – Model used to build the kernel.
x_size (int) – Kernel size

Returns:

kernel (Model1DKernel) – Kernel model

besta.kinematics.convolve_spectra_with_kernel(spectra, kernel, use_fft=True)[source]

Convolve an input spectra with a given kernel.

Parameters:

spectra (np.ndarray) – Target spectra to convolve with the kernel.
kernel (Model1DKernel) – Convolution kernel.

Returns:

convolved_spectra (np.ndarray) – Spectra convolved with the input kernel.

besta.kinematics.convolve_ssp_with_lsf(ssp, lsf_sigma_pixels)[source]

Convolve a given SSP model with an LSF.

Parameters:

ssp (pst.SSP.SSPBase)
lsf_sigma_pixels (float, np.ndarray) – Gaussian standard deviation of the LSF.

besta.kinematics.convolve_ssp(module_config, los_sigma, los_vel, los_h3=0.0, los_h4=0.0)[source]: Convolve SSP spectra stored in a module configuration with LOS kinematics.

besta.kinematics.convolve_ssp_model(module_config, los_sigma, los_vel, h3=0.0, h4=0.0)[source]: Convolve an SSP model instance in place with LOS kinematics.

besta.kinematics.normal_cdf(x, mu=0.0, sigma=1.0)[source]: Normal cumulative density function.

besta.kinematics.convolve_variable_gaussian_kernel(spectra, sigma_pixel, kappa_sigma_thresh=3.0, kappa_trunc=5.0)[source]

Convolve an input spectra with a Gaussian kernel of varying width, using a sparse banded convolution matrix.

Parameters:

spectra (np.ndarray or astropy.units.Quantity) – N-dimensional spectra. The last dimension must correspond to the wavelength axis.
sigma_pixel (np.ndarray) – Standard deviation of the Gaussian LSF for each spectral resolution element, expressed in pixel units (length = n_pix).
kappa_sigma_thresh (float, optional) – Threshold in units of the Gaussian sigma. If a single pixel contains at least kappa_sigma_thresh sigmas of the LSF, that pixel is left untouched (i.e. the convolution kernel is clamped to a delta function at that pixel). Default is 3.0.
kappa_trunc (float, optional) – Truncation radius of the Gaussian in units of sigma. Only pixels within ±kappa_trunc * sigma of the central pixel contribute to the kernel. Default is 5.0.

Returns:

convolved_spectra (np.ndarray or astropy.units.Quantity) – A convolved version of spectra.

This module contains classes and functions related to dealing with spectra

besta.spectrum.get_legendre_polynomial_array(wavelength, order, bounds=None, scale=None, clip_first_zero=True)[source]

Compute an array of Legendre polynomials evaluated at normalized wavelengths.

Parameters:

wavelength (numpy.ndarray) – Array of wavelength values.
order (int) – The maximum order of the Legendre polynomial to compute.
bounds (tuple, optional) – A tuple specifying the minimum and maximum bounds for normalization (bounds[0], bounds[1]). If None, the normalization is based on the minimum and maximum of the wavelength array.
scale (float, optional) – A maximum scale to probe by the polynomials. If provided, the set of polynomial will comprise the range that is sensitive to scales smaller the input value (i.e. lower order polynomials are not included).
clip_first_zero (bool, optional) – If True, the values of each polynomial below the first and las zero of the Legendre polynomial are set to 0. This prevents the edges to reach extremelly large values when the order of the polynomial is high.

Returns:

numpy.ndarray – A 2D array where each row corresponds to the values of a Legendre polynomial of a given degree, evaluated at the normalized wavelengths. The shape of the array is (order + 1, len(wavelength)).

besta.spectrum.legendre_decorator(make_observable_mthd)[source]: Include multiplicative Legendre polynomials during a fit.

class besta.spectrum.TelluricBand(name: str, wmin: float, wmax: float)[source]

Bases: object

Named wavelength interval affected by telluric absorption.

besta.spectrum.mask_telluric_regions(wavelength: ndarray, *, weight: ndarray | None = None, bands: Sequence[Tuple[float, float]] | None = None, pad: float = 0.0, redshift: float = 0.0, return_mask: bool = False) → ndarray | tuple[ndarray, ndarray, list[TelluricBand]][source]

Mask regions affected by telluric absorption by setting weights to 0.

Parameters:

wavelength (ndarray) – 1D array of wavelength values in Angstrom.
bands (sequence of (wmin, wmax), optional) – Telluric band edges in the same units as wavelength. If None, uses a default set.
pad (float, optional) – Extra padding (same units as wavelength) added to both sides of every band: [wmin - pad, wmax + pad].
return_mask (bool, optional) – If True, also return the boolean mask of where weights were set to 0.

Returns:

new_weight (ndarray) – Copy of input weight with telluric regions set to 0.
mask (ndarray of bool, optional) – If return_mask is True, also return the boolean mask of where weights were set to 0.

class besta.spectrum.EmissionLine(name: str, rest_wavelength: float, default_half_width: float = 10.0, flux: float | None = None, flux_error: float | None = None, rest_wavelength_error: float | None = None, flag: int = 0, metadata: dict = <factory>)[source]

Bases: object

Emission line definition.

class besta.spectrum.EmissionLineList(lines: Iterable[EmissionLine])[source]

Bases: object

Collection of emission lines with helper methods.

property rest_wavelengths: ndarray: Array of rest wavelengths of the lines.

property names: list[str]: List of line names.

property default_half_widths: ndarray: Array of default half-widths of the lines.

property fluxes: ndarray: Array of measured line fluxes.

property flux_errors: ndarray: Array of measured line-flux uncertainties.

property rest_wavelength_errors: ndarray: Array of line-center uncertainties.

property flags: ndarray: Array of line quality flags.

__iter__()[source]: Allow iteration over the lines in the list.

get_observed_wavelengths(redshift: float) → ndarray[source]

Compute observed wavelengths of the lines given a redshift.

Parameters:: redshift (float) – Redshift to apply to the rest wavelengths.
Returns:: np.ndarray – Array of observed wavelengths corresponding to the input redshift.

crossmatch_line_list(line_list: EmissionLineList)[source]: Cross-match this line list with another line list and return the matched lines.

with_names(names: Sequence[str]) → EmissionLineList[source]: Return a copy of the line list with updated names.

get_closest_line(wavelength: float, *, redshift: float = 0.0) → EmissionLine | None[source]: Return the line with observed wavelength closest to the input value.

get_line_by_observed_wavelength(wavelength: float, redshift: float, tol: float = 5.0) → EmissionLine | None[source]

Match an input line to the list of emission lines.

Parameters:

wavelength (float) – Observed wavelength to match.
redshift (float) – Redshift to apply to the rest wavelengths.
tol (float, optional) – Tolerance for matching, by default 5.0.

Returns:

Optional[EmissionLine] – The matched emission line, or None if no match is found.

to_mask(wavelength, redshift=0.0) → ndarray[source]

Create a boolean mask for the input wavelength array where the lines are located.

Parameters:

wavelength (np.ndarray) – Array of wavelength values to create the mask for.
redshift (float, optional) – Redshift to apply to the rest wavelengths, by default 0.0.

Returns:

np.ndarray – Boolean array of the same shape as wavelength where True indicates the presence of a line.

to_table(filename: str = None, **table_kwargs) → Table[source]

Save emission lines to a file.

Parameters:: filename (str, optional) – If provided, the table is saved to this file.
Returns:: astropy.table.Table – Table containing the emission line information.

classmethod from_table(table: Table) → EmissionLineList[source]

Create an EmissionLineList from an astropy Table.

Parameters:: table (astropy.table.Table) – Table containing the emission line information with columns: name, rest_wavelength, default_half_width.
Returns:: EmissionLineList – An instance of EmissionLineList created from the input table.

classmethod from_file(filename: str, **table_kwargs) → EmissionLineList[source]

Load emission lines from a file with columns.

Parameters:

filename (str) – Path to the file containing the emission line data.
**table_kwargs – Additional keyword arguments to pass to astropy.table.Table.read.

Returns:

EmissionLineList – An instance of EmissionLineList containing the loaded emission lines.

besta.spectrum.get_default_emission_lines()[source]

Return the default optical emission-line list.

Returns:: EmissionLineList – Default list of common rest-frame galaxy emission lines with nominal masking half-widths in Angstrom.

besta.spectrum.get_default_sky_emission_lines()[source]

Return the default sky emission-line list.

Returns:: EmissionLineList – Default list of common sky emission lines with nominal masking half-widths in Angstrom.

besta.spectrum._parse_lines_param_decorator(func)[source]: Decorator to parse the lines parameter.

besta.spectrum.mask_strong_emission_lines(wavelength: ndarray, flux: ndarray, uncertainty: ndarray, weight: ndarray, *, redshift: float = 0.0, lines: Sequence[EmissionLine] | None = None, snr_threshold: float = 5.0, min_continuum_snr: float = 1.0, cont_half_window: int = 75, cont_sigma_clip: float = 4.0, half_width: float | None = None, width_mode: str = 'fixed', fwhm_pixels: float = 3.0, max_half_width: float | None = None, pad: float = 0.0, return_mask: bool = False, return_lines_masked: bool = False) → ndarray | tuple[ndarray, ndarray] | tuple[ndarray, ndarray, list[EmissionLine]][source]

Identify strong emission lines and mask them by setting weight=0.

This is a robust, low-assumption approach:

For each expected line center (rest -> observed using z), estimate a local continuum with a running median in a window around the line.
Compute line “excess” = flux - continuum and its S/N using uncertainty.
If peak S/N within the line window exceeds snr_threshold (and the continuum is not completely noise-dominated), mask the line region.

Parameters:

wavelength, flux, uncertainty, weight (ndarray) – 1D arrays of the same length.
z (float, optional) – Redshift used to place the line centers: lambda_obs = lambda_rest*(1+z).
lines (sequence of EmissionLine, optional) – Line list. If None, uses DEFAULT_EMISSION_LINES_A (Angstrom).
snr_threshold (float, optional) – Minimum peak (flux-continuum)/sigma within the line window to mask.
min_continuum_snr (float, optional) – Require median(abs(continuum)/sigma) in the continuum window to be at least this value; helps avoid false positives when everything is noise.
cont_half_window (int, optional) – Half-window size in pixels for local continuum estimation around each line.
cont_sigma_clip (float, optional) – Sigma-clipping level applied to the continuum residuals when estimating the median continuum (simple, robust clip).
half_width (float, optional) – Half-width of the masked region in wavelength units. If None, uses each line’s default_half_width.
width_mode ({“fixed”,”fwhm_pixels”}, optional) –
- “fixed”: uses half_width (or per-line default) in wavelength units.
- “fwhm_pixels”: convert fwhm_pixels to wavelength half-width using the local dispersion (median delta-lambda near the line) and mask +-k*FWHM, where k=1.5 (roughly covers wings).
fwhm_pixels (float, optional) – Only used if width_mode=”fwhm_pixels”. Instrumental/profile FWHM in pixels.
max_half_width (float, optional) – Cap the computed half-width (in wavelength units) to avoid huge masks.
pad (float, optional) – Extra padding added to half-width (same wavelength units).
return_mask (bool, optional) – If True, also return boolean mask of where weights were set to 0.
return_lines_masked (bool, optional) – If True, also return list of line names that were actually masked.

Returns:

new_weight (ndarray) – Copy of weight with emission-line regions set to 0.
mask (ndarray of bool, optional) – True where masked.
lines_masked (list of str, optional) – Names of lines that triggered masking.

Notes

This targets strong, narrow-ish features near known lines. It will not detect arbitrary lines at unknown wavelengths unless you expand the line list.
If you already have a model continuum, you can replace the continuum estimation with that for even more robustness.

besta.spectrum.mask_sky_emission_lines(*args, **kwargs) → ndarray | tuple[ndarray, ndarray, list[EmissionLine]][source]

Specialized wrapper around mask_strong_emission_lines to target sky lines.

This is a robust, low-assumption approach:

For each expected line center (rest to observed using redshift), estimate a local continuum with a running median in a window around the line.
Compute line excess as flux - continuum and its S/N using uncertainty.
If peak S/N within the line window exceeds snr_threshold, mask the line region.

Parameters:

See :func:`mask_strong_emission_lines` for details. The only difference is
that if ``lines`` is None, a default set of common sky emission lines is
used.

Notes

This targets strong, narrow-ish features near known lines. It will not detect arbitrary lines at unknown wavelengths unless you expand the line list.
The default line list is focused on common sky lines in the optical/NIR, but you can provide your own list for other regimes.

besta.spectrum.estimate_continuum(wl, flux, err, weights=None, use_log=True, knot_spacing=100.0, sigma_clip=3.0)[source]

Estimate continuum using a robust spline fit to the log-flux.

Parameters:

wl (ndarray) – Wavelength array (1D).
flux (ndarray) – Flux array (1D, same length as wl).
err (ndarray) – Uncertainty array (1D, same length as wl).
weights (ndarray, optional) – Weights for each pixel (1D, same length as wl). If None, all pixels are equally weighted.
knot_spacing (float, optional) – Spacing between spline knots in the same units as wl (e.g., Angstrom).
sigma_clip (float, optional) – Sigma-clipping threshold for outlier rejection in the log-flux residuals.
use_log (bool, optional) – Whether to fit the log of the flux (True) or the flux itself (False). Fitting log-flux is more robust to outliers and multiplicative features, but may be less accurate if there are many zero/negative flux values.

Returns:

continuum (ndarray) – Estimated continuum flux at each wavelength.
continuum_err (ndarray) – Estimated uncertainty of the continuum at each wavelength.

class besta.spectrum.LineSegmentationMap(line_segmentation: ndarray, flux: ndarray, wavelength: ndarray, error: ndarray, weights: ndarray, continuum: ndarray, continuum_error: ndarray)[source]

Bases: object

Container for detected emission-line segments and fitted line products.

Parameters:

line_segmentation (ndarray) – Integer segmentation map where 0 marks non-line pixels and positive values identify detected line regions.
flux (ndarray) – Observed flux array.
wavelength (ndarray) – Wavelength array corresponding to flux.
error (ndarray) – Flux uncertainty array.
weights (ndarray) – Pixel weights used when fitting line profiles.
continuum (ndarray) – Estimated continuum flux array.
continuum_error (ndarray) – Uncertainty of the estimated continuum.

flux_cont_sub

Continuum-subtracted flux.

Type:: ndarray

flux_cont_sub_err

Uncertainty of the continuum-subtracted flux.

Type:: ndarray

nlines

Number of labeled line regions in line_segmentation.

Type:: int

lines

Fitted line measurements after calling fit_all_lines().

Type:: EmissionLineList or None

get_line_mask(line_id: int) → ndarray[source]: Return a boolean mask for the specified line_id.

fit_line(line_id: int) → dict[source]: Fit a Gaussian to the specified line_id and return fit parameters.

fit_all_lines() → Table[source]: Fit all lines in the segmentation map and return a table of results.

besta.spectrum._watershed_1d(signal: ndarray, markers: ndarray, mask: ndarray) → ndarray[source]

1D watershed segmentation for deblending overlapping emission lines.

Starting from labeled seed regions (markers), flood outward following descending signal gradient until regions meet or the mask boundary is reached.

Parameters:

signal (ndarray) – 1D array used to guide the flooding (typically SNR). Higher values are flooded first.
markers (ndarray) – Integer label array with seed pixels (>0) already identified. 0 means unlabeled; will be filled by watershed.
mask (ndarray of bool) – Only pixels where mask is True are eligible for flooding.

Returns:

labels (ndarray of int) – Label array after watershed. Pixels outside the mask remain 0.

besta.spectrum.find_emission_lines(wl, flux, err, weights=None, continuum=None, continuum_error=None, lines=None, redshift=0.0, to_rest_frame=False, snr_threshold=3.0, min_continuum_snr=1.0, cont_sigma_clip=1.5, knot_spacing=100.0, min_npixels: int = 3, pad_detection_pix: int = 10, deblend_lines: bool = True)[source]

Find and fit emission lines in a spectrum.

Parameters:

wl (ndarray) – Wavelength array (1D).
flux (ndarray) – Flux array (1D, same length as wl).
err (ndarray) – Uncertainty array (1D, same length as wl).
weights (ndarray, optional) – Weights for each pixel (1D, same length as wl). If None, all pixels are equally weighted.
continuum (ndarray, optional) – Pre-computed continuum flux at each wavelength. If None, it will be estimated.
continuum_error (ndarray, optional) – Uncertainty of the continuum at each wavelength. If None, it will be estimated.
lines (sequence of EmissionLine, optional) – Line list. If None, uses DEFAULT_EMISSION_LINES_A (Angstrom).
snr_threshold (float, optional) – Minimum peak (flux-continuum)/sigma to consider a line detected.
min_continuum_snr (float, optional) – Require median(abs(continuum)/sigma) in the continuum window to be at least this value to consider a line detection valid; helps avoid false positives when everything is noise.
cont_sigma_clip (float, optional) – Sigma-clipping threshold for outlier rejection in the continuum estimation.
knot_spacing (float, optional) – Spacing between spline knots for continuum estimation in the same units as wl (e.g., Angstrom).
min_npixels (int, optional) – Minimum number of contiguous pixels above the SNR threshold to consider a valid line detection.
pad_detection_pix (int, optional) – Number of pixels to pad on either side of detected line regions to ensure the full line is captured, especially for broad lines.
deblend_lines (bool, optional) – If True, apply 1D watershed deblending to separate overlapping line detections. Each local SNR maximum seeds its own region and regions are grown by flooding in descending SNR order until they meet. Default is True.

Returns:

output_table (astropy.table.Table) – Table with columns: line_flux, line_flux_err, center, sigma, npixels, flag.
continuum (ndarray) – Estimated continuum flux at each wavelength.
continuum_error (ndarray) – Estimated uncertainty of the continuum at each wavelength.

Star formation histories

Star formation history fitting module

besta.sfh._softmax(x)[source]: Numerically stable softmax.

besta.sfh._sigmoid(x)[source]: Simple sigmoid to clamp values into (0, 1).

besta.sfh._logit(x)[source]: Inverse of sigmoid on (0,1).

besta.sfh.validate_monotonic(array, *, strict=True, name='array')[source]: Validate that the input array is monotonic increasing.

class besta.sfh.SFHBase(*args, **kwargs)[source]

Bases: ABC

Star formation history model.

Description

This class serves as an interface between the sampler and PST SFH models.

free_params

Dictionary containing the free parameters of the model and their respective range of validity.

Type:: dict

redshift

Cosmological redshift at the time of the observation.

Type:: float, optional, default=0.0

today

Age of the universe at the time of observation. If not provided, it is computed using the default cosmology and the value of redshift.

Type:: astropy.units.Quantity

to_physical(latent)[source]: Map latent parameters to physical space (default: identity).

to_latent(physical)[source]: Map physical parameters back to latent space (default: identity).

make_ini(ini_file)[source]

Create a cosmosis .ini file.

Parameters:: ini_file (str) – Path to the output .ini file.

parse_free_params(free_params: dict)[source]

Parse the SFH model free parameters from a dictionary.

Parameters:: free_params (dict) – Dictionary containing the SFH model free parameters.

abstractmethod parse_datablock(*args)[source]: Parse the SFH model free parameters from a DataBlock.

class besta.sfh.ZPowerLawMixin[source]

Bases: object

Metallicity evolution as a power law in terms of the stellar mass formed.

class besta.sfh.PieceWiseSFHMixin[source]

Bases: object

Piece-wise star formation history model mixin.

This mixing provides the common properties of piece-wise SFH models.

property sfh_bin_keys: Keys associated to the bins of the SFH model.

get_sfh_parameters_array(datablock: DataBlock, dtype=<class 'float'>)[source]

Get an array containing the values of each bin of the SFH.

Parameters:: datablock (DataBlock) – The datablock containing the values of each parameter of the SFH.

class besta.sfh.FixedTimeSFH(lookback_time_bins, *args, **kwargs)[source]

Bases: ZPowerLawMixin, SFHBase, PieceWiseSFHMixin

A SFH model with fixed time bins.

Description

The SFH of a galaxy is modelled as a stepwise function where the free parameters correspond to the mass fraction formed on each bin.

Upon initializaiton, the free parameters are set between -8 to 0 in terms of log(M/Msun). The starting point corresponds to the mass fraction formed assuming a constant star formation history.

lookback_time

Lookback time bin edges.

Type:: astropy.units.Quantity

parse_datablock(datablock: DataBlock)[source]: Update the fixed-time SFH model from a CosmoSIS DataBlock.

to_latent(physical)[source]: Inverse of the softmax branch (up to an additive constant). We center the log-fractions to have zero mean for determinism.

to_physical(latent)[source]: Softmax-transform latent masses into fractions that sum to 1.

class besta.sfh.FixedTime_sSFR_SFH(lookback_time, *args, **kwargs)[source]

Bases: ZPowerLawMixin, SFHBase, PieceWiseSFHMixin

A SFH model with fixed time bins.

Description

The SFH of a galaxy is modelled as a stepwise function where the free parameters correspond to the average sSFR over the last lookback_time.

lookback_time

Lookback time bin edges.

Type:: astropy.units.Quantity

parse_datablock(datablock: DataBlock)[source]: Update the fixed-time sSFR model from a CosmoSIS DataBlock.

to_physical(latent)[source]

Return log10 sSFR over each timescale.

If transforms are enabled, map latent vars -> softmax increments -> cumulative mass, then convert to average sSFR over each interval.

to_latent(physical)[source]: Inverse mapping from log10 sSFR back to latent (softmax) space.

class besta.sfh.FixedMassFracSFH(mass_fraction, *args, **kwargs)[source]

Bases: ZPowerLawMixin, SFHBase, PieceWiseSFHMixin

A SFH model with fixed mass fraction bins.

Description

The SFH of a galaxy is modelled as a stepwise function where the free parameters correspond to the time at which a given fraction of the total stellar mass was formed.

mass_fractions

SFH mass fractions.

Type:: np.ndarray

lookback_time

Lookback time bin edges.

Type:: astropy.units.Quantity

parse_datablock(datablock: DataBlock)[source]: Update the fixed-mass-fraction SFH model from a CosmoSIS DataBlock.

to_physical(latent)[source]: Map unconstrained latents to strictly increasing times (Gyr).

to_latent(physical)[source]: Inverse of the exp/cumsum mapping (up to normalisation).

class besta.sfh.ExponentialSFH(*args, **kwargs)[source]

Bases: ZPowerLawMixin, SFHBase

An analytical exponentially declining SFH model.

Description

The SFH of a galaxy is modelled as an exponentially declining function.

time

Time bins to evaluate the SFH.

Type:: astropy.units.Quantity

lookback_time

Type:: astropy.units.Quantity

parse_datablock(datablock: DataBlock)[source]: Update the exponential SFH model from a CosmoSIS DataBlock.

class besta.sfh.DelayedTauSFH(*args, **kwargs)[source]

Bases: ZPowerLawMixin, SFHBase

An exponentially declining delayed-tau SFH model.

Description

The SFH of a galaxy is modelled as an exponentially declining function.

\[M_\star(t) = M_{inf} \cdot (1 - e^{-t/\tau} \frac{t + \tau}{\tau})\]

time

Time bins to evaluate the SFH.

Type:: astropy.units.Quantity

parse_datablock(datablock: DataBlock)[source]: Update the delayed-tau SFH model from a CosmoSIS DataBlock.

class besta.sfh.DelayedTauQuenchedSFH(*args, **kwargs)[source]

Bases: ZPowerLawMixin, SFHBase

An exponentially declining delayed-tau SFH model with a quenching event.

Description

The SFH of a galaxy is modelled as an exponentially declining function.

\[M_\star(t) = M_{inf} \cdot (1 - e^{-t/\tau} \frac{t + \tau}{\tau})\]

After the quenching event, taking place at $t_{quench}$, the stellas mass will be $M_\star(t)=M_\star(t_{quench})$ for all times larget than $t_{quench}$.

time

Time bins to evaluate the SFH.

Type:: astropy.units.Quantity

parse_datablock(datablock: DataBlock)[source]: Update the quenched delayed-tau SFH model from a CosmoSIS DataBlock.

class besta.sfh.LogNormalSFH(*args, **kwargs)[source]

Bases: ZPowerLawMixin, SFHBase

An analytical log-normal declining SFH model.

Description

The SFH of a galaxy is modelled as an log-normal declining function.

time

Time bins to evaluate the SFH.

Type:: astropy.units.Quantity

lookback_time

Type:: astropy.units.Quantity

parse_datablock(datablock: DataBlock)[source]: Update the log-normal SFH model from a CosmoSIS DataBlock.

class besta.sfh.LogNormalQuenchedSFH(*args, **kwargs)[source]

Bases: ZPowerLawMixin, SFHBase

An analytical log-normal declining SFH model including a quenching event.

Description

The SFH of a galaxy is modelled as an log-normal declining function. A quenching event is modelled as an additional exponentially declining function that is applied after the time of quenching.

time

Time bins to evaluate the SFH.

Type:: astropy.units.Quantity

lookback_time

Type:: astropy.units.Quantity

parse_datablock(datablock: DataBlock)[source]: Update the quenched log-normal SFH model from a CosmoSIS DataBlock.

Utilities

General utility helpers used across BESTA.

besta.utils.mkdir(path: str) → None[source]: Create a directory path if it does not already exist.

besta.utils.available_memory_bytes() → int[source]: Return the currently available system memory in bytes.

besta.utils.convert_bytes(size_bytes, to_unit)[source]: Convert a byte count into B, KB, MB, or GB.

besta.utils.predict_array_memory(array_shape, dtype=<class 'numpy.float64'>, unit='MB')[source]

Predict the memory required for a NumPy array with given shape and data type.

Parameters:

array_shape (tuple) – Shape of the array (e.g., (1000, 1000) for a 1000x1000 array).
dtype (numpy.dtype, optional) – NumPy data type (default is np.float64).
unit ({‘B’, ‘KB’, ‘MB’, ‘GB’}, optional) – Unit for the returned memory size (default is ‘MB’).

Returns:

float – Estimated memory requirement in the specified unit.

Raises:

ValueError – If invalid dtype or unit is provided.

Examples

>>> predict_array_memory((1000, 1000))
7.63  # MB for float64 array

>>> predict_array_memory((500, 500, 500), np.float32, 'GB')
0.47  # GB for float32 array

besta.utils.check_array_memory(array_shape, dtype=<class 'numpy.float64'>, unit='MB', safety_margin=0.2)[source]

Check if the current machine has enough RAM for creating a given array.

Parameters:

array_shape (tuple) – Shape of the array (e.g., (1000, 1000) for a 1000x1000 array).
dtype (numpy.dtype, optional) – NumPy data type (default is np.float64).
unit ({‘B’, ‘KB’, ‘MB’, ‘GB’}, optional) – Unit for error message reporting (default is ‘MB’).
safety_margin (float, optional) – Additional margin (fraction of the array size) for ensuring good performance. Default is 0.2 (20% more than the predicted memory requirement).

Returns:

bool – True if sufficient memory is available.

Raises:

MemoryError – If insufficient memory is available for the array.
ValueError – If invalid dtype, shape, or unit is provided.

Examples

>>> check_array_memory((10000, 10000))
True  # If 800MB+ available

>>> check_array_memory((50000, 50000))
MemoryError: Insufficient memory available...

besta.utils.expand_env_vars(arg_spec=0)[source]

Decorator that expands environment variables in a specified argument.

The target argument can be specified either by position (int) or name (str). If no argument is specified, expands the first positional argument (default).

Parameters:: arg_spec (int or str, optional) – Either the position (0-based index) or name of the argument to expand. Default is 0 (first positional argument).
Returns:: callable – A decorator function that wraps the original function.

Examples

# Expand first positional argument (default) >>> @expand_env_vars() … def load_file(path): … print(path) >>> load_file(“$HOME/test.txt”)

# Expand named argument >>> @expand_env_vars(‘filename’) … def process_file(filename, mode=’r’): … print(filename, mode) >>> process_file(“${TMPDIR}/data.txt”)

# Expand argument at position 1 >>> @expand_env_vars(1) … def save_data(header, filepath): … print(header, filepath) >>> save_data(“results”, “$APPDIR/output.dat”)

Visualization helpers for compact textual summaries in matplotlib figures.

besta.visualization.draw_dict_in_axes(ax, info: Mapping | Sequence[Mapping] | Sequence[Tuple[str, Mapping]], loc: Tuple[float, float] = (0.02, 0.98), max_fontsize: int = 14, min_fontsize: int = 6, family: str = 'monospace', ha: str = 'left', va: str = 'top', line_spacing: float = 1.0, section_spacing: int = 1, title_style: str = 'plain', sort_keys: bool = False)[source]

Render one or more dictionaries as readable text inside a matplotlib Axes, automatically adjusting font size so the text fits within the axes.

Parameters:

ax (matplotlib.axes.Axes) – Target axes.
info (dict or sequence) –
One of:
- dict: a single section without a title
- list/tuple of dict: multiple sections (auto-titled “Section 1”, …)
- list/tuple of (title, dict): multiple titled sections
Each dict should map keys to numerical (or printable) values.
loc (tuple, optional) – (x, y) location in axes coordinates (default top-left).
max_fontsize (int, optional) – Starting (largest) font size.
min_fontsize (int, optional) – Minimum font size allowed.
family (str, optional) – Font family (monospace recommended).
ha, va (str, optional) – Horizontal / vertical alignment.
line_spacing (float, optional) – Line spacing multiplier.
section_spacing (int, optional) – Number of blank lines between sections.
title_style ({“plain”,”underline”}, optional) – How to render section titles.
sort_keys (bool, optional) – If True, sort keys within each section.

Returns:

text (matplotlib.text.Text) – The text artist.

Model grid-based inference

For a user-oriented guide (without CosmoSIS sampling), see Grid-Based Inference.

Basics

class besta.grid.grid.ModelGrid(observables: np.ndarray, targets: np.ndarray, observable_names: List[str], target_names: List[str], weights: Optional[np.ndarray] = None, meta: Dict[str, Any] = <factory>, check_boundaries: Optional['Callable[[np.ndarray], bool]'] = None)[source]

Bases: object

Base container for a grid of models.

This class stores:

observables: array of shape (N, P)
targets: array of shape (N, Q) with continuous quantities
names and metadata for both
optional per-model weights

observables

Observable vectors per model, e.g. colours or fluxes.

Type:: ndarray, shape (N, P)

targets

Continuous targets per model, e.g. SFR, logM, metallicity.

Type:: ndarray, shape (N, Q)

observable_names

Names of the P observables in order.

Type:: list of str

target_names

Names of the Q targets in order.

Type:: list of str

weights

Optional sampling weights for models. Defaults to uniform if None.

Type:: ndarray, shape (N,), optional

meta

Free-form metadata for provenance, settings, and units.

Type:: dict, optional

check_boundaries

Function to check if model parameters are within valid boundaries. If None, no boundary checks are performed. The function should take a model parameter vector as input and return a boolean indicating whether the parameters are valid. This is only used during interpolation.

Type:: callable, optional

observable_standardiser

Standardiser for observable quantities.

Type:: LinearStandardiser

target_standardiser

Standardiser for target quantities.

Type:: LinearStandardiser

Examples

>>> grid = ModelGrid(
...     observables=np.array([[1.0, 0.5], [0.8, 0.3], [1.2, 0.7]]),
...     targets=np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]),
...     observable_names=["obs1", "obs2"],
...     target_names=["target1", "target2"]
... )
>>> print(grid.n_models)
3
>>> print(grid.n_observables)
2

property n_models: int: Number of models N.

property n_observables: int: Number of observables P.

property n_targets: int: Number of targets Q.

property grid_int: type: Adaptative int type that accounts for model dimensionality.

property grid_float: type: Adaptative float type that accounts for model precision.

get_target(key)[source]: Return a target column given a key.

select(idx: ndarray, observables=None, targets=None, standardisers=True) → ModelGrid[source]

Return a new ModelGrid containing a subset of models.

Parameters:

idx (ndarray of int, shape (K,)) – Indices to select.
observables (list of str or list of int, optional, default=None) – Names or indices of observables to include.
targets (list of str or list of int, optional, default=None) – Names or indices of targets to include.
standardisers (bool, optional, default=True) – Whether to propagate fitted standardisers.

Returns:

sub (ModelGrid) – Subset grid view (copies arrays).

fit_target_standardiser(mask: ndarray | None = None) → None[source]

Fit mean/std for target standardisation used by KDTree distances.

Parameters:: mask (ndarray of bool, shape (N,), optional) – If provided, use only masked entries to compute stats.

transform_targets(X: ndarray) → ndarray[source]

Apply fitted target standardisation.

Parameters:: X (ndarray, shape (…, Q)) – Targets to standardise.
Returns:: X_std (ndarray, shape (…, Q)) – Standardised targets.
Raises:: RuntimeError – If fit_target_standardiser has not been called.

inverse_transform_targets(X_std: ndarray) → ndarray[source]: Inverse of transform_targets.

fit_standardiser(mask: ndarray | None = None) → None[source]

Fit mean and standard deviation for observable standardisation.

Parameters:: mask (ndarray of bool, shape (N,), optional) – If provided, use only masked entries to compute stats.

_ensure_standardized_observables() → ndarray[source]: Return cached standardized observables, computing them lazily if needed.

_ensure_standardized_targets() → ndarray[source]: Return cached standardized targets, computing them lazily if needed.

transform_observables(X: ndarray) → ndarray[source]

Apply fitted observable standardisation.

Parameters:: X (ndarray, shape (…, P)) – Observables to standardise.
Returns:: X_std (ndarray, shape (…, P)) – Standardised observables.
Raises:: RuntimeError – If fit_standardiser has not been called.

inverse_transform_observables(X_std: ndarray) → ndarray[source]

Inverse of transform_observables.

Parameters:: X_std (ndarray, shape (…, P)) – Standardised observables to inverse.
Returns:: X (ndarray, shape (…, P)) – Inverse transformed observables.
Raises:: RuntimeError – If fit_standardiser has not been called.

invalidate_knn_index() → None[source]: Invalidate cached KDTree.

get_kdtree(*, standardize: bool = True)[source]: Get a KDTree either stored in cash or built on the fly.

in_boundaries(targets_query: ndarray) → ndarray[source]

Check which target queries are within the model grid boundaries.

Parameters:: targets_query (ndarray, shape (M, Q)) – Target query points.
Returns:: in_bounds (ndarray of bool, shape (M,)) – Whether each query is within the grid boundaries.

interpolate_observables(targets_query: ndarray, *, method: str | None = None, fill_value: float = nan, k: int = 32, p: float = 2.0, eps: float = 1e-12, standardize: bool = True, mode: str = 'local_linear', ridge: float = 1e-08) → ndarray[source]

Interpolate observables for arbitrary target values using cached KDTree.

mode=”nearest” : nearest-neighbour lookup mode=”idw” : inverse-distance weighted KNN mode=”local_linear”: weighted local affine fit (exact for linear functions)

static _std_to_state(std) → Dict[str, Any][source]: Serialisable state for LinearStandardiser.

static _state_to_std(state: Mapping[str, Any], std) → None[source]: Restore LinearStandardiser from state into an existing instance.

to_dict() → Dict[str, Any][source]

Serialise grid to a dictionary that is np.savez / JSON-friendly (except numpy arrays, which are fine for np.savez).

Notes

check_boundaries is NOT serialised (it is generally not portable).

If you need it, store a string key in meta and resolve via a registry.

classmethod from_dict(d: Mapping[str, Any], *, check_boundaries: 'Callable[[np.ndarray], np.ndarray]' | None = None) → ModelGrid[source]

Build a ModelGrid from a dictionary as produced by to_dict().

Parameters:

d (mapping) – Mapping produced by to_dict().
check_boundaries (callable, optional) – Optional boundary function to attach at load time. (Not serialised by default.)

classmethod from_fits_table(path, observable_cols=None, target_cols=None, weight_col=None, row_mask=None, table_hdu=1, memmap=False, meta_key='MODELGRID_META')[source]

Build a ModelGrid from an Astropy FITS table.

Priority order:

Auto-load if table.meta contains OBSNAME and TGTNAME written by to_fits_table.
If explicit mappings (observable_cols / target_cols) are provided, use them.
Fallback: load all numeric 1-D columns as observables; no targets.

Parameters:

path (str) – Path to the FITS file.
observable_cols (list[str] or dict[str, str], optional) – Explicit mapping for observables. If None, try auto-discovery.
target_cols (list[str] or dict[str, str], optional) – Explicit mapping for targets. If None, try auto-discovery.
weight_col (str, optional) – Column name for per-model weights. If None, will auto-use “weight” if present.
row_mask (array_like of bool, optional) – Boolean mask of length N to select rows after reading.
table_hdu (int or str, optional) – FITS HDU index or name containing the table. Default 1.
memmap (bool, optional) – astropy Table read memmap. Default False.
meta_key (str, optional) – Table.meta key with JSON meta blob written by to_fits_table. Default “MODELGRID_META”.

Returns:

grid (ModelGrid)

classmethod from_hdf5(path, group='/modelgrid', observable_dsets=None, target_dsets=None, weight_dset=None, row_slice=None)[source]

Build a ModelGrid from an HDF5 file, with automatic discovery.

Priority order:

Auto-load if the group contains subgroups ‘observables’ and ‘targets’ as written by to_hdf5 (reads names and meta from attributes).
If explicit mappings (observable_dsets / target_dsets) are provided, use them.
Fallback: load all 1-D numeric datasets directly under the group as observables; no targets.

Parameters:

path (str) – Path to the HDF5 file.
group (str, optional) – Group path to read from. Default “/modelgrid”.
observable_dsets (list[str] or dict[str, str], optional) – Explicit dataset mapping for observables (relative to group).
target_dsets (list[str] or dict[str, str], optional) – Explicit dataset mapping for targets (relative to group).
weight_dset (str, optional) – Dataset name for weights (relative to group). If None, will use “weights” if present in group.
row_slice (slice or array_like of int or bool, optional) – Optional selection of rows after reading.

Returns:

grid (ModelGrid)

to_hdf5(path, group='/modelgrid', overwrite=False, compression='gzip', compression_opts=4, chunks=True, include_meta=True)[source]

Save the grid as an HDF5 group with per-column datasets.

Layout

{group}/observables/<name>: One 1-D dataset per observable.
{group}/targets/<name>: One 1-D dataset per target.
{group}/weights: Optional 1-D dataset if model weights are present.
Attributes on {group}: observable_names, target_names, n_models, n_observables, n_targets, and meta (JSON when include_meta=True).

Parameters:

path (str) – Output HDF5 path. File will be created if it does not exist.
group (str, optional) – Group path where to store the grid. Default “/modelgrid”.
overwrite (bool, optional) – If True and the group exists, it will be deleted and recreated. Default False.
compression (str or None, optional) – Compression for datasets (e.g., “gzip”). Default “gzip”.
compression_opts (int, optional) – Compression level if applicable. Default 4.
chunks (bool or tuple, optional) – Enable chunking (True) or provide explicit chunk shape. Default True.
include_meta (bool, optional) – If True, write JSON-encoded meta as a group attribute. Default True.

raises ImportError:

If h5py is not available.

raises ValueError:

If shapes are inconsistent.

classmethod from_pickle(path)[source]

Load a ModelGrid from a pickle file.

Parameters:: path (str) – Path to the pickle file.
Returns:: grid (ModelGrid)

to_pickle(path)[source]

Save the ModelGrid to a pickle file.

Parameters:: path (str) – Output pickle file path.

classmethod load_auto(path, **kwargs)[source]

Auto-load a ModelGrid from a file based on its extension.

Supported formats:: .fits, .fit, .fts FITS table via from_fits_table .hdf5, .h5 HDF5 via from_hdf5 .pkl, .pickle Pickle via from_pickle

Parameters:

path (str) – Path to the file.
**kwargs – Additional keyword arguments passed to the specific loader.

Returns:

grid (ModelGrid)

Candidate selection

Candidate-selection helpers for grid-based inference.

besta.grid.binning._fit_transform(dims_array: ndarray, mode: str = 'standardize', pca_variance: float = 1.0) → Dict[str, Any][source]

Fit a linear transform on dims_array and return a dict with parameters.

Parameters:

dims_array (ndarray, shape (N, D)) – Training data for selected dimensions.
mode ({“none”, “standardize”, “pca_whiten”}) – Transform type.
pca_variance (float in (0,1]) – Fraction of variance to keep if mode == “pca_whiten”.

Returns:

t (dict) – Transformation. Contains keys: mode, mu (mean), sd (standard dev.), W, b, kept_dims.

besta.grid.binning._apply_transform(X: ndarray, T: Dict[str, Any]) → ndarray[source]: Apply a transformation to a set in native space.

besta.grid.binning._chunk_ranges(n: int, batch_size: int | None)[source]: Provide the starting and end indices of a batch.

besta.grid.binning._sigma_to_space(sig_native: ndarray, T: dict) → ndarray[source]: Map 1-sigma vector from native into the transform space described by T.

class besta.grid.binning.BaseBinner[source]

Bases: object

Abstract candidate selector interface.

Required methods

fit(grid) candidates(y_native, sigmas_native=None, **kwargs) -> (indices, aux_level_or_radius) dims property save(path), load(path) for persistence batch_candidates(Y_native, SIG_native=None, **kwargs) convenience method

batch_candidates(Y_native: ndarray, SIG_native: ndarray | None = None, batch_size: int | None = None, **kwargs) → Tuple[List[ndarray], List[int | None]][source]: Select candidates in batch.

plot_candidates(grid: ModelGrid, y_native: ndarray, sigmas_native: ndarray | None = None, *, dims_plot: Sequence[int] | None = None, candidate_kwargs: dict | None = None, plot_space: str = 'native', max_background: int = 50000, background_alpha: float = 0.15, background_ms: float = 1.0, cand_alpha: float = 0.9, cand_ms: float = 4.0, query_ms: float = 10.0, show_sigma_boxes: bool = True, sigma_scale: float = 1.0, figsize: Tuple[float, float] | None = None, suptitle: str | None = None, random_state: int | None = 42)[source]

Pairwise visualisation of binner candidates versus the model grid.

Parameters:

grid (ModelGrid)
y_native (np.ndarray)
sigmas_native (np.ndarray)
dims_plot (list of int, optional)
candidate_kwargs (dict, optional)
TODO

class besta.grid.binning.RectBinner(dims: ~typing.List[int] = <property object>, levels: int = 5, base_bins: int = 6, transform: str = 'standardize', pca_variance: float = 1.0, edges_mode: str = 'quantile', mode: str = 'sigma', target_factor: float = 2.0, expand_factor: float = 2.0, target_k: int | None = None, k: int | None = None, max_expand_steps: int = 4, _T: ~typing.Dict[str, ~typing.Any] = <factory>, _edges: ~typing.List[~typing.List[~numpy.ndarray]] = <factory>, _layers: ~typing.List[~typing.Dict[~typing.Tuple[int, ...], ~numpy.ndarray]] = <factory>, _grid_n: int = 0, _D_eff: int = 0)[source]

Bases: BaseBinner

Multi-resolution rectilinear bins with optional whitening and quantile edges.

Parameters:

dims (list of int) – Observable columns to index (D dims).
levels (int) – Number of refinement levels. Bin count per dim grows ~ base_bins * 2^level.
base_bins (int) – Bins per dim at level 0.
transform ({“none”,”standardize”,”pca_whiten”}) – Linear transform fitted on training data for selected dims.
pca_variance (float in (0,1]) – Variance fraction to keep if transform == “pca_whiten”.
edges_mode ({“quantile”,”linear”}) – Strategy for bin edges per dim.

Notes

Uses quantile edges by default to equalize mass across bins.
Supports two selection modes in candidates():
1. error expansion: expand by multiples of transformed sigmas
2. target_k expansion: expand neighbourhood until at least target_k models are included

fit(grid: ModelGrid) → RectBinner[source]: Fit the binner

_choose_level_by_sigma(sig_trans: ndarray, target_factor: float = 2.0) → int[source]: Choose the binning level given an vector of uncertainties.

candidates(y_native: ndarray, sigmas_native: ndarray | None = None) → Tuple[ndarray, int][source]: Candidate selector.

class besta.grid.binning.HashedGridBinner(dims: ~typing.List[int] = <property object>, transform: str = 'standardize', pca_variance: float = 1.0, select_mode: str = 'knn', target_k: int = 100, radius_factor: float = 2.0, radius_shape: str = 'ellipsoid', sigma_floor: float = 1e-12, cell_width: ~typing.Sequence[float] | float | None = None, target_cell_occupancy: int = 32, max_expand_steps: int = 8, _T: ~typing.Dict[str, ~typing.Any] = <factory>, _Xz: ~numpy.ndarray | None = None, _cells: ~typing.Dict[~typing.Tuple[int, ...], ~numpy.ndarray] = <factory>, _origin: ~numpy.ndarray | None = None, _cell_width: ~numpy.ndarray | None = None, _coord_min: ~numpy.ndarray | None = None, _coord_max: ~numpy.ndarray | None = None)[source]

Bases: BaseBinner

Fixed-resolution hashed grid over transformed observable space.

Parameters:

dims (list of int) – Observable columns to index.
transform ({“none”,”standardize”,”pca_whiten”}) – Linear transform fitted on training dims.
pca_variance (float) – Variance fraction to keep when whitening.
select_mode ({“radius”,”knn”}) – Candidate selection mode.
target_k (int) – Number of neighbours to keep in knn mode.
radius_factor (float) – Scale factor applied to transformed sigmas in radius mode.
radius_shape ({“ball”,”ellipsoid”,”box”}) – Exact geometric filter applied after cell lookup.
cell_width (float or sequence, optional) – Per-dimension cell width in transformed space. If None, infer it from the global density and target_cell_occupancy.
target_cell_occupancy (int) – Target average number of models per occupied cell when cell widths are inferred automatically.
max_expand_steps (int) – Maximum number of cell-ring expansions in knn mode.

_resolve_cell_width(Xz: ndarray) → ndarray[source]: Compute per-dimension cell widths in transformed space.

_coords_from_points(Xz: ndarray) → ndarray[source]: Map transformed points to integer cell coordinates.

class besta.grid.binning.KDTreeBinner(dims: List[int] = <property object>, transform: str = 'standardize', pca_variance: float = 1.0, leafsize: int = 100, select_mode: str = 'knn', target_k: int = 100, radius_factor: float = 2.0, radius_shape: str = 'ellipsoid', sigma_floor: float = 1e-12, _T: Dict[str, ~typing.Any]=<factory>, _tree: cKDTree | None = None, _Xz: ndarray | None = None)[source]

Bases: BaseBinner

Nearest-neighbour binner using KDTree over transformed dims.

Parameters:

dims (list of int) – Observable columns to index (D dims).
transform ({“none”,”standardize”,”pca_whiten”}) – Linear transform to fit on training dims.
pca_variance (float) – Variance to keep if using pca_whiten.
leafsize (int) – KDTree leaf size, trade off build vs query speed.
select_mode ({“radius”,”knn”}) – Candidate selection mode.
target_k (int) – Target number of candidates for select_mode=”knn”.
radius_factor (float) – Radius scaling factor for select_mode=”radius”.

_sigma_to_tree_space(s_native: ndarray) → ndarray[source]

Map per-dimension 1-sigma uncertainties from native dims -> tree space dims.

Returns:: s_z ((D,) ndarray) – 1-sigma vector in the KDTree coordinate system (same dims as Xz).

candidates(y_native: ndarray, sigmas_native: ndarray | None = None) → Tuple[ndarray, int | None][source]: Candidate selector.

class besta.grid.binning._GridObservablesView(observables: ndarray, observable_names: List[str])[source]

Bases: object

Minimal grid-like view used to fit child binners on subsets.

class besta.grid.binning.NestedBinner(primary_dims: ~typing.List[int], secondary_dims: ~typing.List[int], primary_bins: int = 16, primary_edges_mode: str = 'quantile', primary_radius_factor: float = 2.0, secondary_kind: str = 'kdtree', secondary_params: ~typing.Dict[str, ~typing.Any] = <factory>, min_primary_count: int = 8, final_target_k: int | None = None, use_global_fallback: bool = True, _primary_edges: ~typing.List[~numpy.ndarray] = <factory>, _primary_cells: ~typing.Dict[~typing.Tuple[int, ...], ~numpy.ndarray] = <factory>, _secondary_models: ~typing.Dict[~typing.Tuple[int, ...], ~besta.grid.binning.BaseBinner] = <factory>, _secondary_global_index: ~typing.Dict[~typing.Tuple[int, ...], ~numpy.ndarray] = <factory>, _fallback_secondary: ~besta.grid.binning.BaseBinner | None = None, _X: ~numpy.ndarray | None = None, _observable_names: ~typing.List[str] = <factory>, _grid_n: int = 0)[source]

Bases: BaseBinner

Two-stage candidate selector: primary partition then secondary refinement.

Typical usage is to partition first in redshift-like dimensions, then run a color-space binner only within the selected primary bins.

static _to_jsonable(obj: Any) → Any[source]: Recursively convert NumPy types to JSON-native Python objects.

static _from_jsonable(obj: Any) → Any[source]: Recursively decode JSON-safe payload back to runtime Python objects.

Probability, likelihood, and prior models for grid-based inference.

besta.grid.prob._logsumexp(a: ndarray, axis: int | None = None) → ndarray[source]

Stable logsumexp.

Parameters:

a (ndarray) – Input array.
axis (int or None) – Axis over which to reduce. If None, reduces over all elements.

Returns:

out (ndarray) – log(sum(exp(a))) along the given axis.

besta.grid.prob._normal_logpdf(x: ndarray, mu: ndarray, sigma: ndarray) → ndarray[source]

Univariate normal log pdf per element.

Parameters:: x, mu, sigma (ndarray) – Broadcastable arrays. sigma must be positive.
Returns:: logp (ndarray) – Log density values.

besta.grid.prob._std_norm_cdf(x: ndarray) → ndarray[source]

Standard normal CDF using erf.

Parameters:: x (ndarray)
Returns:: cdf (ndarray)

class besta.grid.prob.Prior[source]

Bases: ABC

Abstract prior interface over model targets.

A prior returns log p(theta) for each model row. It may depend on specific target columns (e.g., redshift) and optionally on other model metadata.

log_prob_for_models(targets)[source]: Return log prior probability per model row.

abstractmethod log_prob_for_models(targets: ndarray) → ndarray[source]

Evaluate log prior for each model row.

Parameters:: targets (ndarray, shape (N, Q)) – Model targets array from ModelGrid.
Returns:: logp (ndarray, shape (N,)) – Log prior probability per model.

class besta.grid.prob.FlatPrior[source]

Bases: Prior

Flat prior over all models (constant log probability).

log_prob_for_models(targets: ndarray) → ndarray[source]

Evaluate log prior for each model row.

Parameters:: targets (ndarray, shape (N, Q)) – Model targets array from ModelGrid.
Returns:: logp (ndarray, shape (N,)) – Log prior probability per model.

class besta.grid.prob.GaussianPrior1D(target_col: int, mu: float, sigma: float)[source]

Bases: Prior

1-D Gaussian prior over a single target column.

Parameters:

target_col (int) – Index of the target column to build the prior on (e.g., redshift).
mu (float) – Mean of the Gaussian prior.
sigma (float) – Standard deviation of the Gaussian prior.

log_prob_for_models(targets: ndarray) → ndarray[source]

Evaluate log prior for each model row.

Parameters:: targets (ndarray, shape (N, Q)) – Model targets array from ModelGrid.
Returns:: logp (ndarray, shape (N,)) – Log prior probability per model.

class besta.grid.prob.UniformPrior1D(target_col: int, low: float, high: float)[source]

Bases: Prior

1-D uniform prior over a single target column.

Parameters:

target_col (int) – Index of the target column to build the prior on (e.g., redshift).
low (float) – Lower bound of the uniform prior.
high (float) – Upper bound of the uniform prior.

log_prob_for_models(targets: ndarray) → ndarray[source]

Evaluate log prior for each model row.

Parameters:: targets (ndarray, shape (N, Q)) – Model targets array from ModelGrid.
Returns:: logp (ndarray, shape (N,)) – Log prior probability per model.

class besta.grid.prob.DeltaPrior1D(target_col: int, value: float, tolerance: float = 1e-06)[source]

Bases: Prior

1-D delta-function prior over a single target column.

Parameters:

target_col (int) – Index of the target column to build the prior on (e.g., redshift).
value (float) – Location of the delta prior.
tolerance (float, optional) – Width around value to assign finite log-probability (default 1e-6).

log_prob_for_models(targets: ndarray) → ndarray[source]

Evaluate log prior for each model row.

Parameters:: targets (ndarray, shape (N, Q)) – Model targets array from ModelGrid.
Returns:: logp (ndarray, shape (N,)) – Log prior probability per model.

class besta.grid.prob.ExponentialPrior1D(target_col: int, scale: float)[source]

Bases: Prior

1-D exponential prior over a single target column.

Parameters:

target_col (int) – Index of the target column to build the prior on (e.g., redshift).
scale (float) – Scale parameter (mean) of the exponential prior.

log_prob_for_models(targets: ndarray) → ndarray[source]

Evaluate log prior for each model row.

Parameters:: targets (ndarray, shape (N, Q)) – Model targets array from ModelGrid.
Returns:: logp (ndarray, shape (N,)) – Log prior probability per model.

class besta.grid.prob.ExponentialTruncatedPrior1D(target_col: int, scale: float, t_max: float)[source]

Bases: Prior

1-D truncated exponential prior over a single target column.

Parameters:

target_col (int) – Index of the target column to build the prior on (e.g., redshift).
scale (float) – Scale parameter (mean) of the exponential prior.
t_max (float) – Maximum value of the target (support upper bound).

log_prob_for_models(targets: ndarray) → ndarray[source]

Evaluate log prior for each model row.

Parameters:: targets (ndarray, shape (N, Q)) – Model targets array from ModelGrid.
Returns:: logp (ndarray, shape (N,)) – Log prior probability per model.

class besta.grid.prob.PowerLawPrior1D(target_col: int, alpha: float, t_min: float, t_max: float)[source]

Bases: Prior

1-D power-law prior over a single target column.

Parameters:

target_col (int) – Index of the target column to build the prior on (e.g., redshift).
alpha (float) – Power-law index (p(t) ~ t^alpha).
t_min (float) – Minimum value of the target (support lower bound).
t_max (float) – Maximum value of the target (support upper bound).

log_prob_for_models(targets: ndarray) → ndarray[source]

Evaluate log prior for each model row.

Parameters:: targets (ndarray, shape (N, Q)) – Model targets array from ModelGrid.
Returns:: logp (ndarray, shape (N,)) – Log prior probability per model.

class besta.grid.prob.EmpiricalHistogramPrior1D(target_col: int, edges: ndarray, density_floor: float = 1e-12)[source]

Bases: Prior

Empirical 1-D histogram prior over a single target column.

This prior estimates p(t) from the model grid itself and assigns the same log prior to all models that fall in the same bin.

Parameters:

target_col (int) – Index of the target column to build the prior on (e.g., redshift).
edges (ndarray, shape (K+1,)) – Histogram bin edges. Must cover the support of the target.
density_floor (float, optional) – Minimum probability mass per bin to avoid -inf (default 1e-12).

fit_from_targets(targets: ndarray, weights: ndarray | None = None) → EmpiricalHistogramPrior1D[source]

Fit histogram from targets.

Parameters:

targets (ndarray, shape (N, Q))
weights (ndarray, shape (N,), optional)

Returns:

self (EmpiricalHistogramPrior1D)

log_prob_for_models(targets: ndarray) → ndarray[source]

Evaluate log prior for each model row.

Parameters:: targets (ndarray, shape (N, Q)) – Model targets array from ModelGrid.
Returns:: logp (ndarray, shape (N,)) – Log prior probability per model.

class besta.grid.prob.EmpiricalFlatteningPriorND(target_cols: Sequence[int], edges_list: Sequence[ndarray], mode: str = 'factorised', density_floor: float = 1e-12, count_floor: float = 0.001)[source]

Bases: Prior

Empirical flattening prior over several target columns.

This prior uses the model grid itself to estimate the (possibly non-flat) distribution of a set of parameters and builds a prior that counteracts those inhomogeneities.

Two modes are provided:

‘factorised’: build 1-D histograms for each column separately and form a product prior over dimensions. This approximately flattens the marginal distributions of those parameters.
‘joint’: build a joint N-D histogram over all selected columns and assign prior mass proportional to 1 / N_k for each occupied N-D bin. This flattens the distribution over the joint grid of bins, but is more memory hungry and can be sparse.

Parameters:

target_cols (sequence of int) – Indices of the target columns to build the prior on.
edges_list (sequence of ndarray) – List of bin edges for each column. Must have the same length as target_cols. Each element is an array of shape (K_d + 1,) defining the bin edges along dimension d.
mode ({‘factorised’, ‘joint’}, optional) – Flattening strategy (default ‘factorised’).
density_floor (float, optional) – Minimum probability mass per bin (or joint cell) to avoid -inf log-probabilities (default 1e-12). Only relevant for ‘joint’ mode.
count_floor (float, optional) – Minimum effective count per bin to avoid infinite weights (default 1e-3). Used to clip empty or nearly empty bins.

Notes

The prior is defined over models, not over a continuous parameter space. It is intended to compensate for non-uniform sampling of the grid, so that the effective prior over the chosen parameters is closer to flat.

fit_from_targets(targets: ndarray, weights: ndarray | None = None) → EmpiricalFlatteningPriorND[source]

Fit flattening prior from the model grid targets.

Parameters:

targets (ndarray, shape (N, Q)) – Model grid targets.
weights (ndarray, shape (N,), optional) – Optional model weights (e.g. importance weights) used to define the effective occupancy per bin.

Returns:

self (EmpiricalFlatteningPriorND)

log_prob_for_models(targets: ndarray) → ndarray[source]

Compute log prior for each model in the given targets array.

Parameters:: targets (ndarray, shape (N, Q)) – Model grid targets.
Returns:: logp (ndarray, shape (N,)) – Log prior for each target row.

class besta.grid.prob.ObservableDependentPrior[source]

Bases: Prior

TODO

class besta.grid.prob.MagDependentRedshiftPrior(z_col: int, mag_observable_index: int, z_edges: ndarray, m_edges: ndarray, density_floor: float = 1e-12)[source]

Bases: ObservableDependentPrior

Magnitude-dependent redshift prior p(z | m) from a 2-D histogram.

Parameters:

z_col (int) – Index of redshift in targets.
mag_observable_index (int) – Index of magnitude in observables (e.g., VIS magnitude column).
z_edges (ndarray) – Bin edges in redshift.
m_edges (ndarray) – Bin edges in magnitude.
density_floor (float, optional) – Minimum conditional probability per (z bin) to avoid zeros.

Notes

After calling fit_from_grid, the conditional log prior is defined by log p(z_k | m_bin) for each magnitude bin. For a given model row, its magnitude chooses a column in the 2-D histogram and the redshift of that model chooses the row.

fit_from_grid(observables: ndarray, targets: ndarray, weights: ndarray | None = None) → MagDependentRedshiftPrior[source]

Fit conditional histogram from the model grid.

Parameters:

observables (ndarray, shape (N, P))
targets (ndarray, shape (N, Q))
weights (ndarray, shape (N,), optional)

Returns:

self (MagDependentRedshiftPrior)

log_prob_for_models(targets: ndarray, observables: ndarray | None = None) → ndarray[source]

Evaluate log p(z | m) per model row.

Parameters:

targets (ndarray, shape (N, Q))
observables (ndarray, shape (N, P), optional) – Required for the magnitude column. If None, raises.

Returns:

logp (ndarray, shape (N,))

class besta.grid.prob.HierarchicalPrior(hyperparams: dict)[source]

Bases: Prior

Abstract base class for priors controlled by learnable hyperparameters.

abstractmethod log_prob_for_models(targets: ndarray, **kwargs) → ndarray[source]

Evaluate log prior for each model row.

Parameters:: targets (ndarray, shape (N, Q)) – Model targets array from ModelGrid.
Returns:: logp (ndarray, shape (N,)) – Log prior probability per model.

class besta.grid.prob.CompositePrior(priors: Sequence[Prior], weights: Sequence[float] | None = None)[source]

Bases: Prior

Composite prior combining multiple target-only Priors.

The total log prior is defined as a weighted sum of component log priors:

log p_total(model) = sum_i w_i * log p_i(model)

where each p_i is a Prior that does not depend on observables.

Parameters:

priors (sequence of Prior) – List or tuple of prior instances to combine. Each must implement log_prob_for_models(targets) and must NOT require observables.
weights (sequence of float, optional) – Per-component multiplicative weights for the log-probabilities. If None, all weights are taken as 1.0. A zero weight effectively disables a component. Negative weights are allowed but should be used with care, as they correspond to dividing by p_i(model) in linear space.

log_prob_for_models(targets: ndarray) → ndarray[source]

Evaluate the composite log prior for each model row.

Parameters:: targets (ndarray, shape (N, Q)) – Model targets array.
Returns:: logp (ndarray, shape (N,)) – Composite log prior per model.

class besta.grid.prob.ObservableCompositePrior(priors: Sequence[Prior], weights: Sequence[float] | None = None)[source]

Bases: ObservableDependentPrior

Composite prior combining Priors, including observable-dependent ones.

The total log prior is defined as a weighted sum of component log priors:

log p_total(model) = sum_i w_i * log p_i(model)

Components can be:

Plain Prior (target-only), evaluated as
p_i.log_prob_for_models(targets)
ObservableDependentPrior, evaluated as
p_i.log_prob_for_models(targets, observables=…)

Parameters:

priors (sequence of Prior) – List or tuple of prior instances to combine. Each must implement log_prob_for_models(…) following the Prior or ObservableDependentPrior API.
weights (sequence of float, optional) – Per-component multiplicative weights for the log-probabilities. If None, all weights are taken as 1.0. A zero weight effectively disables a component. Negative weights are allowed but should be used with care, as they correspond to dividing by p_i(model) in linear space.

Notes

Because this class inherits from ObservableDependentPrior, external code (e.g. GridFitter / posterior_over_models) should treat it as requiring observables and pass them when available.
If the composite contains any ObservableDependentPrior and observables is None, a ValueError is raised.

log_prob_for_models(targets: ndarray, observables: ndarray | None = None) → ndarray[source]

Evaluate the composite log prior for each model row.

Parameters:

targets (ndarray, shape (N, Q)) – Model targets array.
observables (ndarray, shape (N, P), optional) – Model observables. Required if any component is an ObservableDependentPrior. If such a prior is present and observables is None, a ValueError is raised.

Returns:

logp (ndarray, shape (N,)) – Composite log prior per model.

class besta.grid.prob.Likelihood[source]

Bases: ABC

Abstract likelihood interface p(x | model).

log_likelihood(x_native, sigma_native, X_models)[source]: Return log likelihood per candidate model row.

abstractmethod log_likelihood(x_native: ndarray, sigma_native: ndarray, X_models: ndarray) → ndarray[source]

Evaluate log likelihood for each model.

Parameters:

x_native (ndarray, shape (P,)) – Query observables.
sigma_native (ndarray, shape (P,)) – Measurement uncertainties for the query.
X_models (ndarray, shape (Nc, P)) – Candidate model observables.

Returns:

logL (ndarray, shape (Nc,))

class besta.grid.prob.GaussianProductLikelihood(bandwidth_floor: float = 0.0, scale: float = 1.0)[source]

Bases: Likelihood

Independent per-dimension Gaussian product likelihood.

The likelihood is proportional to the product over j of N(x_j | X_ij, h_j^2), where h_j is derived from sigma_native with an optional floor.

Parameters:

bandwidth_floor (float, optional) – Minimum bandwidth per dimension in native units. Default 0.0.
scale (float, optional) – Multiplicative scale applied to sigma_native. Default 1.0.

log_likelihood(x_native: ndarray, sigma_native: ndarray, X_models: ndarray) → ndarray[source]

Evaluate log likelihood for each model.

Parameters:

x_native (ndarray, shape (P,)) – Query observables.
sigma_native (ndarray, shape (P,)) – Measurement uncertainties for the query.
X_models (ndarray, shape (Nc, P)) – Candidate model observables.

Returns:

logL (ndarray, shape (Nc,))

class besta.grid.prob.CensoredSizeLikelihood(phot_indices: Sequence[int], size_index: int, s_min: float, bandwidth_floor: float = 0.0, scale: float = 1.0)[source]

Bases: Likelihood

Photometry-only Gaussian product with a left-censored size factor.

This is useful when the apparent size is below a reliability floor (e.g., PSF or measurement threshold). The photometric part is a Gaussian product over selected photometry indices. The size part adds a log CDF factor log Phi((s_min - s_model) / h_s), where s is log10(Re) and h_s is derived from sigma_native[size_index].

Parameters:

phot_indices (Sequence[int]) – Indices of observable columns to include in the Gaussian product (typically colours and anchor magnitude).
size_index (int) – Index of the size observable column (e.g., log10(Re)).
s_min (float) – Left-censoring threshold in the same units as the size observable.
bandwidth_floor (float, optional) – Minimum bandwidth per dimension in native units. Default 0.0.
scale (float, optional) – Multiplicative scale applied to sigma_native. Default 1.0.

log_likelihood(x_native: ndarray, sigma_native: ndarray, X_models: ndarray) → ndarray[source]

Evaluate log likelihood for each model.

Parameters:

x_native (ndarray, shape (P,)) – Query observables.
sigma_native (ndarray, shape (P,)) – Measurement uncertainties for the query.
X_models (ndarray, shape (Nc, P)) – Candidate model observables.

Returns:

logL (ndarray, shape (Nc,))

class besta.grid.prob.CompositeLikelihood(terms: List[Likelihood])[source]

Bases: Likelihood

Sum of multiple likelihood terms (log-likelihoods add).

Parameters:: terms (list of Likelihood) – Components to add together.

log_likelihood(x_native: ndarray, sigma_native: ndarray, X_models: ndarray) → ndarray[source]

Evaluate log likelihood for each model.

Parameters:

x_native (ndarray, shape (P,)) – Query observables.
sigma_native (ndarray, shape (P,)) – Measurement uncertainties for the query.
X_models (ndarray, shape (Nc, P)) – Candidate model observables.

Returns:

logL (ndarray, shape (Nc,))

besta.grid.prob.posterior_over_models(x_native: ndarray, sigma_native: ndarray, X_models: ndarray, targets_models: ndarray, likelihood: Likelihood, prior: Prior, model_weights: ndarray | None = None, prior_needs_observables: bool = False, observables_models: ndarray | None = None) → ndarray[source]

Compute normalised posterior weights over models.

Parameters:

x_native (ndarray, shape (P,)) – Query observables.
sigma_native (ndarray, shape (P,)) – Measurement uncertainties for the query.
X_models (ndarray, shape (Nc, P)) – Candidate model observables.
targets_models (ndarray, shape (Nc, Q)) – Candidate model targets.
likelihood (Likelihood) – Likelihood instance to evaluate log p(x | model).
prior (Prior) – Prior instance to evaluate log p(model).
model_weights (ndarray, shape (Nc,), optional) – Optional sampling weights for models; multiplies the posterior.
prior_needs_observables (bool, optional) – If True, the prior expects observables to evaluate (for p(z|m)).
observables_models (ndarray, shape (Nc, P), optional) – Candidate observables to pass to the prior if needed.

Returns:

w (ndarray, shape (Nc,)) – Normalised posterior weights over candidate models.

class besta.grid.prob.NumbaGaussianProductLikelihood(bandwidth_floor: float = 0.0, scale: float = 1.0, prefer_batch: bool = False)[source]

Bases: GaussianProductLikelihood

Diagonal Gaussian product likelihood accelerated with Numba.

log L_i = -0.5 * sum_j ((X_ij - x_j)/h_j)^2

log_likelihood(x_native, sigma_native, X_models)[source]

Evaluate log likelihood for each model.

Parameters:

x_native (ndarray, shape (P,)) – Query observables.
sigma_native (ndarray, shape (P,)) – Measurement uncertainties for the query.
X_models (ndarray, shape (Nc, P)) – Candidate model observables.

Returns:

logL (ndarray, shape (Nc,))

Lightweight transforms used by model grids and emulators.

class besta.grid.transforms.LinearStandardiser(mean: ndarray | None = None, sd: ndarray | None = None)[source]

Bases: object

Simple per-dimension standardisation: x_std = (x - mean) / sd.

Notes

Uses ddof=0 by default (population std), matching your ModelGrid.
If sd == 0, it is set to 1 to avoid division by zero.

class besta.grid.transforms.MagTransform(zero_point: float = 0.0, eps: float = 1e-30)[source]

Bases: object

Magnitude transform.

mag = -2.5 log10(flux) + zero_point flux = 10^((zero_point - mag)/2.5)

Abstract interfaces for generating besta.grid.grid.ModelGrid objects.

class besta.grid.generator.ModelGridGenerator[source]

Bases: ABC

Abstract interface for producing a ModelGrid from inputs.

Implementations may synthesise spectra, SEDs, photometry, or any observable set, and attach associated continuous targets such as SFR, sSFR, mass, metallicity, age, or redshift.

Subclasses should implement the generate method.

abstractmethod generate(**kwargs) → ModelGrid[source]

Build and return a ModelGrid.

Parameters:: **kwargs – Implementation-specific keyword arguments such as parameter ranges, library choices, and resolution settings.
Returns:: grid (besta.grid.grid.ModelGrid) – A populated model grid instance.

Model-grid emulator utilities.

class besta.grid.emulator.TransformPack(target_names: List[str], observable_names: List[str], target_standardiser: LinearStandardiser, y_standardiser: LinearStandardiser | None = None)[source]

Bases: object

Serializable inference-time transforms.

Standardises targets -> features X
Optionally standardises y (in whatever user-chosen y_space)

classmethod from_grid(grid: Any, *, y_space: str, standardize_y: bool = True, y_for_stats: ndarray | None = None, stats_mask: ndarray | None = None) → TransformPack[source]

Build transforms from a ModelGrid.

Notes

Fits target_standardiser on grid.targets (or mask).
Fits y_standardiser on y_for_stats if provided, else:
- if y_space == “flux”: uses grid.observables (or mask)
- if y_space == “mag” : you MUST pass y_for_stats (magnitude array) explicitly to avoid baking mag logic into ModelGrid/emulator.

class besta.grid.emulator.EmulatorConfig(y_space: str = 'mag', standardize_y: bool = True, predict_batch_size: int = 131072, version: str = 'EmulatorV5', error_model: str = 'zero', error_floor: float = 0.0)[source]

Bases: object

Configuration container for Emulator predictions and errors.

class besta.grid.emulator.Emulator(model: Any, transforms: TransformPack, config: EmulatorConfig | None = None)[source]

Bases: object

Generic emulator.

Parameters:

model (Any) – Trained regression model with a .predict(X) method.
transforms (TransformPack) – Transforms for targets -> X and model_space -> y_space.
config (EmulatorConfig, optional) – Emulator configuration.

predict(targets, return_model_space=False) → np.ndarray[source]: Predict observables in y_space for given targets.

predict_error(targets) → np.ndarray[source]: Predict errors in y_space for given targets.

predict_with_error(targets) → Tuple[np.ndarray, np.ndarray][source]: Predict observables and errors in y_space for given targets.

save(outdir, name='emulator', compress=('lz4', 3)) → Dict[str, str][source]: Save the emulator to disk.

load(outdir, name='emulator') → Emulator[source]: Load an emulator from disk.

class besta.grid.emulator.EnsembleEmulator(members: List[Emulator], error_floor: float = 0.0)[source]

Bases: object

Ensemble wrapper.

predict() -> mean prediction in y_space
predict_error() -> std across members in y_space (+ optional floor)

Model fitting

class besta.grid.grid.GridFitter(grid, likelihood: Likelihood | None = None, prior: Prior | None = None, use_standardised: bool = True)[source]

Bases: object

Bayesian fitter over a ModelGrid using optinal Prior and Likelihood.

The fitter computes posterior weights over candidate models by combining a user-provided likelihood p(x | model) with a prior p(model). It can then histogram targets to obtain marginal posteriors.

Parameters:

grid (ModelGrid) – Training model grid that defines the observable and target spaces.
likelihood (Likelihood, optional) – Likelihood instance. Default is GaussianProductLikelihood.
prior (Prior, optional) – Prior instance. Default is FlatPrior.
use_standardised (bool, optional) – If True and the ModelGrid supports standardisation, evaluation can be done in standardised space for internal transforms. The likelihood is still called with native units unless you adapt it. Default True.

Notes

Posterior over models is computed as:

\[\log w_i = \log p(x \mid \mathrm{model}_i) + \log p(\mathrm{model}_i) + \log w_i^{\mathrm{model}}\]

and then normalised to sum to one over the candidate set.

posterior_over_models(x_native: ndarray, sigma_native: ndarray, candidate_idx: ndarray | None = None) → ndarray[source]

Compute posterior weights over candidate models for one query.

This method evaluates log p(x | model) + log p(model) for a set of candidate models and returns the normalised weights. If use_standardised is True, observables are transformed to the grid’s standardised space for likelihood evaluation and the input uncertainties are mapped to that same space. Priors that depend on observables always receive native (non-standardised) observables.

Parameters:

x_native (ndarray, shape (P,)) – Query observables in native units and ordering matching grid.observable_names.
sigma_native (ndarray, shape (P,)) – Per-dimension measurement uncertainties for the query in native units. If use_standardised is True, these are internally divided by the grid standard deviations so they live in the same space as the standardised observables.
candidate_idx (ndarray of int, optional) – Indices of candidate models to consider. If None, all models in the grid are used.

Returns:

w (ndarray, shape (Nc,)) – Posterior weights over the candidate set, normalised to sum to 1.

Raises:

RuntimeError – If use_standardised is True and the grid standardiser has not been fitted (fit_standardiser must be called before).
ValueError – If any derived bandwidth (from sigma) is non positive or not finite, depending on the Likelihood implementation.

Notes

1. Consistent spaces: when use_standardised is True, the likelihood is evaluated with standardised observables and uncertainties (x_eval, X_eval, sigma_eval). When False, evaluation is done in native space. 2. Observable dependent priors: priors that require observables are passed native observables to avoid feeding z-scored magnitudes or colours into priors defined in native units (for example p(z | VIS)). 3. Model weights: if the grid has per-model sampling weights, they are multiplied into the posterior before normalisation.

See also

posterior_over_models_fn: Backend routine that combines likelihood and prior.

posterior_over_target(x_native: ndarray, sigma_native: ndarray, target_col: int | str, bins: ndarray, candidate_idx: ndarray | None = None) → Tuple[ndarray, ndarray][source]

Compute a histogrammed posterior over one target.

Parameters:

x_native (ndarray, shape (P,)) – Query observables in native units.
sigma_native (ndarray, shape (P,)) – Per-dimension uncertainties in native units.
target_col (int or str) – Column index or name in grid.targets.
bins (ndarray, shape (K+1,)) – Bin edges.
candidate_idx (ndarray of int, optional) – Candidate model indices.

Returns:

post (ndarray, shape (K,)) – Posterior over bins (sums to one).
centers (ndarray, shape (K,)) – Bin centers.

fit_batch(X_native: ndarray, SIG_native: ndarray, binner: Any | None = None, n_jobs: int = 1, backend: str = 'thread', stats_for: Sequence[int | str] | None = None, stats_bins: Sequence[ndarray] | None = None, find_multimodal: bool = False, use_kde_for_stats: bool = False, return_pdf_for_stats: bool = False, verbose: bool = True, max_memory_gb: float | None = 16.0, memcheck_sample: int = 256, safety_margin: float = 1.2, tasks_per_worker: int | None = None, dry_run: bool = False, posterior_keep_mass: float | None = 0.99, posterior_keep_min_candidates: int = 128, posterior_keep_max_candidates: int | None = None, posterior_keep_ties: bool = False, return_mode: str = 'iter', output_hdf5_path: str | None = None, output_hdf5_group: str = '/fit_batch', output_hdf5_overwrite: bool = False, output_hdf5_compression: str | None = 'gzip', output_hdf5_compression_opts: int = 4, output_hdf5_flush_every: int = 256, output_hdf5_write_only: bool = False) → Iterator[Dict[str, Any]] | List[Dict[str, Any]][source]

Evaluate model posteriors for multiple queries.

Workflow

The computation is split into three stages:

Step 1 — Candidate selection: For each query m, find a set of candidate model indices cands[m] using the optional binner. If binner is None or returns an empty set, fall back to the full grid. Work is submitted as slices of contiguous queries to a thread pool. Slice sizing is heuristic: it scales inversely with the observable dimension P so that each task has a reasonable amount of work without overscheduling.
Step 2 — Posterior over models: For each query m, compute posterior weights over cands[m] by combining likelihood and prior (plus per-model weights). If backend="process", a picklability check is performed and the code falls back to threads when needed. Slices are cost-balanced using a cheap proxy cost[m] ≃ len(cands[m]) * P so that each future does a similar amount of numerical work.
Step 3 — Per-target statistics: Optionally, for each requested target, histogram the posterior and compute summary stats (mean, std, map, central intervals, quantiles, modal count). This stage is NumPy-bound and relatively light, so it always uses a thread pool and reuses Step-2 slices (or a similarly sized partition).

Parallel task sizing

The number of tasks submitted to the executor is controlled by tasks_per_worker. By default, it mirrors the current behaviour (~ 5 tasks per worker). Larger values create more, smaller tasks (better load-balancing, higher scheduler overhead). Smaller values create fewer, larger tasks (lower overhead, potentially less balanced).

Memory safety

The previous explicit memory pre-check is currently disabled in this implementation. For large runs, prefer: - candidate selection via binner, - posterior truncation controls, - return_mode="iter", - HDF5 output with output_hdf5_write_only=True when appropriate.

Parameters:

X_native (ndarray, shape (M, P)) – Query observables in native units. Order must match the grid.
SIG_native (ndarray, shape (M, P)) – Per-dimension uncertainties in native units.
binner (object, optional) – Must expose .dims and .candidates(y_native, sigmas_native). All configuration (e.g. sigma/target-based selection) is handled internally by the binner itself.
n_jobs (int, optional) – Maximum concurrency (threads or processes, per backend).
backend ({“thread”,”process”}, optional) – Execution backend for Step 2 (posterior evaluation). Steps 1 and 3 always use threads. If objects are not picklable with “process”, we fall back to “thread”.
stats_for (sequence of {int,str}, optional) – Target columns (indices or names) to summarise. If None, Step 3 is skipped.
stats_bins (sequence of ndarray, optional) – Per-target bin edges. Must have same length as stats_for if provided.
find_multimodal (bool, optional) – If True, attempt to count posterior modes in each 1D histogram.
return_pdf_for_stats (bool, optional) – If True, return the full (M, K) discrete posterior for each requested target (increases memory).
verbose (bool, optional) – Print progress messages.
max_memory_gb (float or None, optional) – Maximum allowed additional memory. None disables the check.
memcheck_sample (int, optional) – Number of pilot queries for candidate-size estimation in the memory check (upper bound).
safety_margin (float, optional) – Multiplier applied to the memory estimate before comparing to the allowed/available memory.
tasks_per_worker (int or None, optional) – Controls how many tasks (slices) are queued per worker for each stage. Default None behaves like the previous version (~5). Increase to improve load balancing for heterogeneous costs; decrease to reduce scheduling overhead.
dry_run (bool, optional) – If True, perform only the memory check and return an empty dict.

returns:

iterator or list of dict – Controlled by return_mode: - "iter": returns an iterator yielding one result dict per query. - "list": returns a list with one result dict per query.

Each result dict contains: - "m": query index - "candidates": kept candidate indices - "post_models": posterior weights over kept candidates - "level": binner level (or None) - "truncation": truncation metadata - optional "stats": per-target summary stats - optional "posts_target": per-target posterior over bins

Notes

When use_standardised=True, likelihoods are evaluated in the grid’s standardised space, but observable-dependent priors receive native observables. If SIG_native contains zeros/near-zeros, pre-clip or configure a positive floor in your Likelihood to avoid degenerate bandwidths.

corner_for_targets(x_native: ndarray, sigma_native: ndarray, target_cols: Sequence[int | str], true_target_vals: ndarray | None = None, candidate_idx: ndarray | None = None, bins: int | Sequence[int | ndarray] = 50, figsize: Tuple[float, float] | None = None, suptitle: str | None = None, color: str | None = None, kappa_sigma_edges: float = 5.0, alpha_hist: float = 0.6, alpha_mesh: float = 1.0, quantiles: Tuple[float, float, float] = (0.16, 0.5, 0.84)) → Tuple[Figure, ndarray, Dict[str, Any]][source]

Corner plot for selected targets using model posterior weights.

Parameters:

x_native (ndarray, shape (P,))
sigma_native (ndarray, shape (P,))
target_cols (sequence of int or str) – Target columns to include.
true_target_vals (ndarray of float, optional) – True target values.
candidate_idx (ndarray of int, optional)
bins (int or sequence) – Common bin count, or per-dimension specs (int or edges).
figsize (tuple, optional)
suptitle (str, optional)
color (str, optional)
alpha_hist (float, optional)
alpha_mesh (float, optional)
quantiles (tuple, optional)

Returns:

fig, axes, summary (Figure, Axes array, dict)

class besta.grid.grid.GridFitHDF5Writer(path: str, group: str, *, M: int, P: int, configuration: Dict[str, ~typing.Any] | None=None, stats_specs: Sequence[GridFitStatsSpec] | None = None, return_pdf_for_stats: bool = False, overwrite: bool = False, compression: str | None = 'gzip', compression_opts: int = 4, flush_every: int = 256, dtype_posts_target=<class 'numpy.float32'>, dtype_post_models=<class 'numpy.float64'>)[source]

Bases: object

HDF5 writer for GridFitter.fit_batch streaming outputs.

Layout written under <group>:

/configuration           attrs['json'] with a JSON blob
/index/m                 (M,) int64  [0..M-1]
/index/status            (M,) int8   [0=not written, 1=ok, -1=failed]
/truncation/*            (M,) scalar datasets (level, n_before, n_after, ...)
/candidates/candidates   (M,) vlen int64
/candidates/post_models  (M,) vlen float64
/stats/<key>/*           per-target scalar arrays (M,) and optional posts_target (M,K)

Notes

Writes are done by index m in each result dict, so results may arrive unsorted.
Designed for use from the main thread/process: collect worker results, then call writer.write(r) as they arrive.

write(r: Dict[str, Any]) → None[source]

Write one result dict produced by the streaming fit_batch.

Required keys:

‘m’ (int)
‘level’ (int or None)
‘candidates’ (ndarray int)
‘post_models’ (ndarray float, sums to 1)
‘truncation’ (dict with n_before, n_after, mass_kept, mass_dropped, cut_weight, ess_before, ess_after)

Optional:

‘stats’ : dict keyed by target name
‘posts_target’ : dict keyed by target name with (K,) posterior arrays (if enabled)

mark_failed(m: int) → None[source]: Mark one object as failed (status=-1). Useful if you catch exceptions externally.

_write_stats_scalars(m: int, key: str, st: Dict[str, Any]) → None[source]: Write scalar stats for one target. Missing targets are ignored (allows partial stats).

_write_posts_target(m: int, key: str, post: Any) → None[source]: Write the discrete posterior over bins for one target (shape (K,)). If the dataset doesn’t exist or shapes mismatch, raises ValueError.

_write_stat_cov(m: int, st: Any) → None[source]

Write the covariance matrix for object m.

Expected input formats

st can be:

a dict with key “cov” holding an array-like (D, D), OR
directly an array-like (D, D)

Where D == len(self._stats_specs) and matches /stats/cov/keys ordering.

If the covariance dataset is not present (e.g. no stats_specs), this is a no-op.

Postprocessing

Post-processing utilities for BESTA results.

This module provides: - I/O for CosmoSIS-like tabular results. - Weighted posterior summaries (MAP, mean, covariance, correlation). - Weighted quantiles and (optionally multi-modal) HDI intervals. - 1D PDFs (histogram-derived + optional KDE). - 2D PDFs (KDE fallback to histogram) + HPD enclosed-fraction maps. - A structured ResultsSummary object with FITS + JSON export helpers.

Notes

This module assumes the posterior samples are stored in an Astropy Table with a log-posterior column (default: “post”) and parameter columns named like “section–name” (default delimiter/prefix: “–“).

besta.postprocess._as_float_array(x) → ndarray[source]: Conversion to float ndarray (handles Astropy Column/Quantity).

besta.postprocess.normalize_weights(weights: ndarray, *, allow_all_zero: bool = False) → ndarray[source]: Normalize weights to sum=1 over finite entries.

besta.postprocess.weighted_mean(x: ndarray, weights: ndarray, axis: int = -1) → ndarray[source]: Weighted mean along axis.

besta.postprocess.weighted_covariance(x: ndarray, weights: ndarray, *, unbiased: bool = False) → ndarray[source]

Weighted covariance for x with shape (D, N) and weights with shape (N,).

If unbiased=True, applies the common correction for normalized weights:: cov /= (1 - sum(w^2))

besta.postprocess.covariance_to_correlation(cov: ndarray) → ndarray[source]: Convert covariance matrix to correlation matrix.

besta.postprocess.weighted_quantile(x: ndarray, weights: ndarray, q: Sequence[float]) → ndarray[source]

Weighted quantiles for 1D sample x with weights.

Parameters:

x (array-like, shape (N,))
weights (array-like, shape (N,))
q (sequence of quantiles in [0, 1])

Returns:

quantiles (ndarray shape (len(q),))

besta.postprocess.weighted_hdi(x: ndarray, weights: ndarray, *, mass: float = 0.68, max_intervals: int = 2, merge_tol: float = 0.0) → List[Tuple[float, float]][source]

Compute weighted highest-density interval(s) from 1D samples.

The output approximates the smallest region containing mass probability, allowing multi-modality via disjoint intervals.

Method

Sort samples by x.
Use cumulative weighted mass.
Find the narrowest interval(s) with enclosed mass >= mass.
Optionally repeat to recover additional disjoint intervals.

Parameters:

x, weights (arrays of shape (N,))
mass (target probability mass in (0,1])
max_intervals (maximum number of disjoint intervals to return (approximate))
merge_tol (merge intervals whose gap is <= merge_tol (in x units))

returns:

intervals (list of (low, high))

besta.postprocess.enclosed_fraction_map(density: ndarray, *, xedges: ndarray | None = None, yedges: ndarray | None = None) → ndarray[source]

Compute an HPD-style enclosed fraction map from a 2D density grid.

Returns an array F with the same shape as density where each pixel stores the cumulative enclosed mass at that density threshold (after sorting pixels by density in descending order).

If xedges and yedges are provided, pixel areas are included when computing mass; otherwise constant pixel area is assumed.

besta.postprocess.histogram_pdf_1d(x: ndarray, weights: ndarray, *, bins: int = 100, range: Tuple[float, float] | None = None) → Tuple[ndarray, ndarray][source]

1D PDF from weighted histogram.

Returns:

centers ((bins,))
pdf ((bins,), normalized to integrate to 1 over x)

besta.postprocess.kde_pdf_1d(x: ndarray, weights: ndarray, grid: ndarray) → ndarray[source]: KDE PDF on a provided grid; returns NaNs on failure.

besta.postprocess.kde_or_hist_pdf_2d(x: ndarray, y: ndarray, weights: ndarray, *, bins: int = 80, range: Tuple[Tuple[float, float], Tuple[float, float]] | None = None, use_kde: bool = True) → Tuple[ndarray, ndarray, ndarray][source]

2D PDF on a regular grid. Returns (x_centers, y_centers, pdf[y,x]).

Uses KDE if possible (and use_kde), otherwise falls back to weighted histogram2d.

besta.postprocess.read_results_file(path: str, *, delimiter: str = '\t') → Table[source]

Read a CosmoSIS-style text results file: - First line is a commented header starting with ‘#’ - Columns are delimiter-separated - Remaining lines numeric

Returns an Astropy Table with lowercase column names.

besta.postprocess._split_param_key(key: str, prefix: str = '--') → Tuple[str, str][source]

Split a parameter key into (section, name) using the delimiter/prefix.

If the key cannot be split, returns (“”, key).

besta.postprocess.effective_sample_size(weights: ndarray) → float[source]: Return the standard importance-sampling effective sample size.

besta.postprocess.laplace_logz(loglike_map: float, logprior_map: float, cov: ndarray) → EvidenceEstimate[source]: Estimate log-evidence with a Laplace approximation around the MAP point.

besta.postprocess.harmonic_mean_logz(loglike: ndarray, weights: ndarray, *, trim_frac: float = 0.01) → EvidenceEstimate[source]

Robust harmonic-mean estimator.

Uses log Z = -log(E_posterior[exp(-logL)]).

This estimator is generally unstable; trimming helps, but it should still be treated as a sanity check only.

class besta.postprocess.EvidenceEstimate(method: str, logz: float, logz_err: float | None = None, details: Dict[str, Any] = None)[source]

Bases: object

Container for a scalar evidence estimate and method-specific metadata.

class besta.postprocess.ResultsSummary(parameter_keys: ~typing.List[str], posterior_key: str = 'post', parameter_sections: ~typing.List[str] = <factory>, parameter_names: ~typing.List[str] = <factory>, n_samples: int = 0, map_index: int = -1, weights: ~numpy.ndarray = <factory>, logpost: ~numpy.ndarray = <factory>, samples: ~numpy.ndarray = <factory>, map: ~numpy.ndarray = <factory>, mean: ~numpy.ndarray = <factory>, covariance: ~numpy.ndarray = <factory>, correlation: ~numpy.ndarray = <factory>, percentiles: ~typing.List[float] = <factory>, percentiles_values: ~numpy.ndarray = <factory>, percentiles_logpost: ~numpy.ndarray = <factory>, hdi_mass: float = 0.68, hdi_intervals: ~typing.Dict[str, ~typing.List[~typing.Tuple[float, float]]] = <factory>, evidence: ~besta.postprocess.EvidenceEstimate | None = None, pdf_1d: ~typing.Dict[str, ~typing.Dict[str, ~numpy.ndarray]] = <factory>, pdf_2d: ~typing.Dict[~typing.Tuple[str, str], ~typing.Dict[str, ~numpy.ndarray]] = <factory>, extra_info: ~typing.Dict[str, ~typing.Any] = <factory>)[source]

Bases: object

Container for posterior summary products.

parameter_keys

Type:: list of full parameter keys (e.g. “sfh–tau”)

parameter_names

Type:: list of short names (e.g. “tau”)

parameter_sections

Type:: list of sections (e.g. “sfh”)

posterior_key

Type:: log-posterior column name used

map_index

Type:: index of MAP (maximum log-posterior) sample among the filtered samples

n_samples

Type:: number of samples used after filtering

weights

Type:: normalized posterior weights, shape (n_samples,)

logpost

Type:: log posterior values, shape (n_samples,)

samples

Type:: array shape (n_params, n_samples)

map

Type:: vector shape (n_params,)

mean

Type:: vector shape (n_params,)

covariance

Type:: matrix shape (n_params, n_params)

correlation

Type:: matrix shape (n_params, n_params)

percentiles

Type:: list of quantiles in [0,1]

percentiles_values

Type:: array shape (n_params, n_percentiles)

percentiles_logpost

Type:: array shape (n_params, n_percentiles)

hdi_mass

Type:: mass used for HDI

hdi_intervals

Type:: dict short_name -> list of (low, high)

pdf_1d

Type:: dict short_name -> dict with keys: grid, hist_pdf, kde_pdf

pdf_2d

Type:: dict (short_i, short_j) -> dict with keys: xgrid, ygrid, pdf, enclosed_fraction

extra_info

Type:: arbitrary metadata to include in exports

estimate_evidence_from_table(table: Table, *, logpost_key: str | None = None, logprior_key: str = 'prior', loglike_key: str | None = None, method: str = 'laplace', hme_trim_frac: float = 0.01, parameter_prefix: str = '--') → EvidenceEstimate[source]

Estimate evidence logZ using information in the original results table.

Supported methods are "laplace" (MAP loglike/logprior + covariance) and "hme_robust" (posterior expectation of 1/L, sanity-check).

If loglike_key is not available, log-likelihood is reconstructed as loglike = post - prior.

to_json_dict() → Dict[str, Any][source]: Convert summary to a JSON-serializable dictionary (numpy arrays -> lists).

write_json(path: str, *, overwrite: bool = True, indent: int = 2) → str[source]: Write a JSON summary file.

to_fits() → HDUList[source]: Build a FITS HDUList with: - PrimaryHDU: extra_info - ImageHDU: COVARIANCE (+ header with means/MAP) - ImageHDU: CORRELATION - BinTableHDU: PERCENTILES - BinTableHDU: PDF1D - ImageHDUs: PDF2D_<i>__<j> and ENCFRAC_<i>__<j>

write_fits(path: str, *, overwrite: bool = True) → str[source]: Write FITS summary file.

plot_1d_pdfs(outdir: str, *, show: bool = False, dpi: int = 200) → List[str][source]: Save 1D PDF plots for all parameters.

plot_2d_pdfs(outdir: str, *, levels: Sequence[float] = (0.68, 0.95), show: bool = False, dpi: int = 200) → List[str][source]: Save 2D PDF plots with HPD enclosed-fraction contours.

corner_plot(outpath: str, *, bins: int = 50, max_points: int = 20000, dpi: int = 200, show: bool = False) → str[source]

Lightweight corner plot without external dependencies.

Diagonal: 1D hist PDFs (weighted) Off-diagonal: scatter of (subsampled) points colored by weight rank (simple)

For serious usage, consider adding an optional dependency later (corner/arviz), but this is a decent built-in baseline.

besta.postprocess.summarize_results(table: Table, *, output_fits: str | None = None, output_json: str | None = None, parameter_prefix: str = '--', posterior_key: str = 'post', parameter_keys: Sequence[str] | None = None, percentiles: Sequence[float] = (0.05, 0.16, 0.5, 0.84, 0.95), hdi_mass: float = 0.68, compute_1d: bool = True, compute_2d: bool = False, parameter_key_pairs: Sequence[Tuple[str, str]] | None = None, pdf_bins_1d: int = 100, pdf_bins_2d: int = 80, estimate_evidence: bool = False, evidence_method: str = 'laplace', logprior_key: str = 'prior', loglike_key: str | None = None, hme_trim_frac: float = 0.01, kde_1d: bool = True, kde_2d: bool = True, extra_info: Dict[str, Any] | None = None, verbose: bool = False) → ResultsSummary[source]

Summarize posterior results from a samples table into a ResultsSummary.

Parameters:

table (astropy.table.Table) – Input results table.
output_fits (str, optional) – If provided, writes a FITS file with summary products.
output_json (str, optional) – If provided, writes a JSON file with summary products.
parameter_prefix (str) – Delimiter/prefix used in parameter names (default: “–“).
posterior_key (str) – Name of the log-posterior column (default: “post”).
parameter_keys (list of str, optional) – Explicit list of parameter columns to use. If None, selects columns containing parameter_prefix.
percentiles (sequence of float) – Quantiles in [0,1].
hdi_mass (float) – Target mass for HDI intervals.
compute_1d / compute_2d (bool) – Enable 1D/2D PDF products.
parameter_key_pairs (list of (key1, key2), required if compute_2d=True)
pdf_bins_1d / pdf_bins_2d (int) – Number of bins/grid size for PDFs.
kde_1d / kde_2d (bool) – Prefer KDE; fallback to histogram if KDE fails.
extra_info (dict) – Arbitrary metadata included in exports.
verbose (bool) – Print progress.

Returns:

ResultsSummary

besta.postprocess.summarize_results_file(results_path: str, *, output_fits: str | None = None, output_json: str | None = None, delimiter: str = '\t', **kwargs) → ResultsSummary[source]: Read a results file and summarize it (passes kwargs to summarize_results).

besta.postprocess.compute_chain_percentiles(chain_results: Mapping[str, Any], *, pct: Sequence[float] = (0.16, 0.5, 0.84), weight_key: str = 'weight', parameter_prefix: str = '--') → Dict[str, ndarray][source]

Compute weighted quantiles for chain-like results dict.

Expects chain_results[param] arrays and chain_results[weight_key] weights.

besta.postprocess.make_plot_chains(chain_results: Mapping[str, Any], *, truth_values: Mapping[str, float] | None = None, weight_key: str = 'weight', parameter_prefix: str = '--', outdir: str | None = None, show: bool = False, dpi: int = 200) → List[str][source]

Plot simple trace + weighted histogram per parameter.

Parameters:

chain_results (dict-like) – Contains arrays for parameters and weight_key.
truth_values (dict, optional) – Mapping parameter name -> truth value.
outdir (str, optional) – If provided, saves PNGs to outdir and returns file paths. If None, returns empty list and only shows/creates figures.

besta.postprocess.weighted_quantiles(x: ndarray, w: ndarray, qs: Sequence[float]) → ndarray[source]: Compute weighted quantiles for a one-dimensional sample.

besta.postprocess.pdf_stats(edges: ndarray, pdf: ndarray, quantiles=None, find_multimodal: bool = False) → dict[source]

Compute summary stats from a discrete pdf over bin centers.

Parameters:

edges (ndarray, shape (K,)) – Bin edges.
pdf (ndarray, shape (K,)) – PDF defined by the edges.
quantiles (tuple of float, optional) – Quantiles to report (default 16, 50, 84 percent).
find_multimodal (bool, optional) – If True, return rough modes by local-maximum search.

Returns:

stats (dict) – Keys: mean, std, map, q, lo68, hi68, modes (optional).

besta.postprocess.pit_from_discrete_posterior(z_true: ndarray, posts: ndarray, z_edges: ndarray) → ndarray[source]

Probability Integral Transform for discrete posteriors on bins.

Assumes uniform density within each bin for within-bin interpolation.

Parameters:

z_true (ndarray, shape (N,)) – True values.
posts (ndarray, shape (N, K)) – Row-normalised posteriors over K bins.
z_edges (ndarray, shape (K+1,)) – Bin edges.

Returns:

pit (ndarray, shape (N,)) – PIT values in [0, 1].

besta.postprocess.photoz_metrics(z_true: ndarray, z_est: ndarray) → dict[source]

Standard photo-z metrics using delta z over 1+z.

Returns:: out (dict) – Keys: bias, nmad, outlier, rmse.

besta.postprocess.plot_chains(table, truth_values=None, output_dir=None, posterior_key='post')[source]

Make trace plots from an astropy Table containing chain results.

Parameters:

table (Table) – Astropy Table containing the chain results.
truth_values (list of float, optional) – True values for the parameters (used for plotting).
output_dir (str, optional) – Directory to save the output plots.
posterior_key (str, optional) – Key to use for the posterior samples in the table.

Returns:

all_figs (list of Figure) – List of all generated figures.