aimz.ImpactModel.predict

ImpactModel.predict(X, *, intervention=None, rng_key=None, in_sample=True, return_sites=None, shard_axis='obs', batch_size=None, output_dir=None, progress=True, **kwargs)

Predict the output based on the fitted model.

This method performs posterior predictive sampling to generate model-based predictions. It is optimized for batch processing of large input data and is not recommended for use in loops that process only a few inputs at a time. Results are written to disk in the Zarr format, with sampling and file writing decoupled and executed concurrently.
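The decoupling of sampling and writing described above can be pictured with a small producer/consumer sketch. This is not the library's implementation: `batch_slices`, `predict_in_batches`, `sample_fn`, and `write_fn` are hypothetical names, and plain Python lists stand in for arrays and Zarr chunks. It only illustrates why one large batched call can overlap computation with disk I/O, whereas many tiny calls cannot.

```python
import queue
import threading


def batch_slices(n_obs, batch_size):
    """Yield (start, stop) index pairs covering range(n_obs) in order."""
    for start in range(0, n_obs, batch_size):
        yield start, min(start + batch_size, n_obs)


def predict_in_batches(data, batch_size, sample_fn, write_fn, max_pending=2):
    """Run sample_fn on each batch while a background thread calls write_fn."""
    # Bounded queue: sampling cannot run arbitrarily far ahead of writing.
    pending = queue.Queue(maxsize=max_pending)
    errors = []

    def writer():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more batches
                break
            start, result = item
            try:
                write_fn(start, result)
            except Exception as exc:
                # Record the error but keep draining so the producer never blocks.
                errors.append(exc)

    thread = threading.Thread(target=writer)
    thread.start()
    try:
        for start, stop in batch_slices(len(data), batch_size):
            pending.put((start, sample_fn(data[start:stop])))
    finally:
        pending.put(None)
        thread.join()
    if errors:
        raise errors[0]
```

In this sketch the main thread computes the next batch while the writer thread is still persisting the previous one, which is the general pattern the paragraph above describes.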

Parameters:

X (ArrayLike | ArrayLoader) – Input data. If array-like, the leading axis is the observation axis. Alternatively, a data loader that holds all array-like objects and handles batching internally.
intervention (dict | None) – A dictionary mapping sample sites to their corresponding intervention values. Interventions enable counterfactual analysis by modifying the specified sample sites during prediction (posterior predictive sampling).
rng_key (Array | None) – A pseudo-random number generator key. By default, an internal key is used and split as needed.
in_sample (bool) – Specifies the group where posterior predictive samples are stored in the returned output. If True, samples are stored in the posterior_predictive group, indicating they were generated based on data used during model fitting. If False, samples are stored in the predictions group, indicating they were generated based on out-of-sample data.
return_sites (str | Iterable[str] | None) – Names of variables (sites) to return. If None, the param_output site and the deterministic sites are sampled.
shard_axis (Literal['obs', 'draw']) – Multi-device sharding strategy; no effect on a single device. "obs" (default) shards the input across devices and replicates the posterior. "draw" shards the posterior across devices and replicates the input, which must be an array, not a data loader. If the model has no posterior samples, the data path is used regardless of shard_axis.
batch_size (int | None) – Size of each batch, taken from the input under shard_axis="obs" and from the draws under shard_axis="draw". Also used as the chunk size when storing results. If None, it is determined automatically from the input size and number of samples. Ignored if X is a data loader, in which case the data loader is expected to handle batching internally.
output_dir (str | Path | None) – The directory where the outputs will be saved. If the specified directory does not exist, it will be created automatically. If None, a model-owned temporary directory is used. A subdirectory is generated within this directory to store the outputs; its path is recorded in the returned tree’s artifact_path attribute (on the root and the group node). The temporary directory is removed by cleanup(), cleanup_models(), or when the model is garbage-collected. Pass an explicit output_dir to keep results beyond the model’s lifetime.
progress (bool) – Whether to display a progress bar.
**kwargs (object) – Additional arguments passed to the model.
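As a rough illustration of how the parameters above combine, the following sketch compares a baseline prediction with a counterfactual one on out-of-sample data. The helper `predict_counterfactual` is hypothetical, `model` is assumed to be an already fitted `ImpactModel`, and `site` must be the name of a sample site in your own model. Only keyword arguments listed in the signature are used.

```python
from pathlib import Path


def predict_counterfactual(model, X, site, value, output_dir):
    """Return (baseline, counterfactual) predictions for out-of-sample X.

    `model` is assumed to expose the `predict` signature documented here.
    """
    output_dir = Path(output_dir)
    # in_sample=False stores samples in the `predictions` group;
    # an explicit output_dir keeps results beyond the model's lifetime.
    common = {"in_sample": False, "output_dir": output_dir, "progress": False}
    baseline = model.predict(X, **common)
    # The intervention overrides the chosen sample site during prediction.
    counterfactual = model.predict(X, intervention={site: value}, **common)
    return baseline, counterfactual
```

Both calls share one `output_dir`. According to the description of that parameter, each call generates its own subdirectory there, and its path is recorded in the returned tree's `artifact_path` attribute.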

Returns:

Posterior predictive samples. Posterior samples are included if available.

Raises:

TypeError – If param_output is passed as an argument, or shard_axis="draw" is used with a data loader X.
ValueError – If shard_axis is not "obs" or "draw".
NotImplementedError – If a return site’s axis-1 size does not match the input batch size (shard_axis="obs" only).
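Callers who want to fail fast before starting an expensive prediction could mirror the documented `TypeError` and `ValueError` conditions in a small pre-flight check. The function below is a hypothetical sketch, not part of the library; the library's own check order and error messages may differ. The `NotImplementedError` case is omitted because it depends on the shapes the model produces.

```python
def check_predict_args(shard_axis="obs", x_is_loader=False, **kwargs):
    """Raise early for argument combinations that predict() documents as errors."""
    if "param_output" in kwargs:
        raise TypeError("param_output must not be passed as an argument")
    if shard_axis not in ("obs", "draw"):
        raise ValueError(f"shard_axis must be 'obs' or 'draw', got {shard_axis!r}")
    if shard_axis == "draw" and x_is_loader:
        raise TypeError("shard_axis='draw' requires an array X, not a data loader")
```

For example, `check_predict_args(shard_axis="draw", x_is_loader=True)` raises `TypeError`, while `check_predict_args(shard_axis="obs", x_is_loader=True)` returns without error.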

Return type:

xr.DataTree