Changelog#

All notable changes to this project will be documented in this file and are best viewed on the Changelog page.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

v0.13.0 - 2026-07-03#

Added#

predict(), log_likelihood(), sample_prior_predictive(), and sample_posterior_predictive() now accept a shard_axis argument selecting the multi-device sharding strategy. The default "obs" shards the observation axis of the input across devices and replicates the posterior (the previous behavior). The new "draw" shards the drawn samples across devices in chunks of batch_size draws while holding the whole input resident. On a single device the argument has no effect, because there is nothing to shard across (#224).

Changed#

predict(), log_likelihood(), and sample_prior_predictive() now place the conditioning samples on devices once instead of re-transferring and re-replicating them on every batch. For the data-parallel path of predict() and log_likelihood() the placed posterior is cached across calls (keyed by device sharding and rebuilt whenever the posterior is replaced); sample_prior_predictive() places its per-call prior samples once before the batch loop. This is a no-op on a single device and affects performance only, not results (#219).
When an observation-aligned posterior sample shape is detected as incompatible with the default shard_axis="obs", predict() now warns and reruns under the new shard_axis="draw" scheme instead of falling back to the in-memory predict_on_batch(), keeping results streamed to disk and memory-bounded. log_likelihood(), which previously failed with a raw shape-broadcasting error for such posteriors, behaves the same way (#224).
The auto-computed batch_size budget is now denominated in bytes rather than a fixed element count. The element cap is derived at call time by dividing the byte budget by the output dtype’s item size (#227).
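The byte-denominated budget in #227 comes down to one integer division at call time. The following is a minimal sketch, not the aimz implementation: the 1 GiB budget and the helper names element_cap and auto_batch_size are hypothetical, and the real rounding rules may differ.

```python
import numpy as np

# Hypothetical byte budget; the value aimz actually uses is not stated here.
BYTE_BUDGET = 2**30  # 1 GiB


def element_cap(output_dtype, byte_budget=BYTE_BUDGET):
    """Return how many output elements of this dtype fit in the byte budget."""
    itemsize = np.dtype(output_dtype).itemsize
    return byte_budget // itemsize


def auto_batch_size(num_samples, output_dtype, byte_budget=BYTE_BUDGET):
    """Pick a batch size so num_samples * batch_size elements stay in budget."""
    return max(1, element_cap(output_dtype, byte_budget) // num_samples)
```

Under these assumptions a float64 output gets half the element cap of a float32 output, which a fixed element count cannot express.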

Fixed#

Keyword arguments to the disk-backed methods whose values are strings or mappings are no longer mistaken for array inputs. Previously such a value was routed into the observation-axis batching pipeline and raised a misleading leading-axis error (#229).
Requesting an unknown return_sites name from predictive methods now raises a ValueError naming the unknown site(s), instead of silently returning an empty or partially populated result (#231).
estimate_effect() now raises a ValueError when the selected predictive group is missing from the baseline output, instead of a KeyError from the underlying subtraction (#233).
Interrupting a disk-backed run (predict(), log_likelihood(), sample_prior_predictive()) with Ctrl-C during the write phase no longer hangs; the interrupt now exits and the partially-written output directory is removed (#235).
On multi-device hosts, calling a disk-backed method again with a different number of array or keyword arguments no longer fails with a sharding structure error (#237).
predict() / predict_on_batch() interventions (and estimate_effect()) no longer reuse one likelihood-noise draw across all posterior draws, fixing under-dispersed intervention predictions and intervals (#239).
estimate_effect() now warns when the baseline and intervention scenarios have different dimension sizes, instead of silently computing the effect on their overlap (#241).
A background writer-thread failure while streaming results to disk now raises instead of being swallowed; previously disk-backed methods could return a silently zero-filled or truncated result (#243).
sample_prior_predictive() under the default shard_axis="obs" now forwards non-array keyword arguments to the model when drawing prior samples; previously they were dropped, so those samples used the kernel’s default values (silently wrong) or raised a TypeError for a required argument (#245).
The disk-backed methods now work with an array input when a custom param_input/param_output is set; previously the streamed dataset was keyed by the literal names X/y (#247).
A user-built ArrayLoader whose dataset carries array fields beyond the model input/output now works with the disk-backed methods, binding each field to the kernel parameter of the same name; previously such a loader failed during tracing with an error, and array arguments were matched by position rather than name (#249).
log_likelihood() now raises a ValueError when called with an array X but no y, instead of a KeyError: 'y' (#254).
The disk-backed methods now raise a clear TypeError when X is neither an array-like nor a data loader (e.g. a Python list), instead of an UnboundLocalError from an incomplete internal type check (#256).
A disk-backed method that fails before the write phase no longer leaves an orphaned subdirectory behind; the partially-created subdirectory is removed and the original error is re-raised (#258).
A model restored from disk with pickle/cloudpickle (e.g. through the mlflow integration) is now re-registered for class-level cleanup, so cleanup_models() removes its temporary directory; previously only directly constructed models were tracked (#260).
Saving an aimz model with an input example (via save_model(), log_model(), or autolog()) no longer bakes an ephemeral temporary output_dir and a fixed batch_size into the logged model signature; MLflow replayed those machine-local defaults on reload/serving, which broke prediction on a different machine and forced tiny batches (#262).
autolog() no longer crashes when an rng_key is passed to fit() / fit_on_batch(). Previously the typed PRNG key was routed into the logged input example and raised, so MLflow silently skipped logging the model artifact. It also no longer leaks the observed label into the logged input example and signature when a custom param_output is used (#262).
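The under-dispersion fixed in #239 can be illustrated without aimz. The sketch below is a plain NumPy stand-in with made-up values, not the library's code path: it compares one noise value shared across all posterior draws with one independent noise value per draw.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws, sigma = 10_000, 1.0

# Stand-in posterior draws of a predictive mean (hypothetical values).
mu = rng.normal(loc=0.0, scale=1.0, size=n_draws)

# Buggy pattern: a single noise value reused for every posterior draw.
shared_noise = rng.normal(scale=sigma)
y_shared = mu + shared_noise

# Fixed pattern: an independent noise value for each posterior draw.
y_independent = mu + rng.normal(scale=sigma, size=n_draws)

# Sharing the noise only shifts all draws, so their spread ignores sigma.
var_shared = y_shared.var()
var_independent = y_independent.var()
```

With shared noise the predictive variance equals the variance of the means alone, so intervals built from those draws come out too narrow.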

v0.12.0 - 2026-05-23#

Changed#

Input arrays of any shape with at least one dimension are now accepted; the leading axis is treated as the sample axis. Previously, X was required to be 2D and y 1D, and y with shape (n, 1) triggered a DataConversionWarning from scikit-learn (#199).
log_likelihood() now evaluates the kernel directly at each posterior draw, mirroring the per-draw pattern used by predictive sampling. The seeded kernel is also constructed once before the per-batch loop, so each batch reuses the same cached compilation (#202).
Writer-thread queue sizing used when streaming batched outputs now adapts to available host memory and the per-batch output size (#208).
Disk-backed methods (predict(), sample_prior_predictive(), log_likelihood()) now preallocate each site’s Zarr array and write each batch into a fixed slice, replacing the previous per-batch append. This avoids repeated Zarr resizing and is faster when the batch size is small and the number of batches is large. As a consequence, every return site must emit an axis-1 size equal to the input batch size; kernels with incompatible return sites raise NotImplementedError (#213).
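The preallocate-and-slice pattern from #213 can be sketched with a NumPy array standing in for a Zarr array. The function name and the (draws, observations) layout are assumptions made for illustration; only the axis-1 alignment requirement comes from the entry above.

```python
import numpy as np


def write_preallocated(batches, n_draws, n_obs, dtype="float32"):
    """Write (n_draws, batch) blocks into fixed slices of one preallocated array."""
    out = np.empty((n_draws, n_obs), dtype=dtype)  # allocated once, never resized
    start = 0
    for batch in batches:
        stop = start + batch.shape[1]
        # A block that does not line up with the input batch cannot be placed.
        if batch.shape[0] != n_draws or stop > n_obs:
            raise NotImplementedError("return site does not align with the input batch")
        out[:, start:stop] = batch
        start = stop
    return out


# Two batches of three observations each, four draws per observation.
batches = [np.full((4, 3), i, dtype="float32") for i in range(2)]
result = write_preallocated(batches, n_draws=4, n_obs=6)
```

Because every slice is known up front, the destination never needs to grow, which is where the saving over a per-batch append comes from.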

Fixed#

Writer-thread startup errors while opening Zarr output groups are now reported through the existing writer error path and queued items are drained before shutdown, preventing the main thread from waiting indefinitely when a background writer fails before consuming its queue (#210).
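The failure mode in #210 is a general producer/consumer hazard: if the consumer dies before reading, a bounded queue fills and the producer blocks forever. The sketch below shows one way to avoid it using only the standard library. The Writer class and its open_fn argument are hypothetical and do not reflect the aimz internals.

```python
import queue
import threading


class Writer:
    """Background writer that records failures and keeps draining its queue."""

    def __init__(self, open_fn, maxsize=2):
        self.queue = queue.Queue(maxsize=maxsize)
        self.error = None
        self._open_fn = open_fn
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        try:
            sink = self._open_fn()  # may fail before any item is consumed
        except Exception as exc:
            self.error = exc
            sink = None
        while True:
            item = self.queue.get()
            try:
                if item is None:  # sentinel: stop the thread
                    return
                if self.error is None:
                    sink(item)
            except Exception as exc:
                self.error = exc  # keep consuming so the producer never blocks
            finally:
                self.queue.task_done()

    def put(self, item):
        if self.error is not None:
            raise RuntimeError("writer thread failed") from self.error
        self.queue.put(item)

    def close(self):
        self.queue.put(None)
        self._thread.join()
        if self.error is not None:
            raise RuntimeError("writer thread failed") from self.error


# Usage with a sink that simply collects items.
collected = []
writer = Writer(lambda: collected.append)
for i in range(5):
    writer.put(i)
writer.close()
```

The key design choice is that the thread keeps calling get() after an error, discarding items, so the main thread always reaches close() and sees the original exception.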

Removed#

The scikit-learn dependency (#199).

v0.11.0 - 2026-04-29#

Changed#

Removed package-level logging configuration from aimz/__init__.py. aimz no longer sets a log level, attaches a StreamHandler(sys.stdout), or calls logging.captureWarnings(True) on import; the aimz logger now only has a logging.NullHandler() attached. Configuring handlers, levels, and warnings capture is now the responsibility of the application. Log messages emitted by ImpactModel were also refined: trailing ellipses were removed, and posterior sampling now reports the number of samples being drawn. The output-directory cleanup notice emitted when predict_on_batch() and log_likelihood() encounter an error is now logged at the warning level (previously debug) (#192).
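Because aimz no longer configures logging on import, an application that wants to see its messages has to opt in. This is a minimal sketch using only the standard library. It assumes the logger is named "aimz", as the entry above suggests; the format string and level are illustrative choices.

```python
import logging
import sys

# Attach a handler and a level to the "aimz" logger from application code.
logger = logging.getLogger("aimz")
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Optionally route warnings.warn() messages through the logging system too.
logging.captureWarnings(True)
```

This roughly restores the pre-0.11.0 behavior, but under the application's control.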

Fixed#

Fixed sample_prior_predictive() failing on multi-device meshes with ValueError: in_specs ... does not match the specs of the input ... @obs. The probe batch used to trace the kernel and draw global prior samples is now built with batch_size=1 and device=None, preventing JAX’s sharding-in-types from propagating the obs mesh axis onto global (non-batched) sample sites that are later passed as the replicated samples argument to the jax.shard_map() sampler (#194).

v0.10.0 - 2026-04-17#

Added#

Added support for Python 3.14.
estimate_effect() now accepts an on_batch keyword argument. When True, predictions are dispatched through predict_on_batch() and any raw dict results are automatically converted to xarray.DataTree (#180).

Changed#

ArrayDataset now employs NumPy-based indexing in ArrayLoader instead of triggering JAX tracing on each batch (#168).
Changed the default value of to_jax in ArrayDataset from True to False to avoid redundant conversion (#170).

Fixed#

Fixed auto-computed batch_size rounding down to zero on multi-device setups when MAX_ELEMENTS // num_samples is smaller than the number of devices (#172).
pad_array() now pads with NumPy when given NumPy arrays, avoiding premature device transfers, and skips padding entirely when n_pad is zero (#174).
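The rounding problem fixed in #172 is easy to reproduce with plain integer arithmetic. In the sketch below, the function name and the choice to clamp at the device count are assumptions made to illustrate the failure, not a description of the actual fix.

```python
def round_batch_size(max_elements, num_samples, num_devices):
    """Round the batch size down to a multiple of num_devices, never to zero."""
    raw = max_elements // num_samples
    rounded = (raw // num_devices) * num_devices  # 0 when raw < num_devices
    return max(rounded, num_devices)


# Budget of 10 elements, 4 samples, 8 devices: raw is 2, which rounds down
# to 0 without the clamp in the final line of the function.
small = round_batch_size(10, 4, 8)
large = round_batch_size(1000, 10, 8)
```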

v0.9.1 - 2025-12-08#

Fixed#

Fixed jax.shard_map() closure error for sharded rng_key in parallelism methods when using JAX 0.8 and newer versions (#140).

v0.9.0 - 2025-11-16#

Added#

Added the class method cleanup_models() to clean up temporary directories for all active model instances (#136).

Changed#

The output subdirectory naming convention has changed from using only a timestamp to the pattern <timestamp>_<caller_name>/, where <caller_name> is the name of the method that triggered the write operation (#138).
Lowered the logging level for exceptions during temporary directory cleanup from exception to debug to reduce console noise.
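The <timestamp>_<caller_name>/ naming from #138 can be sketched as follows. The exact timestamp layout is not given in this changelog, so the strftime pattern here is an assumption and the helper name is hypothetical; the microsecond component follows the v0.7.0 entry (#110).

```python
from datetime import datetime
from pathlib import Path


def output_subdir(root, caller_name, now=None):
    """Build <root>/<timestamp>_<caller_name>; the timestamp layout is assumed."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%dT%H%M%S%f")  # %f appends microseconds
    return Path(root) / f"{stamp}_{caller_name}"


example = output_subdir("outputs", "predict", datetime(2025, 11, 16, 12, 0, 0, 123))
```

Including the caller name makes it possible to tell a predict() run from a log_likelihood() run by directory name alone.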

v0.8.1 - 2025-10-23#

Changed#

The minimum required versions are: Dask 2025.7, JAX 0.8, and Xarray 2025.7.
Replaced deprecated jax.experimental.shard_map.shard_map with jax.shard_map() to ensure compatibility with JAX 0.8 and newer versions (#128).
Logged exception messages are now displayed before the writer thread is shut down, providing a more immediate response for predict() and log_likelihood(), especially after a keyboard interrupt (#130).

v0.8.0 - 2025-10-14#

Added#

Extended MLflow autologging to support the fit_on_batch() method (#119).
Added __str__ and __repr__ methods to ImpactModel (#118).
KernelSpec now includes a sample_sites attribute listing all stochastic sample sites in the model kernel (#125).

v0.7.0 - 2025-09-29#

Added#

output_dir attribute to the root and group nodes of xarray.DataTree objects returned by sample_prior_predictive(), predict(), and log_likelihood(), specifying the directory where results are saved (#85).
Introduced the public KernelSpec dataclass and the kernel_spec attribute on ImpactModel. This exposes a lazily-built, cached structural specification of the user kernel (fields: traced, return_sites, output_observed) so training and predictive methods avoid redundant model tracing (#98).
When available, an output_dir attribute is added to the root node of the xarray.DataTree object returned by estimate_effect(), specifying the directory where results are saved (#110).

Changed#

All tqdm progress bars now use dynamic_ncols=True to adjust column width dynamically (#93).
fit_on_batch(), sample_prior_predictive_on_batch(), sample_prior_predictive(), and train_on_batch() now reuse the cached kernel_spec and avoid redundant model tracing (#98).
set_posterior_sample() no longer accepts a return_sites parameter; downstream methods can now set it explicitly (#100).
set_posterior_sample() now raises an error when an empty posterior dictionary ({}) is provided (#101).
sample_prior_predictive_on_batch() and sample_prior_predictive() now include posterior samples in the returned results if available (#103).
sample_prior_predictive_on_batch(), sample_prior_predictive(), sample(), sample_posterior_predictive_on_batch(), sample_posterior_predictive(), predict_on_batch(), and predict() can now accept a single str or an iterable of str values for the return_sites parameter (#107).
sample_prior_predictive_on_batch() returns the default output site along with deterministic sites when return_sites is not specified, to be consistent with the behavior of other sampling methods (#108).
estimate_effect() returns a posterior group node in the xarray.DataTree object when posterior samples are available, to be consistent with other methods (#110).
Subdirectories under temp_dir now include microseconds in their names to avoid duplicates and file-exists errors (#110).
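Accepting either a single str or an iterable of str for return_sites (#107) usually needs a small normalization step, because a str is itself iterable. The helper below is a hypothetical sketch of that step, not the aimz implementation.

```python
from collections.abc import Iterable


def normalize_return_sites(return_sites):
    """Return a tuple of site names from None, one str, or an iterable of str."""
    if return_sites is None:
        return None
    # Check str first: iterating a str would yield single characters.
    if isinstance(return_sites, str):
        return (return_sites,)
    if isinstance(return_sites, Iterable):
        sites = tuple(return_sites)
        if all(isinstance(site, str) for site in sites):
            return sites
    raise TypeError("return_sites must be a str or an iterable of str")
```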

Fixed#

Methods in ImpactModel no longer include an empty posterior data variable in the root node of the returned xarray.DataTree when no posterior samples are available (#91).

v0.6.0 - 2025-09-14#

Added#

sample_prior_predictive_on_batch(), sample(), sample_posterior_predictive_on_batch(), and predict_on_batch() methods in ImpactModel now support a return_datatree parameter. When set to True (the default), results are returned as an xarray.DataTree; otherwise, a dict is returned (#74).
MLflow integration for ImpactModel (#71).

Changed#

Methods in ImpactModel now automatically determine the batch_size if it is not provided, based on the input data and number of samples (#70).
sample_posterior_predictive_on_batch() and sample_posterior_predictive() no longer accept the in_sample argument. Results are now always written to the posterior_predictive group.

Removed#

Removed the tqdm dependency (#80).

Fixed#

Methods in ImpactModel now handle empty posterior dictionaries ({}) gracefully instead of failing when no posterior samples are available (#76).

v0.5.0 - 2025-09-01#

Added#

Added a return_sites parameter to the predict() and predict_on_batch() methods in ImpactModel, allowing users to specify which sites to include in the output (#55).
sample_prior_predictive_on_batch(), replacing sample_prior_predictive() (#67).
sample_posterior_predictive_on_batch(), replacing sample_posterior_predictive() (#67).

Changed#

Switched documentation build system from MkDocs to Sphinx and ReadTheDocs (https://aimz.readthedocs.io).
Added input X validation to sample_prior_predictive() (#65).
Exposed ImpactModel at the top-level package, allowing from aimz import ImpactModel (#67).
sample_prior_predictive() now returns an xarray.DataTree instead of a dictionary, and writes outputs to files like the other methods (#67).
sample_posterior_predictive() is now an alias of predict() and returns an xarray.DataTree (#67).

Fixed#

Enhanced data array validation to preserve device placement for JAX arrays (#53).
Fixed incompatibility with Zarr when models output arrays in bfloat16 by automatically promoting them to float32 before saving (#57).
Fixed the error message in sample_posterior_predictive() when self.param_output is passed as an argument, which previously incorrectly referenced sample_prior_predictive() (#65).

v0.4.0 - 2025-08-18#

Added#

Support for NumPyro MCMC in ImpactModel, including fit_on_batch(), sample(), and set_posterior_sample() methods (#35).

Changed#

ImpactModel methods predict(), predict_on_batch(), log_likelihood(), and estimate_effect() now return outputs as xarray DataTree instead of ArviZ InferenceData. Dimension names now follow the dim_N convention instead of the previous dimN style (#49).
fit(), fit_on_batch(), and train_on_batch() methods in ImpactModel now check for "/" in kernel site names to ensure compatibility with xarray DataTree (#49).
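The "/" check in #49 exists because xarray DataTree uses "/" as its path separator, so a site name containing it would be read as a nested group path. This is a minimal sketch of such a check; the function name and error message are hypothetical.

```python
def check_site_names(site_names):
    """Raise ValueError if any site name contains the DataTree path separator."""
    bad = [name for name in site_names if "/" in name]
    if bad:
        raise ValueError(f"Site names must not contain '/': {bad}")
    return list(site_names)
```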

Removed#

Removed the arviz dependency (#49).

Fixed#

predict() in ImpactModel now checks for available posterior samples before falling back to predict_on_batch().
ArrayLoader validates that batch_size is a positive integer.

v0.3.2 - 2025-08-13#

Changed#

Updated predict() and predict_on_batch() to check for available posterior samples before returning outputs. This prevents errors when the model specification does not define any posterior samples.

v0.3.1 - 2025-08-02#

Fixed#

ArrayDataset and ArrayLoader now preserve the order in which input arrays are provided, ensuring consistent input mapping in methods like predict() and log_likelihood() (#43).

v0.3.0 - 2025-07-18#

Changed#

ImpactModel initialization parameter vi has been renamed to inference for compatibility with MCMC in future releases (#36).
ImpactModel now supports ArrayLoader for both input and output data (#24).
Renamed the posterior sample attribute of ImpactModel from posterior_samples_ to posterior, which is now initialized to None (#25).
ArrayLoader and ArrayDataset no longer require the torch dependency. ArrayDataset now accepts only named arrays, and ArrayLoader yields tuples of a dictionary and a padding integer (#26).

Removed#

Removed the torch dependency (#26).

v0.2.0 - 2025-07-10#

Added#

train_on_batch() and fit_on_batch() methods to ImpactModel (#15).
Custom ArrayDataset class for handling data in ImpactModel, removing the need for the jax-dataloader dependency (#14).
GitHub Pages documentation site (#10).
Installation instructions in the documentation site (#10).
ArrayLoader now supports a shuffle parameter for epoch-based training in fit() (#15).

Changed#

Adopted jax.typing module for improved type hints.
Removed unnecessary JAX array type conversion in ImpactModel methods.
The fit() method now uses epoch-based (minibatch) training (#15).
Updated fit(), train_on_batch(), and fit_on_batch() to train the model using the internal SVI state, continuing from the last state if available (#15).

Removed#

Removed the jax-dataloader dependency (#14).
Removed the guide property, as it is part of the vi property.

v0.1.0 - 2025-06-27#

Added#

Initial public release.