🧪 API Reference¶

Welcome to the climatrix API reference. Below you'll find details on key modules, classes, and methods, with examples and usage tips to help you integrate climatrix into your climate data workflows.

Abstract

The main module `climatrix` provides tools that extend xarray datasets with climate-oriented subsetting, sampling, and reconstruction. It is accessible via the `cm` xarray accessor.

The library contains a few public classes:

Class name	Description
`AxisType`	Enumerator class for type of spatio-temporal axes
`Axis`	Class managing spatio-temporal axes
`BaseClimatrixDataset`	Base class for managing `xarray` data
`Domain`	Base class for domain-specific operations
`SparseDomain`	Subclass of `Domain` aimed at managing sparse representations
`DenseDomain`	Subclass of `Domain` aimed at managing dense representations
`Plot`	Interactive plotting utility for climate datasets

📈 Axes¶

`climatrix.dataset.axis.AxisType` ¶

Bases: StrEnum

Enum for axis types.

Attributes:

Name	Type	Description
`LATITUDE`	`str`	Latitude axis type.
`LONGITUDE`	`str`	Longitude axis type.
`TIME`	`str`	Time axis type.
`VERTICAL`	`str`	Vertical axis type.
`POINT`	`str`	Point axis type.

`get(value)` `classmethod` ¶

Get the AxisType corresponding to value.

If value is an instance of AxisType, return it as is. If value is a string, return the corresponding AxisType. If value is neither an instance of AxisType nor a string, raise a ValueError.

Parameters:

Name	Type	Description	Default
`value`	`str or AxisType`	The axis type	required

Returns:

Type	Description
`AxisType`	The axis type.

Raises:

Type	Description
`ValueError`	If `value` is not a valid axis type.
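The lookup rules described for `get` can be sketched with a small stand-in enum. This is an illustrative sketch only, not the climatrix implementation: the class name `AxisTypeSketch`, the member values, and the case-insensitive string matching are assumptions.

```python
from enum import Enum


class AxisTypeSketch(str, Enum):
    """Stand-in for AxisType; the member values here are assumptions."""

    LATITUDE = "latitude"
    LONGITUDE = "longitude"
    TIME = "time"
    VERTICAL = "vertical"
    POINT = "point"

    @classmethod
    def get(cls, value):
        # An existing member is returned unchanged.
        if isinstance(value, cls):
            return value
        # A string is looked up by value (case-insensitively, by assumption).
        if isinstance(value, str):
            try:
                return cls(value.lower())
            except ValueError:
                raise ValueError(f"Unknown axis type: {value!r}") from None
        # Anything else is rejected.
        raise ValueError(f"Expected str or axis type, got {type(value).__name__}")


print(AxisTypeSketch.get("Latitude").name)
print(AxisTypeSketch.get(AxisTypeSketch.TIME).name)
```

Accepting both members and strings in one entry point lets the rest of an API take either form without repeating the conversion.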

`climatrix.dataset.axis.Axis` ¶

Base class for axis types.

Attributes:

Name	Type	Description
`type`	`ClassVar[AxisType]`	The type of the axis.
`dtype`	`ClassVar[dtype]`	The data type of the axis values.
`is_dimension`	`bool`	Whether the axis is a dimension or not.
`name`	`str`	The name of the axis.
`values`	`ndarray`	The values of the axis.

Parameters:

Name	Type	Description	Default
`name`	`str`	The name of the axis.	required
`values`	`ndarray`	The values of the axis.	required
`is_dimension`	`bool`	Whether the axis is a dimension or not (default is True).	`True`

Examples:

Axis is a factory class for all axis types. To create an axis (by matching the name), use:

>>> axis = Axis(name="latitude", values=np.array([1, 2, 3]))

To create a Latitude axis explicitly, use:

>>> axis = Latitude(name="latitude", values=np.array([1, 2, 3]))
>>> axis = Latitude(
... name="latitude",
... values=np.array([1, 2, 3]),
... is_dimension=True)

Notes

The Axis class is a factory class for all axis types.
If the given axis has an "unusual" name that the factory cannot match, you need to create it explicitly using the corresponding class (e.g. Latitude).
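One way such a name-matching factory can work is sketched below. It is a simplified stand-in, not the climatrix source: the `*Sketch` class names and the regular expressions used for matching are hypothetical, and values are kept as plain lists instead of NumPy arrays.

```python
import re


class AxisSketch:
    """Factory base class: AxisSketch(...) returns the first matching subclass."""

    _pattern = None  # hypothetical per-subclass name pattern

    def __new__(cls, name, values, is_dimension=True):
        if cls is AxisSketch:
            # Called on the base class: dispatch on the axis name.
            for subclass in cls.get_all_axes():
                if subclass.matches(name):
                    return super().__new__(subclass)
            raise ValueError(f"No axis class matches name {name!r}")
        # Called on a concrete subclass: create it as requested.
        return super().__new__(cls)

    def __init__(self, name, values, is_dimension=True):
        self.name = name
        self.values = list(values)
        self.is_dimension = is_dimension

    @property
    def size(self):
        return len(self.values)

    @classmethod
    def matches(cls, name):
        if cls._pattern is None:
            return False
        return re.fullmatch(cls._pattern, name, re.IGNORECASE) is not None

    @classmethod
    def get_all_axes(cls):
        return AxisSketch.__subclasses__()


class LatitudeSketch(AxisSketch):
    _pattern = r"lat(itude)?"


class LongitudeSketch(AxisSketch):
    _pattern = r"lon(gitude)?"


matched = AxisSketch(name="latitude", values=[1, 2, 3])
explicit = LatitudeSketch(name="y", values=[1, 2, 3])
print(type(matched).__name__, type(explicit).__name__)
```

The explicit `LatitudeSketch(name="y", ...)` call mirrors the note above: a name the factory cannot recognise still works when the concrete class is chosen by hand.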

`size` `property` ¶

Get the size of the axis.

Returns:

Type	Description
`int`	The size of the axis.

`matches(name)` `classmethod` ¶

Check if the axis matches the given name.

Parameters:

Name	Type	Description	Default
`name`	`str`	The name to check.	required

Returns:

Type	Description
`bool`	True if the axis matches the name, False otherwise.

`get_all_axes()` `classmethod` ¶

Get all axis classes.

Returns:

Type	Description
`list[Type[Axis]]`	A list of all axis classes.

`climatrix.dataset.axis.Latitude` ¶

Bases: Axis

Latitude axis.

Attributes:

Name	Type	Description
`name`	`str`	The name of the latitude axis.
`is_dimension`	`bool`	Whether the axis is a dimension or not.

`climatrix.dataset.axis.Longitude` ¶

Bases: Axis

Longitude axis.

Attributes:

Name	Type	Description
`name`	`str`	The name of the longitude axis.
`is_dimension`	`bool`	Whether the axis is a dimension or not.

`climatrix.dataset.axis.Time` ¶

Bases: Axis

Time axis.

Attributes:

Name	Type	Description
`name`	`str`	The name of the time axis.
`is_dimension`	`bool`	Whether the axis is a dimension or not.

`eq(other)` ¶

Check if two axes are equal.

Parameters:

Name	Type	Description	Default
`other`	`object`	The other object to compare with.	required

Returns:

Type	Description
`bool`	True if the axes are equal, False otherwise.

`climatrix.dataset.axis.Point` ¶

Bases: Axis

Point axis.

Attributes:

Name	Type	Description
`name`	`str`	The name of the point axis.
`is_dimension`	`bool`	Whether the axis is a dimension or not.

`climatrix.dataset.axis.Vertical` ¶

Bases: Axis

Vertical axis.

Attributes:

Name	Type	Description
`name`	`str`	The name of the vertical axis.
`is_dimension`	`bool`	Whether the axis is a dimension or not.

📇 Data¶

`climatrix.dataset.base.BaseClimatrixDataset` ¶

Base class for Climatrix workflows.

This class provides a set of methods for manipulating xarray datasets. It is designed to be used as an xarray accessor, allowing you to call its methods directly on xarray datasets.

The class supports basic arithmetic operations, including: addition, subtraction, multiplication, and division.

Attributes:

Name	Type	Description
`da`	`DataArray`	The underlying `xarray.DataArray` object (if single-variable `xarray.Dataset` was passed, it is squeezed to `xarray.DataArray`).
`domain`	`Domain`	The domain object representing the spatial and temporal dimensions of the dataset. See `SparseDomain` and `DenseDomain` for more details.

`domain = Domain(xarray_obj)` `instance-attribute` ¶

`subset(north=None, south=None, west=None, east=None)` ¶

Subset data with the specified bounding box.

If an argument is not provided, no bound is applied in that direction. For example, if north is not provided, the maximum latitude of the dataset is used. If both north and south are provided, the dataset is subset to the area between these two latitudes.

Parameters:

Name	Type	Description	Default
`north`	`float`	North latitude of the bounding box.	`None`
`south`	`float`	South latitude of the bounding box.	`None`
`west`	`float`	West longitude of the bounding box.	`None`
`east`	`float`	East longitude of the bounding box.	`None`

Returns:

Type	Description
`Self`	The subsetted dataset.

Raises:

Type	Description
`LongitudeConventionMismatch`	If the dataset is in positive-only convention (longitude \(\lambda \in [0, 360]\)) and negative values are requested, or vice versa. If the dataset is in signed-longitude convention (longitude \(\lambda \in [-180, 180]\)) and positive values greater than 360 are requested.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> globe_dset = xr.open_dataset("path/to/dataset.nc")
>>> globe_dset
<xarray.Dataset>
Dimensions:  (time: 1, latitude: 180, longitude: 360)
Coordinates:
  * time     (time) datetime64[ns] 2020-01-01
  * latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
  * longitude (longitude) float64 0.0 1.0 2.0 ... 357.0 358.0 359.0
Data variables:
    temperature (time, latitude, longitude) float64 ...
>>> dset2 = globe_dset.cm.subset(
...     north=10.0,
...     south=5.0,
...     west=20.0,
...     east=25.0,
... )
>>> dset2 = globe_dset.cm.subset(
...     north=10.0,
...     south=5.0,
...     west=-50.0,
...     east=25.0,
... )
LongitudeConventionMismatch: The dataset is in positive-only convention
(longitude goes from 0 to 360) while you are
requesting negative values (longitude goes from -180 to 180).
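The bounding-box logic and the convention check shown in the example can be sketched on plain coordinate lists. This is a minimal sketch under stated assumptions, not the library's code: `subset_box` and `LongitudeConventionMismatchSketch` are hypothetical names, the positive-only convention is detected simply by a longitude above 180, and only the "negative bound on a 0 to 360 grid" case is checked.

```python
class LongitudeConventionMismatchSketch(ValueError):
    """Stand-in for the exception named in the reference."""


def subset_box(lats, lons, north=None, south=None, west=None, east=None):
    """Keep coordinates inside the box; a bound left as None is open."""
    # Assumption: a grid whose longitudes exceed 180 is in positive-only convention.
    positive_only = max(lons) > 180
    if positive_only and any(b is not None and b < 0 for b in (west, east)):
        raise LongitudeConventionMismatchSketch(
            "dataset uses 0..360 longitudes but a negative bound was requested"
        )
    kept_lats = [
        v for v in lats
        if (south is None or v >= south) and (north is None or v <= north)
    ]
    kept_lons = [
        v for v in lons
        if (west is None or v >= west) and (east is None or v <= east)
    ]
    return kept_lats, kept_lons


lats = [float(v) for v in range(-90, 90)]
lons = [float(v) for v in range(0, 360)]
box_lats, box_lons = subset_box(lats, lons, north=10.0, south=5.0, west=20.0, east=25.0)
print(len(box_lats), len(box_lons))
```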

`to_signed_longitude()` ¶

Convert the dataset to signed longitude convention.

The longitude values are converted to be in the range (-180 to 180 degrees).

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc").cm
>>> dset.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180, longitude: 360)>
...
Dimensions:  (time: 1, latitude: 180, longitude: 360)
Coordinates:
  * time     (time) datetime64[ns] 2020-01-01
  * latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
  * longitude (longitude) float64 0.0 1.0 2.0 ... 357.0 358.0 359.0
Data variables:
    temperature (time, latitude, longitude) float64 ...
>>> dset2 = dset.to_signed_longitude()
>>> dset2.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180, longitude: 360)>
...
Dimensions:  (time: 1, latitude: 180, longitude: 360)
Coordinates:
  * time     (time) datetime64[ns] 2020-01-01
  * latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
  * longitude (longitude) float64 -180.0 -179.0 -178.0 ... 177.0 178.0 179.0

References

[1] Mancini, M., Walczak, J., Stojiljkovic, M. geokube: A Python library for geospatial data processing, 2024. https://doi.org/10.5281/zenodo.10597965, https://github.com/CMCC-Foundation/geokube

`to_positive_longitude()` ¶

Convert the dataset to positive longitude convention.

The longitude values are converted to be in the range (0 to 360 degrees).

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc").cm
>>> dset.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180,
longitude: 360)>
...
Dimensions:  (time: 1, latitude: 180, longitude: 360)
Coordinates:
  * time     (time) datetime64[ns] 2020-01-01
  * latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
  * longitude (longitude) float64 -180.0 ... 178.0 179.0
Data variables:
    temperature (time, latitude, longitude) float64 ...
>>> dset2 = dset.to_positive_longitude()
>>> dset2.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180,
  longitude: 360)>
...
Dimensions:  (time: 1, latitude: 180, longitude: 360)
Coordinates:
  * time     (time) datetime64[ns] 2020-01-01
  * latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
  * longitude (longitude) float64 0.0 1.0 ... 357.0 358.0 359.0
Data variables:
    temperature (time, latitude, longitude) float64 ...

References

[1] Mancini, M., Walczak, J., Stojiljkovic, M. geokube: A Python library for geospatial data processing, 2024. https://doi.org/10.5281/zenodo.10597965, https://github.com/CMCC-Foundation/geokube
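The two longitude conventions handled by `to_signed_longitude()` and `to_positive_longitude()` differ only by a modular shift. The sketch below shows that arithmetic on plain floats. It assumes half-open ranges ([-180, 180) and [0, 360)), which matches the coordinate values in the examples above but is not confirmed as the library's exact behaviour; the function names are hypothetical.

```python
def to_signed(lon):
    """Map a longitude in degrees to the range [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


def to_positive(lon):
    """Map a longitude in degrees to the range [0, 360)."""
    return lon % 360.0


positive = [0.0, 90.0, 180.0, 270.0, 359.0]
# After the shift the coordinate is no longer monotonic, so it is re-sorted.
signed = sorted(to_signed(v) for v in positive)
print(signed)
print(sorted(to_positive(v) for v in signed))
```

Because the shift reorders the coordinate, any data stored along the longitude axis has to be reordered together with it.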

`squeeze()` ¶

Squeeze the dataset to remove dimensions of size 1.

Returns:

Type	Description
`Self`	The squeezed dataset.

`profile_along_axes(*axes)` ¶

Generate profiles along the specified axes.

Parameters:

Name	Type	Description	Default
`*axes`	`AxisType \| str`	The axes along which to generate profiles.	`()`

Yields:

Type	Description
`BaseClimatrixDataset`	A dataset containing the profile along the specified axes.

`mask_nan(source)` ¶

Apply NaN values from another dataset to the current one.

Parameters:

Name	Type	Description	Default
`source`	`BaseClimatrixDataset`	Dataset whose NaN values will be applied to the current one.	required

Returns:

Type	Description
`BaseClimatrixDataset`	A new dataset with NaN values applied.

Raises:

Type	Description
`TypeError`	If the `source` argument is not a BaseClimatrixDataset.
`ValueError`	If the domain of the `source` or the current dataset is sparse.
`DomainMismatchError`	If the domains of the `source` and the current dataset do not match.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset1 = xr.open_dataset("path/to/dataset1.nc").cm
>>> dset2 = xr.open_dataset("path/to/dataset2.nc").cm
>>> dset1.mask_nan(dset2)
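The effect of copying a NaN mask from one dataset onto another can be illustrated on flat lists of values. This is a simplified sketch: the helper name is hypothetical, and a length check stands in for the domain comparison that the real method performs.

```python
import math


def mask_nan(values, source):
    """Return a copy of `values` with NaN wherever `source` is NaN."""
    if len(values) != len(source):
        # Stand-in for the domain check; the real method compares domains.
        raise ValueError("values and source must have the same length")
    return [math.nan if math.isnan(s) else v for v, s in zip(values, source)]


data = [1.0, 2.0, 3.0]
mask_source = [0.5, math.nan, 0.1]
print(mask_nan(data, mask_source))
```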

`time(time)` ¶

Select data at a specific time or times.

Parameters:

Name	Type	Description	Default
`time`	`datetime, np.datetime64, slice, list, or np.ndarray`	Time or times to be selected.	required

Returns:

Type	Description
`Self`	The dataset with the selected time or times.

Examples:

>>> from datetime import datetime
>>> import numpy as np
>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")

Selecting by datetime object:

>>> dset.cm.time(datetime(2020, 1, 1))

Selecting by np.datetime64 object:

>>> dset.cm.time(np.datetime64("2020-01-01"))

Selecting by a slice with a string bound:

>>> dset.cm.time(slice("2020-01-01"))

Selecting by list of any of the above:

>>> dset.cm.time([datetime(2020, 1, 1), np.datetime64("2020-01-02")])

Selecting by slice object:

>>> dset.cm.time(slice(datetime(2020, 1, 1), datetime(2020, 1, 2)))

`itime(time)` ¶

Select time value by index.

Parameters:

Name	Type	Description	Default
`time`	`int, list[int], np.ndarray, or slice`	Time index or indices to be selected.	required

Returns:

Type	Description
`Self`	The dataset with the selected time or times.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")

Selecting by int object:

>>> dset.cm.itime(0)

Selecting by list of ints:

>>> dset.cm.itime([0, 1])

Selecting by slice object:

>>> dset.cm.itime(slice(0, 2))
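The difference between `time` (selection by timestamp label) and `itime` (selection by position) can be shown on a toy series. The helpers below are hypothetical and work on plain lists; they only illustrate the two selection styles and do not reproduce the library's handling of all accepted types.

```python
from datetime import datetime

times = [datetime(2020, 1, day) for day in (1, 2, 3)]
values = [10.0, 11.5, 9.8]


def select_by_label(when):
    """Label-based selection: return values whose timestamp is in `when`."""
    wanted = set(when) if isinstance(when, (list, tuple)) else {when}
    return [v for t, v in zip(times, values) if t in wanted]


def select_by_index(index):
    """Position-based selection: accept an int, a list of ints, or a slice."""
    if isinstance(index, slice):
        return values[index]
    if isinstance(index, int):
        return [values[index]]
    return [values[i] for i in index]


print(select_by_label(datetime(2020, 1, 2)))
print(select_by_index(slice(0, 2)))
```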

`sample_uniform(portion=None, number=None, nan='ignore')` ¶

Sample the dataset using a uniform distribution.

Parameters:

Name	Type	Description	Default
`portion`	`float`	Portion of the dataset to be sampled.	`None`
`number`	`int`	Number of points to be sampled.	`None`
`nan`	`SamplingNaNPolicy \| str`	Policy for handling NaN values.	`'ignore'`

Notes

Exactly one of portion or number must be provided; they cannot both be given at the same time.

Warns:

Type	Description
`TooLargeSamplePortionWarning`	If the portion exceeds 1.0 or number of points exceeds the number of spatial points in the Domain

Raises:

Type	Description
`ValueError`	If the dataset contains NaN values and `nan` parameter (NaN handling policy) is set to `SamplingNaNPolicy.RAISE`.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")
>>> sparse_dset = dset.cm.sample_uniform(portion=0.1)
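The `portion`/`number` contract can be sketched with the standard library alone. This is an illustration rather than the climatrix implementation: the function operates on a plain list of points, the `seed` parameter is added for reproducibility, and clipping an oversized request after a warning is an assumption.

```python
import random
import warnings


def sample_uniform(points, portion=None, number=None, seed=0):
    """Draw points uniformly at random, without replacement."""
    if (portion is None) == (number is None):
        raise ValueError("Provide exactly one of `portion` or `number`.")
    if number is None:
        number = round(portion * len(points))
    if number > len(points):
        # Assumption: an oversized request is clipped after a warning.
        warnings.warn("Requested sample is larger than the domain; clipping.")
        number = len(points)
    return random.Random(seed).sample(points, number)


grid = [(lat, lon) for lat in range(5) for lon in range(8)]
sparse = sample_uniform(grid, portion=0.25)
print(len(sparse))
```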

`sample_normal(portion=None, number=None, center_point=None, sigma=10.0, nan='ignore')` ¶

Sample the dataset using a normal distribution.

Parameters:

Name	Type	Description	Default
`portion`	`float`	Portion of the dataset to be sampled.	`None`
`number`	`int`	Number of points to be sampled.	`None`
`center_point`	`tuple[Longitude, Latitude]`	Center point for the normal distribution.	`None`
`sigma`	`float`	Standard deviation for the normal distribution.	`10.0`
`nan`	`SamplingNaNPolicy \| str`	Policy for handling NaN values.	`'ignore'`

Notes

Exactly one of portion or number must be provided; they cannot both be given at the same time.

Warns:

Type	Description
`TooLargeSamplePortionWarning`	If the portion exceeds 1.0 or number of points exceeds the number of spatial points in the Domain

Raises:

Type	Description
`ValueError`	If the dataset contains NaN values and `nan` parameter (NaN handling policy) is set to `SamplingNaNPolicy.RAISE`.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")
>>> sparse_dset = dset.cm.sample_normal(
...     number=1_000,
...     center_point=(10.0, 20.0),
...     sigma=5.0,
... )
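Sampling "using a normal distribution" can be read as drawing distinct points with probability that falls off like a Gaussian around `center_point`. The sketch below implements that reading with the Gumbel-top-k trick; it is an assumption about the idea, not the library's algorithm. Distances are plain Euclidean distances in degrees, points are (longitude, latitude) tuples, and the function name and `seed` parameter are hypothetical.

```python
import math
import random


def sample_normal(points, number, center_point, sigma=10.0, seed=0):
    """Pick `number` distinct (lon, lat) points, favouring those near the centre."""
    rng = random.Random(seed)
    cx, cy = center_point

    def key(point):
        # Log of an unnormalised Gaussian weight around the centre.
        squared = (point[0] - cx) ** 2 + (point[1] - cy) ** 2
        log_weight = -squared / (2.0 * sigma**2)
        # Gumbel noise: keeping the largest perturbed log-weights yields a
        # weighted sample without replacement (the Gumbel-top-k trick).
        exponential = max(rng.expovariate(1.0), 1e-300)
        return log_weight - math.log(exponential)

    keyed = [(key(p), p) for p in points]
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in keyed[:number]]


grid = [(lon, lat) for lon in range(0, 41) for lat in range(0, 41)]
sample = sample_normal(grid, number=50, center_point=(10.0, 20.0), sigma=5.0)
print(len(sample))
```

Working with log-weights avoids the underflow that `exp(-d**2 / (2 * sigma**2))` would hit for points far from the centre.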

`reconstruct(target, *, method, **recon_kwargs)` ¶

Reconstruct the dataset to a target domain.

If target domain is sparse, the reconstruction will be sparse too. If target domain is dense, the reconstruction will be dense too. The reconstruction will be done using the method specified in the method argument.

The method can be one of the following: Inverse Distance Weighting (idw), Ordinary Kriging (ok).

Parameters:

Name	Type	Description	Default
`target`	`Domain`	The target domain to reconstruct the dataset to.	required
`method`	`ReconstructionType \| str`	The method to use for reconstruction. Can be one of the following: 'idw', 'ok'.	required
`recon_kwargs`	`dict`	Additional keyword arguments to pass to the reconstruction method.	`{}`

`plot(title=None, target=None, show=True, **kwargs)` ¶

Plot the dataset on a map.

The dataset is plotted using Cartopy and Matplotlib.

Parameters:

Name	Type	Description	Default
`title`	`str`	Title of the plot. If not provided, the name of the dataset will be used. If the dataset has no name, "Climatrix Dataset" will be used.	`None`
`target`	`str, os.PathLike, Path, or None`	Path to save the plot. If not provided, the plot will not be saved.	`None`
`show`	`bool`	Whether to show the plot. Default is True.	`True`
`**kwargs`	`dict`	Additional keyword arguments to pass to the plotting function. Supported keys: `figsize` (tuple, optional): size of the figure, default (12, 6); `vmin` (float, optional): minimum value for the color scale, default None; `vmax` (float, optional): maximum value for the color scale, default None; `cmap` (str, optional): colormap to use for the plot, default "seismic"; `ax` (Axes, optional): axes to plot on, a new figure and axes are created if not provided; `size` (int, optional): size of the points for sparse datasets, default 10.	`{}`

Returns:

Type	Description
`Axes`	The axes object containing the plot.

Raises:

Type	Description
`NotImplementedError`	If the dataset is dynamic (contains time dimension with more than one value).

`transpose(*axes)` ¶

Transpose the dataset along the specified dimensions.

Parameters:

Name	Type	Description	Default
`*axes`	`AxisType or str`	The axes along which to transpose the dataset.	`()`

Returns:

Type	Description
`Self`	The transposed dataset.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc").cm
>>> dset2 = dset.transpose("longitude", "latitude")

🌍 Domain¶

`climatrix.dataset.domain.Domain` ¶

Base class for domain objects.

Attributes:

Name	Type	Description
`is_sparse`	`ClassVar[bool]`	Indicates if the domain is sparse or dense.
`_axes`	`dict[AxisType, Axis]`	Mapping of `AxisType` to the corresponding `Axis` object.

`dims` `property` ¶

Get the dimensions of the dataset.

Returns:

Type	Description
`tuple[AxisType, ...]`	A tuple of `AxisType` objects representing the dimensions of the dataset.

Notes

The dimensions are determined by the axes that are marked as dimensions in the domain. For example, if the underlying dataset has shape (5, 10, 20), there are three dimensional axes.

`latitude` `property` ¶

Latitude axis

`longitude` `property` ¶

Longitude axis

`time` `property` ¶

Time axis

`point` `property` ¶

Point axis

`vertical` `property` ¶

Vertical axis

`is_dynamic` `property` ¶

If the domain is dynamic.

`is_sparse` `class-attribute` ¶

`size` `property` ¶

Domain size.

`all_axes_types` `property` ¶

All axis types in the domain.

`from_lat_lon(lat=slice(-90, 90, _DEFAULT_LAT_RESOLUTION), lon=slice(-180, 180, _DEFAULT_LON_RESOLUTION), kind='dense')` `classmethod` ¶

Create a domain from latitude and longitude coordinates.

Parameters:

Name	Type	Description	Default
`lat`	`slice or ndarray`	Latitude coordinates. If a slice is provided, it will be converted to a numpy array using the specified step.	`slice(-90, 90, _DEFAULT_LAT_RESOLUTION)`
`lon`	`slice or ndarray`	Longitude coordinates. If a slice is provided, it will be converted to a numpy array using the specified step.	`slice(-180, 180, _DEFAULT_LON_RESOLUTION)`
`kind`	`str`	Type of domain to create. Can be either "dense" or "sparse". Default is "dense".	`'dense'`

Returns:

Type	Description
`Domain`	An instance of the Domain class with the specified latitude and longitude coordinates.
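The conversion of a `slice` into explicit coordinates can be sketched as follows. Whether the stop value is included is not stated in this reference; the sketch assumes a half-open range, as `numpy.arange` would produce, and the helper name `expand` is hypothetical.

```python
import math


def expand(coords):
    """Turn slice(start, stop, step) into explicit values; pass sequences through."""
    if not isinstance(coords, slice):
        return list(coords)
    count = max(0, math.ceil((coords.stop - coords.start) / coords.step))
    return [coords.start + i * coords.step for i in range(count)]


print(expand(slice(-90, 90, 45)))
print(expand([10.0, 12.5]))
```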

`from_axes()` `classmethod` ¶

Create a domain builder for configuring domains with multiple axes.

Returns:

Type	Description
`DomainBuilder`	A builder instance for creating domains with various axes.

Examples:

>>> domain = (Domain.from_axes()
...           .vertical(depth=slice(10, 100, 1))
...           .lat(latitude=[1,2,3,4])
...           .lon(longitude=[1,2,3,4])
...           .sparse())
>>> domain = (Domain.from_axes()
...           .lat(lat=slice(-90, 90, 1))
...           .lon(lon=slice(-180, 180, 1))
...           .time(time=['2020-01-01', '2020-01-02'])
...           .dense())

`get_size(axis)` ¶

Get the size of the specified axis.

Parameters:

Name	Type	Description	Default
`axis`	`AxisType`	The axis for which to get the size.	required

Returns:

Type	Description
`int`	The size of the specified axis.

`has_axis(axis)` ¶

Check if the specified axis exists in the domain.

Parameters:

Name	Type	Description	Default
`axis`	`AxisType`	The axis type to check.	required

Returns:

Type	Description
`bool`	True if the axis exists, False otherwise.

`get_axis(axis)` ¶

Get the axis object for the specified axis type.

Parameters:

Name	Type	Description	Default
`axis`	`AxisType`	The axis type for which to get the axis.	required

Returns:

Type	Description
`Axis \| None`	The Axis object, or None if not found.

`get_all_spatial_points()` `abstractmethod` ¶

`to_xarray(values, name=None)` `abstractmethod` ¶

`climatrix.dataset.domain.SparseDomain` ¶

Bases: Domain

Sparse domain class.

Supports operations on sparse spatial domain.

`to_xarray(values, name=None)` ¶

Convert domain to sparse xarray.DataArray.

The method applies values and (optionally) name to create a new xarray.DataArray object based on the domain.

Parameters:

Name	Type	Description	Default
`values`	`ndarray`	The values to be assigned to the DataArray variable.	required
`name`	`str`	The name of the DataArray variable.	`None`

Returns:

Type	Description
`DataArray`	The xarray.DataArray single variable object.

Raises:

Type	Description
`ValueError`	If the shape of `values` does not match the expected shape.

Examples:

>>> domain = Domain.from_lat_lon()
>>> values = np.random.rand(5, 5)
>>> da = domain.to_xarray(values, name="example")
>>> isinstance(da, xr.DataArray)
True
>>> da.name
'example'

`get_all_spatial_points()` ¶

Get all spatial points in the domain.

Returns:

Type	Description
`ndarray`	An array of shape (n_points, 2) containing the latitude and longitude coordinates of all points in the domain.

Examples:

>>> points = domain.get_all_spatial_points()
>>> points
array([[ 0. , -0.1],
       [ 0. ,  0. ],
       [ 0. ,  0.1],
       ...

`climatrix.dataset.domain.DenseDomain` ¶

Bases: Domain

Dense domain class.

Supports operations on dense spatial domain.

`to_xarray(values, name=None)` ¶

Convert domain to dense xarray.DataArray.

The method applies values and (optionally) name to create a new xarray.DataArray object based on the domain.

Parameters:

Name	Type	Description	Default
`values`	`ndarray`	The values to be assigned to the DataArray variable.	required
`name`	`str`	The name of the DataArray variable.	`None`

Returns:

Type	Description
`DataArray`	The xarray.DataArray single variable object.

Raises:

Type	Description
`ValueError`	If the shape of `values` does not match the expected shape.

Examples:

>>> domain = Domain.from_lat_lon()
>>> values = np.random.rand(5, 5)
>>> da = domain.to_xarray(values, name="example")
>>> isinstance(da, xr.DataArray)
True
>>> da.name
'example'

`get_all_spatial_points()` ¶

Get all spatial points in the domain.

Returns:

Type	Description
`ndarray`	An array of shape (n_points, 2) containing the latitude and longitude coordinates of all points in the domain.

Examples:

>>> points = domain.get_all_spatial_points()
>>> points
array([[ 0. , -0.1],
       [ 0. ,  0. ],
       [ 0. ,  0.1],
       ...
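For a dense grid, the (n_points, 2) layout shown above corresponds to the Cartesian product of the latitude and longitude values. A minimal sketch with nested lists in place of a NumPy array (the function name is hypothetical, and the latitude-major ordering is inferred from the example output):

```python
from itertools import product


def all_spatial_points(lats, lons):
    """Return every (latitude, longitude) pair of a regular grid as rows."""
    return [[lat, lon] for lat, lon in product(lats, lons)]


points = all_spatial_points([0.0, 1.0], [-0.1, 0.0, 0.1])
print(len(points))
print(points[:3])
```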

📈 Interactive Plotting¶

`climatrix.plot.core.Plot` ¶

`show(port=5000, debug=False)` ¶

🌐 Reconstructors¶

`climatrix.reconstruct.base.BaseReconstructor` ¶

Bases: ABC

Base class for all dataset reconstruction methods.

Attributes:

Name	Type	Description
`dataset`	`BaseClimatrixDataset`	The dataset to be reconstructed.
`target_domain`	`Domain`	The target domain for the reconstruction.

`__init_subclass__(**kwargs)` ¶

Register subclasses automatically.

`get(method)` `classmethod` ¶

Get a reconstruction class by method name.

Parameters:

Name	Type	Description	Default
`method`	`str`	The reconstruction method name (e.g., 'idw', 'ok', 'sinet', 'siren').	required

Returns:

Type	Description
`type[BaseReconstructor]`	The reconstruction class.

Raises:

Type	Description
`ValueError`	If the method is not supported.

Notes

The method parameter should reflect the NAME class attribute of the selected reconstructor class.
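Automatic registration through `__init_subclass__` combined with lookup by a `NAME` attribute can be sketched as below. The class names are stand-ins and the case-insensitive lookup is an assumption; only the general pattern is taken from the description above.

```python
class BaseReconstructorSketch:
    """Stand-in base class that records subclasses by their NAME attribute."""

    NAME = None
    _registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Every subclass that declares a NAME is added to the shared registry.
        if cls.NAME is not None:
            BaseReconstructorSketch._registry[cls.NAME] = cls

    @classmethod
    def get(cls, method):
        try:
            return cls._registry[method.lower()]
        except KeyError:
            raise ValueError(f"Unknown reconstruction method: {method!r}") from None

    @classmethod
    def get_available_methods(cls):
        return sorted(cls._registry)


class IDWSketch(BaseReconstructorSketch):
    NAME = "idw"


class KrigingSketch(BaseReconstructorSketch):
    NAME = "ok"


print(BaseReconstructorSketch.get_available_methods())
print(BaseReconstructorSketch.get("idw").__name__)
```

With this pattern, adding a new method only requires defining a subclass with a `NAME`; no central list has to be edited.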

`get_available_methods()` `classmethod` ¶

Get a list of available reconstruction methods.

Returns:

Type	Description
`list[str]`	List of method names (e.g., 'idw', 'ok', 'sinet', 'siren').

`get_hparams()` `classmethod` ¶

Get hyperparameter definitions from Hyperparameter descriptors.

Returns:

Type	Description
`dict[str, dict[str, any]]`	Dictionary mapping parameter names to their definitions. Each definition contains: 'type' (the parameter type); 'bounds' (a (min, max) tuple for numeric parameters, if defined); 'values' (a list of valid values for categorical parameters, if defined); 'default' (the default value, if defined).
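A descriptor-based hyperparameter and the dictionary collected from such descriptors might look like the sketch below. `HyperparameterSketch` and `ReconstructorSketch` are hypothetical stand-ins, not the library's classes; the bounds and defaults are the `power` and `k` values listed for IDW optimization in this reference.

```python
class HyperparameterSketch:
    """Descriptor that stores a typed value and validates it against bounds."""

    def __init__(self, type_, bounds=None, values=None, default=None):
        self.type_ = type_
        self.bounds = bounds
        self.values = values
        self.default = default

    def __set_name__(self, owner, name):
        self.storage = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.storage, self.default)

    def __set__(self, obj, value):
        value = self.type_(value)
        if self.bounds is not None:
            low, high = self.bounds
            if not low <= value <= high:
                raise ValueError(f"{value} is outside bounds {self.bounds}")
        if self.values is not None and value not in self.values:
            raise ValueError(f"{value!r} is not one of {self.values}")
        setattr(obj, self.storage, value)

    def definition(self):
        """Describe the parameter, omitting keys that are not defined."""
        spec = {"type": self.type_}
        for key in ("bounds", "values", "default"):
            if getattr(self, key) is not None:
                spec[key] = getattr(self, key)
        return spec


class ReconstructorSketch:
    power = HyperparameterSketch(float, bounds=(1e-10, 5.0), default=2.0)
    k = HyperparameterSketch(int, bounds=(1, 50), default=5)

    @classmethod
    def get_hparams(cls):
        # vars() yields the raw descriptors without triggering __get__.
        return {
            name: attr.definition()
            for name, attr in vars(cls).items()
            if isinstance(attr, HyperparameterSketch)
        }


print(ReconstructorSketch.get_hparams())
```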

`reconstruct()` `abstractmethod` ¶

Reconstruct the dataset using the specified method.

This is an abstract method that must be implemented by subclasses.

The data are reconstructed for the target domain, passed in the initializer.

Returns:

Type	Description
`BaseClimatrixDataset`	The reconstructed dataset.

`update_bounds(bounds=None, values=None)` `classmethod` ¶

Update the bounds of hyperparameters in the class.

If a bound is given as a tuple, it represents a range (min, max). If it is given as a list, it represents a set of valid values.

Parameters:

Name	Type	Description	Default
`**bounds`	`dict[str, tuple]`	Keyword arguments where keys are hyperparameter names and values are tuples defining new bounds.	`None`

`climatrix.reconstruct.idw.IDWReconstructor` ¶

Bases: BaseReconstructor

Inverse Distance Weighting Reconstructor

This class performs spatial interpolation using inverse distance weighting, where the influence of each known data point on the interpolated value decreases with distance according to a power function.

Parameters:

Name	Type	Description	Default
`dataset`	`BaseClimatrixDataset`	The input dataset to reconstruct.	required
`target_domain`	`Domain`	The target domain for reconstruction.	required
`power`	`float`	The power to raise the distance to (default is 2.0). Controls the rate of decrease of influence with distance. Type: float, bounds: , default: 2.0	`None`
`k`	`int`	The number of nearest neighbors to consider (default is 5). Type: int, bounds: (1, ...), default: 5	`None`
`k_min`	`int`	The minimum number of nearest neighbors required; where fewer are available, NaN values are assigned (default is 2). Type: int, bounds: (1, ...), default: 2	`None`

Raises:

Type	Description
`NotImplementedError`	If the input dataset is dynamic, as IDW reconstruction is not yet supported for dynamic datasets.
`ValueError`	If k_min is greater than k or if k is less than 1.

Notes

Hyperparameters for optimization:

- power: float in (1e-10, 5.0), default=2.0
- k: int in (1, 50), default=5
- k_min: int in (1, 40), default=2

`reconstruct()` ¶

Perform Inverse Distance Weighting (IDW) reconstruction.

This method reconstructs the sparse dataset using IDW, taking into account the specified number of nearest neighbors and the power to which distances are raised. The reconstructed data is returned as a dense dataset, either static or dynamic based on the input dataset.

Returns:

Type	Description
`BaseClimatrixDataset`	The reconstructed dataset on the target domain.

Notes

If fewer than self.k_min neighbors are available, NaN values are assigned to the corresponding points in the output.
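The roles of `power`, `k`, and `k_min` can be seen in a compact inverse distance weighting sketch. This is a generic textbook formulation written for illustration, not the climatrix code: it uses Euclidean distance on raw coordinates, and it assumes that samples with NaN values are the reason fewer than `k_min` neighbours may be available.

```python
import math


def idw(points, values, query, power=2.0, k=5, k_min=2):
    """Estimate the value at `query` from the k nearest known points."""
    # Assumption: samples with NaN values are unusable and are skipped.
    known = [
        (math.dist(p, query), v)
        for p, v in zip(points, values)
        if not math.isnan(v)
    ]
    nearest = sorted(known)[:k]
    if len(nearest) < k_min:
        return math.nan
    # A query that coincides with a sample takes that sample's value.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    # Each neighbour is weighted by 1 / distance**power.
    weights = [1.0 / d**power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


samples = [(0.0, 0.0), (2.0, 0.0)]
observed = [0.0, 10.0]
print(idw(samples, observed, (1.0, 0.0)))
```

A larger `power` makes the estimate follow the closest sample more strongly; a larger `k` smooths it over more neighbours.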

`climatrix.reconstruct.kriging.OrdinaryKrigingReconstructor` ¶

Bases: BaseReconstructor

Reconstruct a sparse dataset using Ordinary Kriging.

This class performs spatial interpolation using ordinary kriging, a geostatistical method that provides optimal linear unbiased estimation by modeling spatial correlation through variograms.

Parameters:

Name	Type	Description	Default
`dataset`	`SparseDataset`	The sparse dataset to reconstruct.	required
`target_domain`	`Domain`	The target domain for reconstruction.	required
`backend`	`Literal['vectorized', 'loop'] \| None`	The backend to use for kriging (default is None).	`None`
`nlags`	`int \| None`	Number of lags for variogram computation (default is 6). Type: int, bounds: (0, ...), default: 6	`None`
`anisotropy_scaling`	`float \| None`	Anisotropy scaling factor (default is 1e-6). Type: float, bounds: , default: 1e-6	`None`
`coordinates_type`	`str \| None`	Type of coordinate system (default is "euclidean"). Type: str, values: ["euclidean", "geographic"], default: "euclidean"	`None`
`variogram_model`	`str \| None`	Variogram model to use (default is "linear"). Type: str, values: ["linear", "power", "gaussian", "spherical", "exponential"], default: "linear"	`None`
`pseudo_inv`	`bool`	Whether to use pseudo-inverse for matrix operations (default is False).	`False`

Attributes:

Name	Type	Description
`dataset`	`SparseDataset`	The sparse dataset to reconstruct.
`domain`	`Domain`	The target domain for reconstruction.
`pykrige_kwargs`	`dict`	Additional keyword arguments to pass to pykrige.
`backend`	`Literal['vectorized', 'loop'] \| None`	The backend to use for kriging.
`_MAX_VECTORIZED_SIZE`	`ClassVar[int]`	The maximum size for vectorized kriging. If the dataset is larger than this size, loop kriging will be used (if `backend` was not specified)

Notes

Hyperparameters for optimization:

- nlags: int in (4, 20), default=6
- anisotropy_scaling: float in (1e-6, 5.0), default=1e-6
- coordinates_type: str in ["euclidean", "geographic"], default="euclidean"
- variogram_model: str in ["linear", "power", "gaussian", "spherical", "exponential"], default="linear"

`reconstruct()` ¶

Perform Ordinary Kriging reconstruction of the dataset.

Returns:

Type	Description
`BaseClimatrixDataset`	The dataset reconstructed on the target domain.

Notes

If `backend` was not specified, it is chosen based on the size of the dataset: when the dataset is larger than `_MAX_VECTORIZED_SIZE`, the loop backend is used.
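The backend selection rule can be written down in a few lines. The threshold value below is purely hypothetical (the actual `_MAX_VECTORIZED_SIZE` is not given in this reference), and how the dataset "size" is measured is also an assumption.

```python
MAX_VECTORIZED_SIZE = 500_000  # hypothetical threshold, not the library's value


def choose_backend(n_points, backend=None, max_vectorized_size=MAX_VECTORIZED_SIZE):
    """Return the explicit backend, or pick one from the dataset size."""
    if backend is not None:
        if backend not in ("vectorized", "loop"):
            raise ValueError(f"Unknown backend: {backend!r}")
        return backend
    # Large problems fall back to the slower but memory-friendly loop backend.
    return "loop" if n_points > max_vectorized_size else "vectorized"


print(choose_backend(1_000))
print(choose_backend(2_000_000))
```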

`climatrix.reconstruct.siren.siren.SIRENReconstructor` ¶

Bases: BaseReconstructor

A reconstructor that uses SIREN to reconstruct fields.

SIREN (Sinusoidal Representation Networks) uses sinusoidal activation functions to learn continuous implicit neural representations of spatial fields from sparse observations.

Parameters:

Name	Type	Description	Default
`dataset`	`BaseClimatrixDataset`	Source dataset to reconstruct from.	required
`target_domain`	`Domain`	Target domain to reconstruct onto.	required
`on_surface_points`	`int`	Number of points to sample on the surface for training.	`1024`
`hidden_features`	`int`	Number of features in each hidden layer.	`256`
`hidden_layers`	`int`	Number of hidden layers in the SIREN model.	`4`
`omega_0`	`float`	Frequency multiplier for the first layer.	`30.0`
`omega_hidden`	`float`	Frequency multiplier for hidden layers.	`30.0`
`lr`	`float`	Learning rate for the optimizer. Type: float, bounds: , default: 1e-3	`1e-4`
`batch_size`	`int`	Batch size for training. Type: int, bounds: , default: 256	`256`
`num_epochs`	`int`	Number of epochs to train for. Type: int, bounds: , default: 5_000	`100`
`hidden_dim`	`int`	Hidden layer dimensions. Type: int, bounds: , default: 256	`256`
`num_layers`	`int`	Number of hidden layers. Type: int, bounds: , default: 4	`4`
`num_workers`	`int`	Number of worker processes for the dataloader.	`0`
`device`	`str`	Device to run the model on ("cuda" or "cpu").	`"cuda"`
`gradient_clipping_value`	`float or None`	Value for gradient clipping (None to disable). Type: float, bounds: , default: 1.0	`None`
`checkpoint`	`str or PathLike or Path or None`	Path to save/load model checkpoint from.	`None`
`sdf_loss_weight`	`float`	Weight for the SDF constraint loss.	`3000.0`
`inter_loss_weight`	`float`	Weight for the interpolation consistency loss.	`100.0`
`normal_loss_weight`	`float`	Weight for the surface normal loss.	`100.0`
`grad_loss_weight`	`float`	Weight for the gradient regularization loss.	`50.0`

Raises:

Type	Description
`NotImplementedError`	If trying to use SIREN with a dynamic dataset.

Notes

Hyperparameters for optimization:

- lr: float in (1e-5, 1e-2), default=1e-3
- batch_size: int in (64, 1024), default=256
- num_epochs: int in (100, 10_000), default=5_000
- hidden_dim: int in (128, 512), default=256
- num_layers: int in (3, 8), default=4
- gradient_clipping_value: float in (0.1, 10.0), default=1.0
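The sinusoidal activation that gives SIREN its name applies a sine to a scaled affine map, y = sin(omega * (W x + b)). The sketch below shows a single such layer in plain Python to clarify what the frequency multipliers `omega_0` and `omega_hidden` scale. It illustrates the general SIREN formulation only; the function name is hypothetical and the layout of the climatrix model itself is not shown in this reference.

```python
import math


def sine_layer(x, weights, biases, omega=30.0):
    """Apply y_j = sin(omega * (sum_i W[j][i] * x[i] + b[j])) to one input vector."""
    return [
        math.sin(omega * (sum(w * xi for w, xi in zip(row, x)) + b))
        for row, b in zip(weights, biases)
    ]


# Two inputs (e.g. scaled longitude and latitude) mapped to three hidden units.
hidden = sine_layer(
    [0.1, -0.2],
    weights=[[0.5, 0.0], [0.0, 0.5], [0.3, 0.3]],
    biases=[0.0, 0.0, 0.1],
)
print(len(hidden))
```

A larger `omega` lets the layer represent higher spatial frequencies, at the cost of a harder optimization problem.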

`configure_optimizer(model)` ¶

Configure the optimizer for the model.

Parameters:

Name	Type	Description	Default
`model`	`Module`	The model to optimize.	required

Returns:

Type	Description
`Optimizer`	Configured Adam optimizer.

`init_model()` ¶

Initialize the 3D SIREN model.

Returns:

Type	Description
`Module`	Initialized SIREN model on the appropriate device.

`reconstruct()` ¶

Train (if necessary) and use a SIREN model to reconstruct the field.

This method is the main entry point for using the SIREN reconstructor. It will train a new model if no checkpoint was loaded, and then use the model to reconstruct the field on the target domain.

Returns:

Type	Description
`BaseClimatrixDataset`	A dataset containing the reconstructed field.

Raises:

Type	Description
`ImportError`	If required dependencies are not installed.

⚖️ Evaluation¶

`climatrix.comparison.Comparison` ¶

Class for comparing two datasets (dense or sparse).

For sparse domains, the comparison uses nearest-neighbor matching, with an optional distance threshold, to find corresponding observations.

Attributes:

Name	Type	Description
`predicted_dataset`	`BaseClimatrixDataset`	The predicted/source dataset.
`true_dataset`	`BaseClimatrixDataset`	The true/target dataset.
`diff`	`BaseClimatrixDataset`	The difference between the predicted and true datasets.
`distance_threshold`	`(float, optional)`	Maximum distance for point correspondence in sparse domains.

Parameters:

Name	Type	Description	Default
`predicted_dataset`	`BaseClimatrixDataset`	The predicted/source dataset.	required
`true_dataset`	`BaseClimatrixDataset`	The true/target dataset.	required
`map_nan_from_source`	`bool`	If True, the NaN values from the source dataset will be mapped to the target dataset. If False, the NaN values from the target dataset will be used. Default is None, which means `False` for sparse datasets and `True` for dense datasets.	`None`
`distance_threshold`	`float`	For sparse domains, maximum distance threshold for considering points as corresponding. If None, closest points are always matched. Only used when both datasets have sparse domains.	`None`
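To make the role of `distance_threshold` concrete, here is a minimal sketch of nearest-neighbor matching between two sparse point sets. It is not the climatrix implementation: `match_nearest` is a hypothetical helper, the brute-force search is chosen for readability, and plain Euclidean distance in degrees is a simplifying assumption.

```python
import math


def match_nearest(predicted, true, distance_threshold=None):
    """Pair each predicted point with its nearest true point.

    Points are (lat, lon) tuples. Returns (predicted_index, true_index) pairs.
    A pair is dropped when its distance exceeds `distance_threshold`; with
    `distance_threshold=None` the closest point is always matched.
    """
    pairs = []
    for i, (plat, plon) in enumerate(predicted):
        best_j, best_d = None, math.inf
        for j, (tlat, tlon) in enumerate(true):
            d = math.hypot(plat - tlat, plon - tlon)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is None:
            continue  # there are no true points to match against
        if distance_threshold is None or best_d <= distance_threshold:
            pairs.append((i, best_j))
    return pairs


predicted = [(50.0, 10.0), (52.0, 14.0), (40.0, 0.0)]
true = [(50.1, 10.1), (52.5, 14.0)]

print(match_nearest(predicted, true))                          # all three points matched
print(match_nearest(predicted, true, distance_threshold=1.0))  # the far point is dropped
```

Without a threshold, the third predicted point is paired with a true point more than ten degrees away, which is exactly the kind of spurious correspondence a threshold is meant to filter out.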

`plot_diff(title=None, target=None, show=False, ax=None, **kwargs)` ¶

Plot the difference between the source and target datasets.

Parameters:

Name	Type	Description	Default
`title`	`str`	Title of the plot. If not provided, the name of the dataset will be used. If the dataset has no name, "Climatrix Dataset" will be used.	`None`
`target`	`str, os.PathLike, Path, or None`	Path to save the plot. If not provided, the plot will not be saved.	`None`
`show`	`bool`	Whether to show the plot. Default is False.	`False`
`ax`	`Axes`	Axes to plot on. If not provided, a new figure and axes will be created.	`None`
`**kwargs`	`dict`	Additional keyword arguments to pass to the plotting function: `figsize` (tuple, optional): size of the figure, default (12, 6); `vmin` (float, optional): minimum value for the color scale, default None; `vmax` (float, optional): maximum value for the color scale, default None; `cmap` (str, optional): colormap to use for the plot, default "seismic"; `size` (int, optional): size of the points for sparse datasets, default 10.	`{}`

Returns:

Type	Description
`Axes`	The matplotlib axes containing the plot of the difference.

`plot_signed_diff_hist(ax=None, n_bins=50, limits=None, label=None, alpha=1.0)` ¶

Plot the histogram of signed difference between datasets.

The signed difference is a dataset that is positive where the source dataset is larger than the target dataset and negative where it is smaller.

Parameters:

Name	Type	Description	Default
`ax`	`Axes`	The matplotlib axes on which to plot the histogram. If None, a new set of axes will be created.	`None`
`n_bins`	`int`	The number of bins to use in the histogram (default is 50).	`50`
`limits`	`tuple[float]`	The limits of values to include in the histogram (default is None).	`None`

Returns:

Type	Description
`Axes`	The matplotlib axes containing the plot of the signed difference.

`compute_rmse()` ¶

Compute the RMSE between the source and target datasets.

Returns:

Type	Description
`float`	The RMSE between the source and target datasets.

`compute_mae()` ¶

Compute the MAE between the source and target datasets.

Returns:

Type	Description
`float`	The mean absolute error between the source and target datasets.

`compute_r2()` ¶

Compute the R^2 between the source and target datasets.

Returns:

Type	Description
`float`	The R^2 between the source and target datasets.

`compute_max_abs_error()` ¶

Compute the maximum absolute error between datasets.

Returns:

Type	Description
`float`	The maximum absolute error between the source and target datasets.
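For reference, the four metrics can be sketched on plain lists of paired values. These are the textbook definitions; that climatrix uses exactly these formulas (in particular, R^2 as one minus the ratio of residual to total sum of squares) is an assumption, and NaN handling is left out.

```python
import math


def rmse(source, target):
    """Root mean squared error."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(source, target)) / len(target))


def mae(source, target):
    """Mean absolute error."""
    return sum(abs(s - t) for s, t in zip(source, target)) / len(target)


def r2(source, target):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_target = sum(target) / len(target)
    ss_res = sum((t - s) ** 2 for s, t in zip(source, target))
    ss_tot = sum((t - mean_target) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot


def max_abs_error(source, target):
    """Largest absolute difference between paired values."""
    return max(abs(s - t) for s, t in zip(source, target))


source = [1.0, 2.0, 3.0, 5.0]
target = [1.0, 2.0, 4.0, 4.0]

print(rmse(source, target))
print(mae(source, target))
print(r2(source, target))
print(max_abs_error(source, target))
```

RMSE penalizes large errors more strongly than MAE, so reporting both, together with the maximum absolute error, gives a quick picture of whether a few outliers dominate the mismatch.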

`compute_report()` ¶

`save_report(target_dir)` ¶

Save a report of the comparison between passed datasets.

This method creates a directory at the specified path and saves a report of the comparison between the source and target datasets in it. The report includes plots of the difference and the signed difference between the datasets, as well as a CSV file with metrics such as the RMSE, MAE, and maximum absolute error.

Parameters:

Name	Type	Description	Default
`target_dir`	`str \| PathLike \| Path`	The path to the directory where the report should be saved.	required
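The metrics part of such a report is easy to picture as a small CSV file. The sketch below only uses the standard library; `write_metrics_csv`, the `metrics.csv` file name, and the two-column layout are all assumptions made for illustration and may differ from what `save_report` actually writes.

```python
import csv
import tempfile
from pathlib import Path


def write_metrics_csv(target_dir, metrics):
    """Create `target_dir` if needed and write the metrics as a CSV file.

    Hypothetical helper; the file name and column layout are assumptions.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    csv_path = target_dir / "metrics.csv"
    with csv_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["metric", "value"])
        for name, value in metrics.items():
            writer.writerow([name, value])
    return csv_path


with tempfile.TemporaryDirectory() as tmp:
    path = write_metrics_csv(Path(tmp) / "report", {"RMSE": 0.71, "MAE": 0.5})
    print(path.read_text())
```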

🔧 Hyperparameter Optimization¶

Climatrix provides automated hyperparameter optimization for all reconstruction methods using Bayesian optimization.

Installation¶

To use hyperparameter optimization, install climatrix with the optimization extras:

pip install climatrix[optim]

This installs the required `bayesian-optimization` dependency.
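One way to check whether an optional dependency is importable before starting a long run is to look it up without importing it. This is a generic sketch: `has_module` is a hypothetical helper, and `bayes_opt` is the import name published by the `bayesian-optimization` package, assumed here to be what the extra provides.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


if has_module("bayes_opt"):
    print("optimization extras look available")
else:
    print('missing dependency, try: pip install "climatrix[optim]"')
```

The quotes around `climatrix[optim]` keep shells such as zsh from interpreting the square brackets.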

HParamFinder¶

`climatrix.optim.HParamFinder` ¶

Bayesian hyperparameter optimization for reconstruction methods.

This class uses Bayesian optimization to find optimal hyperparameters for various reconstruction methods.

Parameters:

Name	Type	Description	Default
`method`	`str`	Reconstruction method to optimize.	required
`train_dset`	`BaseClimatrixDataset`	Training dataset used for optimization.	required
`val_dset`	`BaseClimatrixDataset`	Validation dataset used for optimization.	required
`metric`	`str`	Evaluation metric to optimize. Default is "mae". Supported metrics: "mae", "mse", "rmse".	`'mae'`
`exclude`	`str or Collection[str]`	Parameter(s) to exclude from optimization.	`None`
`include`	`str or Collection[str]`	Parameter(s) to include in optimization. If specified, only these parameters will be optimized.	`None`
`n_iters`	`int`	Total number of optimization iterations. Default is 100.	`100`
`bounds`	`dict`	Custom parameter bounds. Overrides default bounds for the method.	`None`
`random_seed`	`int`	Random seed for reproducible optimization. Default is 42.	`42`
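The interplay of `bounds`, `include`, and `exclude` can be sketched as a filter over a dictionary of parameter ranges. This is an illustration under stated assumptions, not the library code: `select_params` is a hypothetical helper, and rejecting a call that passes both `include` and `exclude` is a design choice made for this sketch.

```python
def select_params(bounds, include=None, exclude=None):
    """Return the subset of `bounds` that should be optimized.

    `include` and `exclude` may each be a single name or a collection of names.
    """
    def as_set(value):
        if value is None:
            return set()
        return {value} if isinstance(value, str) else set(value)

    included, excluded = as_set(include), as_set(exclude)
    if included and excluded:
        # Assumption: passing both is ambiguous, so this sketch rejects it.
        raise ValueError("pass either include or exclude, not both")
    unknown = (included | excluded) - set(bounds)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    if included:
        return {k: v for k, v in bounds.items() if k in included}
    return {k: v for k, v in bounds.items() if k not in excluded}


# Ranges taken from the SIREN hyperparameter notes in this reference.
bounds = {"lr": (1e-5, 1e-2), "batch_size": (64, 1024), "num_layers": (3, 8)}

print(select_params(bounds, include="lr"))
print(select_params(bounds, exclude=["lr", "batch_size"]))
```

Narrowing the search to one or two parameters usually lets a fixed budget of iterations explore each of them more thoroughly.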

Attributes:

Name	Type	Description
`train_dset`	`BaseClimatrixDataset`	Training dataset.
`val_dset`	`BaseClimatrixDataset`	Validation dataset.
`metric`	`MetricType`	Evaluation metric.
`method`	`str`	Reconstruction method.
`bounds`	`dict`	Parameter bounds for optimization.
`n_iter`	`int`	Number of optimization iterations.
`random_seed`	`int`	Random seed for optimization.
`verbose`	`int`	Verbosity level for logging (0 - silent, 1 - info, 2 - debug).
`n_startup_trials`	`int`	Number of startup trials for the optimizer.
`n_warmup_steps`	`int`	Number of warmup steps before starting optimization.
`result`	`dict`	Dictionary containing optimization results: `'best_params'` (best hyperparameters found, with correct types); `'best_score'` (best score achieved, as a negative metric value); `'metric_name'` (name of the optimized metric); `'method'` (reconstruction method used); `'n_trials'` (total number of trials performed).

`optimize()` ¶

Run Bayesian optimization to find optimal hyperparameters.

Returns:

Type	Description
`dict[str, Any]`	Dictionary containing: `'best_params'` (best hyperparameters found, with correct types); `'best_score'` (best score achieved, as a negative metric value); `'history'` (optimization history); `'metric_name'` (name of the optimized metric); `'method'` (reconstruction method used).
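The "negative metric value" convention is easier to see in a toy loop: the optimizer maximizes a score, so an error metric is negated before it is compared. The sketch below deliberately swaps Bayesian optimization for plain random search to stay dependency-free. `toy_mae`, `random_search`, and the `power` parameter are hypothetical, and the returned keys merely mirror the ones listed above.

```python
import random


def toy_mae(params):
    """Stand-in objective: pretend the validation MAE is smallest at power = 2."""
    return abs(params["power"] - 2.0) + 0.1


def random_search(bounds, n_iters=100, random_seed=42):
    """Random search that maximizes the negated error metric."""
    rng = random.Random(random_seed)
    history = []
    best_params, best_score = None, float("-inf")
    for _ in range(n_iters):
        params = {
            name: rng.uniform(low, high) for name, (low, high) in bounds.items()
        }
        score = -toy_mae(params)  # negate so that a larger score is better
        history.append((params, score))
        if score > best_score:
            best_params, best_score = params, score
    return {
        "best_params": best_params,
        "best_score": best_score,
        "history": history,
        "metric_name": "mae",
        "method": "toy",
    }


result = random_search({"power": (0.5, 5.0)})
print(result["best_params"])
print(-result["best_score"])  # flip the sign to recover the error itself
```

When reading `best_score`, remember to flip the sign back before comparing it with metrics computed elsewhere.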

Supported Methods and Parameters¶

The hyperparameter optimizer supports all reconstruction methods available in Climatrix. For detailed information about each method's hyperparameters, including their types, bounds, and default values, see the Reference documentation for each reconstruction class:

IDWReconstructor - Inverse Distance Weighting
OrdinaryKrigingReconstructor - Ordinary Kriging
SiNETReconstructor - Spatial Interpolation NET
SIRENReconstructor - Sinusoidal INR
