🧪 API Reference

Welcome to the climatrix API reference. Below you'll find details on key modules, classes, and methods, with examples and usage tips to help you integrate the library smoothly into your climate data workflows.


Abstract

The main module, climatrix, provides tools that extend xarray datasets with climate-oriented subsetting, sampling, and reconstruction. These tools are exposed through the cm accessor on xarray objects.


The library contains a few public classes:

Class name Description
AxisType Enumerator class for type of spatio-temporal axes
Axis Class managing spatio-temporal axes
BaseClimatrixDataset Base class for managing xarray data
Domain Base class for domain-specific operations
SparseDomain Subclass of Domain aimed at managing sparse representations
DenseDomain Subclass of Domain aimed at managing dense representations
Plot Interactive plotting utility for climate datasets

📈 Axes

climatrix.dataset.axis.AxisType

Bases: StrEnum

Enum for axis types.

Attributes:

Name Type Description
LATITUDE str

Latitude axis type.

LONGITUDE str

Longitude axis type.

TIME str

Time axis type.

VERTICAL str

Vertical axis type.

POINT str

Point axis type.

get(value) classmethod

Get the AxisType type given by value.

If value is an instance of AxisType, return it as is. If value is a string, return the corresponding AxisType. If value is neither an instance of AxisType nor a string, raise a ValueError.

Parameters:

Name Type Description Default
value str or AxisType

The axis type

required

Returns:

Type Description
AxisType

The axis type.

Raises:

Type Description
ValueError

If value is not a valid axis type.
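
The lookup rules described for get can be sketched with a plain enum. This is a minimal illustration, not the library's code: the class name AxisTypeSketch, the member values, and the case-insensitive matching are all assumptions.

```python
from enum import Enum


class AxisTypeSketch(str, Enum):
    """Hypothetical stand-in for AxisType; member values are assumed."""

    LATITUDE = "latitude"
    LONGITUDE = "longitude"
    TIME = "time"
    VERTICAL = "vertical"
    POINT = "point"

    @classmethod
    def get(cls, value):
        # An existing member is returned unchanged.
        if isinstance(value, cls):
            return value
        # A string is looked up by value (case-insensitive here, by assumption).
        if isinstance(value, str):
            try:
                return cls(value.lower())
            except ValueError:
                raise ValueError(f"Unknown axis type: {value!r}") from None
        # Anything else is rejected.
        raise ValueError(f"Expected str or axis type, got {type(value).__name__}")
```

Accepting both members and strings in one entry point lets callers pass either form without checking the type first.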

climatrix.dataset.axis.Axis

Base class for axis types.

Attributes:

Name Type Description
type ClassVar[AxisType]

The type of the axis.

dtype ClassVar[dtype]

The data type of the axis values.

is_dimension bool

Whether the axis is a dimension or not.

name str

The name of the axis.

values ndarray

The values of the axis.

Parameters:

Name Type Description Default
name str

The name of the axis.

required
values ndarray

The values of the axis.

required
is_dimension bool

Whether the axis is a dimension or not (default is True).

True

Examples:

Axis is a factory class for all axis types. To create an axis (by matching the name), use:

>>> import numpy as np
>>> from climatrix.dataset.axis import Axis, Latitude
>>> axis = Axis(name="latitude", values=np.array([1, 2, 3]))

To create a Latitude axis explicitly, use:

>>> axis = Latitude(name="latitude", values=np.array([1, 2, 3]))
>>> axis = Latitude(
... name="latitude",
... values=np.array([1, 2, 3]),
... is_dimension=True)
Notes
  • The Axis class is a factory class for all axis types.
  • If the given axis has an "unusual" name, you need to create it explicitly using the corresponding class (e.g. Latitude).
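
One way a factory class can dispatch on the axis name is sketched below. The class names, the regular expressions, and the use of `__new__` are assumptions made for illustration; the actual matching rules in climatrix may differ.

```python
import re


class AxisSketch:
    """Hypothetical factory base; not the climatrix implementation."""

    _pattern = None  # each subclass supplies a compiled regex (assumed)

    def __new__(cls, name, values, is_dimension=True):
        # Called on the base class: pick the first subclass whose pattern matches.
        if cls is AxisSketch:
            for sub in cls.__subclasses__():
                if sub.matches(name):
                    return object.__new__(sub)
            raise ValueError(f"No axis class matches name {name!r}")
        # Called on a concrete subclass: build it directly, whatever the name is.
        return object.__new__(cls)

    def __init__(self, name, values, is_dimension=True):
        self.name = name
        self.values = list(values)
        self.is_dimension = is_dimension

    @classmethod
    def matches(cls, name):
        return cls._pattern is not None and bool(cls._pattern.fullmatch(name.lower()))

    @property
    def size(self):
        return len(self.values)


class LatitudeSketch(AxisSketch):
    _pattern = re.compile(r"lat(itude)?")


class LongitudeSketch(AxisSketch):
    _pattern = re.compile(r"lon(gitude)?")
```

In this sketch `AxisSketch("latitude", [1, 2, 3])` yields a LatitudeSketch, while a name such as "y" matches nothing and has to go through `LatitudeSketch("y", ...)` explicitly.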

size property

Get the size of the axis.

Returns:

Type Description
int

The size of the axis.

matches(name) classmethod

Check if the axis matches the given name.

Parameters:

Name Type Description Default
name str

The name to check.

required

Returns:

Type Description
bool

True if the axis matches the name, False otherwise.

get_all_axes() classmethod

Get all axis classes.

Returns:

Type Description
list[Type[Axis]]

A list of all axis classes.

climatrix.dataset.axis.Latitude

Bases: Axis

Latitude axis.

Attributes:

Name Type Description
name str

The name of the latitude axis.

is_dimension bool

Whether the axis is a dimension or not.

climatrix.dataset.axis.Longitude

Bases: Axis

Longitude axis.

Attributes:

Name Type Description
name str

The name of the longitude axis.

is_dimension bool

Whether the axis is a dimension or not.

climatrix.dataset.axis.Time

Bases: Axis

Time axis.

Attributes:

Name Type Description
name str

The name of the time axis.

is_dimension bool

Whether the axis is a dimension or not.

__eq__(other)

Check if two axes are equal.

Parameters:

Name Type Description Default
other object

The other object to compare with.

required

Returns:

Type Description
bool

True if the axes are equal, False otherwise.

climatrix.dataset.axis.Point

Bases: Axis

Point axis.

Attributes:

Name Type Description
name str

The name of the point axis.

is_dimension bool

Whether the axis is a dimension or not.

climatrix.dataset.axis.Vertical

Bases: Axis

Vertical axis.

Attributes:

Name Type Description
name str

The name of the vertical axis.

is_dimension bool

Whether the axis is a dimension or not.

📇 Data

climatrix.dataset.base.BaseClimatrixDataset

Base class for Climatrix workflows.

This class provides a set of methods for manipulating xarray datasets. It is designed to be used as an xarray accessor, allowing you to call its methods directly on xarray datasets.

The class supports basic arithmetic operations, including: addition, subtraction, multiplication, and division.

Attributes:

Name Type Description
da DataArray

The underlying xarray.DataArray object (if single-variable xarray.Dataset was passed, it is squeezed to xarray.DataArray).

domain Domain

The domain object representing the spatial and temporal dimensions of the dataset. See SparseDomain and DenseDomain for more details.

domain = Domain(xarray_obj) instance-attribute

subset(north=None, south=None, west=None, east=None)

Subset data with the specified bounding box.

If an argument is not provided, no bound is applied in that direction. For example, if north is not provided, the maximum latitude of the dataset is used. If both north and south are provided, the dataset is subset to the area between these two latitudes.

Parameters:

Name Type Description Default
north float

North latitude of the bounding box.

None
south float

South latitude of the bounding box.

None
west float

West longitude of the bounding box.

None
east float

East longitude of the bounding box.

None

Returns:

Type Description
Self

The subsetted dataset.

Raises:

Type Description
LongitudeConventionMismatch
  • If the dataset is in positive-only convention (longitude \(\lambda \in [0, 360]\)) and negative values are requested, or vice versa.
  • If the dataset is in signed-longitude convention (longitude \(\lambda \in [-180, 180]\)) and positive values greater than 360 are requested.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> globe_dset = xr.open_dataset("path/to/dataset.nc")
>>> globe_dset
<xarray.Dataset>
Dimensions:  (time: 1, latitude: 180, longitude: 360)
Coordinates:
  * time     (time) datetime64[ns] 2020-01-01
  * latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
  * longitude (longitude) float64 0.0 1.0 2.0 ... 357.0 358.0 359.0
Data variables:
    temperature (time, latitude, longitude) float64 ...
>>> dset2 = globe_dset.cm.subset(
...     north=10.0,
...     south=5.0,
...     west=20.0,
...     east=25.0,
... )
>>> dset2 = globe_dset.cm.subset(
...     north=10.0,
...     south=5.0,
...     west=-50.0,
...     east=25.0,
... )
LongitudeConventionMismatch: The dataset is in positive-only convention
(longitude goes from 0 to 360) while you are
requesting negative values (longitude goes from -180 to 180).
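
The bounding-box and convention checks can be illustrated on a plain list of (lat, lon) pairs. This is a sketch under stated assumptions: the function and exception names are hypothetical, the positive-only convention is detected by any longitude above 180, and the signed case rejects bounds above 180.

```python
class LongitudeConventionMismatchSketch(ValueError):
    """Hypothetical stand-in for LongitudeConventionMismatch."""


def check_longitude_bounds(dataset_lons, west=None, east=None):
    """Raise if the requested bounds do not fit the dataset's convention."""
    requested = [b for b in (west, east) if b is not None]
    # Assumed detection rule: any value above 180 marks the 0..360 convention.
    positive_only = max(dataset_lons) > 180
    if positive_only and any(b < 0 for b in requested):
        raise LongitudeConventionMismatchSketch(
            "dataset uses 0..360 longitudes but a negative bound was requested"
        )
    if not positive_only and any(b > 180 for b in requested):
        raise LongitudeConventionMismatchSketch(
            "dataset uses -180..180 longitudes but a bound above 180 was requested"
        )


def subset_points(points, north=None, south=None, west=None, east=None):
    """Keep (lat, lon) pairs inside the box; a bound left as None is not applied."""
    check_longitude_bounds([lon for _, lon in points], west, east)
    return [
        (lat, lon)
        for lat, lon in points
        if (north is None or lat <= north)
        and (south is None or lat >= south)
        and (west is None or lon >= west)
        and (east is None or lon <= east)
    ]
```

Checking the convention before filtering avoids silently returning an empty selection when the request and the dataset disagree about how longitudes are written.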

to_signed_longitude()

Convert the dataset to signed longitude convention.

The longitude values are converted to be in the range (-180 to 180 degrees).

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc").cm
>>> dset.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180, longitude: 360)>
...
Dimensions:  (time: 1, latitude: 180, longitude: 360)
Coordinates:
  * time     (time) datetime64[ns] 2020-01-01
  * latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
  * longitude (longitude) float64 0.0 1.0 2.0 ... 357.0 358.0 359.0
Data variables:
    temperature (time, latitude, longitude) float64 ...
>>> dset2 = dset.to_signed_longitude()
>>> dset2.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180, longitude: 360)>
...
Dimensions:  (time: 1, latitude: 180, longitude: 360)
Coordinates:
  * time     (time) datetime64[ns] 2020-01-01
  * latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
  * longitude (longitude) float64 -180.0 -179.0 -178.0 ... 177.0 178.0 179.0
References

[1] Mancini, M., Walczak, J., Stojiljkovic, M., geokube: A Python library for geospatial data processing, 2024, https://doi.org/10.5281/zenodo.10597965, https://github.com/CMCC-Foundation/geokube

to_positive_longitude()

Convert the dataset to positive longitude convention.

The longitude values are converted to be in the range (0 to 360 degrees).

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc").cm
>>> dset.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180,
longitude: 360)>
...
Dimensions:  (time: 1, latitude: 180, longitude: 360)
Coordinates:
  * time     (time) datetime64[ns] 2020-01-01
  * latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
  * longitude (longitude) float64 -180.0 ... 178.0 179.0
Data variables:
    temperature (time, latitude, longitude) float64 ...
>>> dset2 = dset.to_positive_longitude()
>>> dset2.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180,
  longitude: 360)>
...
Dimensions:  (time: 1, latitude: 180, longitude: 360)
Coordinates:
  * time     (time) datetime64[ns] 2020-01-01
  * latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
  * longitude (longitude) float64 0.0 1.0 ... 357.0 358.0 359.0
Data variables:
    temperature (time, latitude, longitude) float64 ...
References

[1] Mancini, M., Walczak, J., Stojiljkovic, M., geokube: A Python library for geospatial data processing, 2024, https://doi.org/10.5281/zenodo.10597965, https://github.com/CMCC-Foundation/geokube
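
The arithmetic behind the two longitude conventions is short enough to show directly. The helpers below are hypothetical and work on bare lists; they are a sketch of the usual modular formulas, not the code climatrix runs.

```python
def to_signed(lons):
    """Map longitudes to [-180, 180) and return them in ascending order."""
    return sorted(((lon + 180.0) % 360.0) - 180.0 for lon in lons)


def to_positive(lons):
    """Map longitudes to [0, 360) and return them in ascending order."""
    return sorted(lon % 360.0 for lon in lons)
```

The sort matters: after conversion the coordinate is no longer monotonic (0..359 becomes 0..179 followed by -180..-1), so a real dataset has to reorder its data along the longitude axis in the same way.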

squeeze()

Squeeze the dataset to remove dimensions of size 1.

Returns:

Type Description
Self

The squeezed dataset.

profile_along_axes(*axes)

Generate profiles along the specified axes.

Parameters:

Name Type Description Default
*axes AxisType | str

The axes along which to generate profiles.

()

Yields:

Type Description
BaseClimatrixDataset

A dataset containing the profile along the specified axes.

mask_nan(source)

Apply NaN values from another dataset to the current one.

Parameters:

Name Type Description Default
source BaseClimatrixDataset

Dataset whose NaN values will be applied to the current one.

required

Returns:

Type Description
BaseClimatrixDataset

A new dataset with NaN values applied.

Raises:

Type Description
TypeError

If the source argument is not a BaseClimatrixDataset.

ValueError

If the domain of the source or the current dataset is sparse.

DomainMismatchError

If the domains of the source and the current dataset do not match.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset1 = xr.open_dataset("path/to/dataset1.nc").cm
>>> dset2 = xr.open_dataset("path/to/dataset2.nc").cm
>>> dset1.mask_nan(dset2)
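
The masking rule itself is simple: wherever the source holds NaN, the result holds NaN, and everything else is copied from the current data. A minimal pure-Python sketch follows; the function works on flat lists and uses a length check as a stand-in for the domain comparison, both of which are simplifications.

```python
import math


def mask_nan(current, source):
    """Return a copy of `current` with NaN wherever `source` holds NaN."""
    if len(current) != len(source):
        # Stand-in for the domain check; the library raises DomainMismatchError.
        raise ValueError("current and source must cover the same points")
    return [math.nan if math.isnan(s) else c for c, s in zip(current, source)]
```

With NumPy arrays the same rule is usually written as `np.where(np.isnan(source), np.nan, current)`.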

time(time)

Select data at a specific time or times.

Parameters:

Name Type Description Default
time datetime, np.datetime64, slice, list, or np.ndarray

Time or times to be selected.

required

Returns:

Type Description
Self

The dataset with the selected time or times.

Examples:

>>> from datetime import datetime
>>> import numpy as np
>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")

Selecting by datetime object:

>>> dset.cm.time(datetime(2020, 1, 1))

Selecting by np.datetime64 object:

>>> dset.cm.time(np.datetime64("2020-01-01"))

Selecting by slice object with a str stop value:

>>> dset.cm.time(slice("2020-01-01"))

Selecting by list of any of the above:

>>> dset.cm.time([datetime(2020, 1, 1), np.datetime64("2020-01-02")])

Selecting by slice object:

>>> dset.cm.time(slice(datetime(2020, 1, 1), datetime(2020, 1, 2)))

itime(time)

Select time value by index.

Parameters:

Name Type Description Default
time int, list[int], np.ndarray, or slice

Time index or indices to be selected.

required

Returns:

Type Description
Self

The dataset with the selected time or times.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")

Selecting by int object:

>>> dset.cm.itime(0)

Selecting by list of ints:

>>> dset.cm.itime([0, 1])

Selecting by slice object:

>>> dset.cm.itime(slice(0, 2))
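
The difference between time (selection by label) and itime (selection by position) can be seen on a toy series. The data and helper names below are made up for illustration.

```python
from datetime import datetime

# Toy time axis and one value per time step (illustrative data only).
times = [datetime(2020, 1, 1), datetime(2020, 1, 2), datetime(2020, 1, 3)]
values = [10.0, 11.5, 9.8]


def select_by_label(when):
    """Label-based lookup: find the entry whose timestamp equals `when`."""
    return values[times.index(when)]


def select_by_index(i):
    """Positional lookup: take the i-th entry, or a slice of entries."""
    return values[i]
```

Label-based selection keeps working if the series is reordered or trimmed, while positional selection is convenient when only the order of time steps matters.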

sample_uniform(portion=None, number=None, nan='ignore')

Sample the dataset using a uniform distribution.

Parameters:

Name Type Description Default
portion float

Portion of the dataset to be sampled.

None
number int

Number of points to be sampled.

None
nan SamplingNaNPolicy | str

Policy for handling NaN values.

'ignore'
Notes

Exactly one of portion or number must be provided; they cannot be given together.

Warns:

Type Description
TooLargeSamplePortionWarning

If portion exceeds 1.0, or number exceeds the number of spatial points in the domain.

Raises:

Type Description
ValueError

If the dataset contains NaN values and nan parameter (NaN handling policy) is set to SamplingNaNPolicy.RAISE.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")
>>> sparse_dset = dset.cm.sample_uniform(portion=0.1)
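
The argument rules for uniform sampling can be sketched as follows. The helper is hypothetical: it samples from a list of points, uses a plain warning in place of TooLargeSamplePortionWarning, and clips an oversized request, which is an assumption about what happens after the warning.

```python
import random
import warnings


def sample_uniform(points, portion=None, number=None, seed=0):
    """Draw points uniformly at random, without replacement."""
    # Exactly one of the two size arguments must be given.
    if (portion is None) == (number is None):
        raise ValueError("provide exactly one of portion or number")
    if number is None:
        number = int(portion * len(points))
    if number > len(points):
        # Clipping after the warning is assumed behaviour, not confirmed.
        warnings.warn("requested sample exceeds the number of points")
        number = len(points)
    return random.Random(seed).sample(points, number)
```

Sampling without replacement guarantees that no spatial point appears twice in the sparse result.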

sample_normal(portion=None, number=None, center_point=None, sigma=10.0, nan='ignore')

Sample the dataset using a normal distribution.

Parameters:

Name Type Description Default
portion float

Portion of the dataset to be sampled.

None
number int

Number of points to be sampled.

None
center_point tuple[Longitude, Latitude]

Center point for the normal distribution.

None
sigma float

Standard deviation for the normal distribution.

10.0
nan SamplingNaNPolicy | str

Policy for handling NaN values.

'ignore'
Notes

Exactly one of portion or number must be provided; they cannot be given together.

Warns:

Type Description
TooLargeSamplePortionWarning

If portion exceeds 1.0, or number exceeds the number of spatial points in the domain.

Raises:

Type Description
ValueError

If the dataset contains NaN values and nan parameter (NaN handling policy) is set to SamplingNaNPolicy.RAISE.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")
>>> sparse_dset = dset.cm.sample_normal(
...     number=1_000,
...     center_point=(10.0, 20.0),
...     sigma=5.0,
... )
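
One way to realise "sampling using a normal distribution" is to weight every point by a Gaussian of its distance to center_point and then draw without replacement. The sketch below does that with Efraimidis-Spirakis keys. The helper name, the planar distance in degrees, and the choice of sampling scheme are assumptions; the library may implement this differently.

```python
import math
import random


def sample_normal(points, number, center_point, sigma=10.0, seed=0):
    """Weighted sampling without replacement around (lon, lat) = center_point."""
    rng = random.Random(seed)
    lon0, lat0 = center_point
    keyed = []
    for lat, lon in points:
        dist2 = (lon - lon0) ** 2 + (lat - lat0) ** 2
        weight = math.exp(-dist2 / (2.0 * sigma**2))
        # Efraimidis-Spirakis key in log form: log(u) / w; largest keys win.
        u = 1.0 - rng.random()  # in (0, 1], so log(u) is defined
        key = math.log(u) / weight if weight > 0.0 else -math.inf
        keyed.append((key, (lat, lon)))
    keyed.sort(key=lambda item: item[0], reverse=True)
    return [point for _, point in keyed[:number]]
```

A smaller sigma concentrates the sample around the centre, and a larger one approaches uniform sampling.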

reconstruct(target, *, method, **recon_kwargs)

Reconstruct the dataset to a target domain.

If target domain is sparse, the reconstruction will be sparse too. If target domain is dense, the reconstruction will be dense too. The reconstruction will be done using the method specified in the method argument.

The method can be one of the following: Inverse Distance Weighting (idw), Ordinary Kriging (ok).

Parameters:

Name Type Description Default
target Domain

The target domain to reconstruct the dataset to.

required
method ReconstructionType | str

The method to use for reconstruction. Can be one of the following: 'idw', 'ok'.

required
recon_kwargs dict

Additional keyword arguments to pass to the reconstruction method.

{}
Returns:

Type Description
Self

The reconstructed dataset.

plot(title=None, target=None, show=True, **kwargs)

Plot the dataset on a map.

The dataset is plotted using Cartopy and Matplotlib.

Parameters:

Name Type Description Default
title str

Title of the plot. If not provided, the name of the dataset will be used. If the dataset has no name, "Climatrix Dataset" will be used.

None
target str, os.PathLike, Path, or None

Path to save the plot. If not provided, the plot will not be saved.

None
show bool

Whether to show the plot. Default is True.

True
**kwargs dict

Additional keyword arguments to pass to the plotting function.

  • figsize: tuple, optional Size of the figure. Default is (12, 6).
  • vmin: float, optional Minimum value for the color scale. Default is None.
  • vmax: float, optional Maximum value for the color scale. Default is None.
  • cmap: str, optional Colormap to use for the plot. Default is "seismic".
  • ax: Axes, optional Axes to plot on. If not provided, a new figure and axes will be created.
  • size: int, optional Size of the points for sparse datasets. Default is 10.
{}

Returns:

Type Description
Axes

The axes object containing the plot.

Raises:

Type Description
NotImplementedError

If the dataset is dynamic (contains time dimension with more than one value).

transpose(*axes)

Transpose the dataset along the specified dimensions.

Parameters:

Name Type Description Default
*axes AxisType or str

The axes along which to transpose the dataset.

()

Returns:

Type Description
Self

The transposed dataset.

Examples:

>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc").cm
>>> dset2 = dset.transpose("longitude", "latitude")

Domain

climatrix.dataset.domain.Domain

Base class for domain objects.

Attributes:

Name Type Description
is_sparse ClassVar[bool]

Indicates if the domain is sparse or dense.

_axes dict[AxisType, Axis]

Mapping of AxisType to the corresponding Axis object.

dims property

Get the dimensions of the dataset.

Returns:

Type Description
tuple[AxisType, ...]

A tuple of AxisType objects representing the dimensions of the dataset.

Notes

The dimensions are determined by the axes that are marked as dimensions in the domain. For example, if the underlying dataset has shape (5, 10, 20), there are three dimensional axes.

latitude property

Latitude axis

longitude property

Longitude axis

time property

Time axis

point property

Point axis

vertical property

Vertical axis

is_dynamic property

Whether the domain is dynamic, i.e. contains a time dimension with more than one value.

is_sparse class-attribute

size property

Domain size.

all_axes_types property

All axis types in the domain.

from_lat_lon(lat=slice(-90, 90, _DEFAULT_LAT_RESOLUTION), lon=slice(-180, 180, _DEFAULT_LON_RESOLUTION), kind='dense') classmethod

Create a domain from latitude and longitude coordinates.

Parameters:

Name Type Description Default
lat slice or ndarray

Latitude coordinates. If a slice is provided, it will be converted to a numpy array using the specified step.

slice(-90, 90, _DEFAULT_LAT_RESOLUTION)
lon slice or ndarray

Longitude coordinates. If a slice is provided, it will be converted to a numpy array using the specified step.

slice(-180, 180, _DEFAULT_LON_RESOLUTION)
kind str

Type of domain to create. Can be either "dense" or "sparse". Default is "dense".

'dense'

Returns:

Type Description
Domain

An instance of the Domain class with the specified latitude and longitude coordinates.

from_axes() classmethod

Create a domain builder for configuring domains with multiple axes.

Returns:

Type Description
DomainBuilder

A builder instance for creating domains with various axes.

Examples:

>>> domain = (Domain.from_axes()
...           .vertical(depth=slice(10, 100, 1))
...           .lat(latitude=[1,2,3,4])
...           .lon(longitude=[1,2,3,4])
...           .sparse())
>>> domain = (Domain.from_axes()
...           .lat(lat=slice(-90, 90, 1))
...           .lon(lon=slice(-180, 180, 1))
...           .time(time=['2020-01-01', '2020-01-02'])
...           .dense())

get_size(axis)

Get the size of the specified axis.

Parameters:

Name Type Description Default
axis AxisType

The axis for which to get the size.

required

Returns:

Type Description
int

The size of the specified axis.

has_axis(axis)

Check if the specified axis exists in the domain.

Parameters:

Name Type Description Default
axis AxisType

The axis type to check.

required

Returns:

Type Description
bool

True if the axis exists, False otherwise.

get_axis(axis)

Get the Axis object for the specified axis type.

Parameters:

Name Type Description Default
axis AxisType

The axis type to look up.

required

Returns:

Type Description
Axis | None

The Axis object, or None if not found.

get_all_spatial_points() abstractmethod

to_xarray(values, name=None) abstractmethod

climatrix.dataset.domain.SparseDomain

Bases: Domain

Sparse domain class.

Supports operations on sparse spatial domain.

to_xarray(values, name=None)

Convert domain to sparse xarray.DataArray.

The method applies values and (optionally) name to create a new xarray.DataArray object based on the domain.

Parameters:

Name Type Description Default
values ndarray

The values to be assigned to the DataArray variable.

required
name str

The name of the DataArray variable.

None

Returns:

Type Description
DataArray

The xarray.DataArray single variable object.

Raises:

Type Description
ValueError

If the shape of values does not match the expected shape.

Examples:

>>> domain = Domain.from_lat_lon()
>>> values = np.random.rand(5, 5)
>>> da = domain.to_xarray(values, name="example")
>>> isinstance(da, xr.DataArray)
True
>>> da.name
'example'

get_all_spatial_points()

Get all spatial points in the domain.

Returns:

Type Description
ndarray

An array of shape (n_points, 2) containing the latitude and longitude coordinates of all points in the domain.

Examples:

>>> points = domain.get_all_spatial_points()
>>> points
array([[ 0. , -0.1],
       [ 0. ,  0. ],
       [ 0. ,  0.1],
       ...

climatrix.dataset.domain.DenseDomain

Bases: Domain

Dense domain class.

Supports operations on dense spatial domain.

to_xarray(values, name=None)

Convert domain to dense xarray.DataArray.

The method applies values and (optionally) name to create a new xarray.DataArray object based on the domain.

Parameters:

Name Type Description Default
values ndarray

The values to be assigned to the DataArray variable.

required
name str

The name of the DataArray variable.

None

Returns:

Type Description
DataArray

The xarray.DataArray single variable object.

Raises:

Type Description
ValueError

If the shape of values does not match the expected shape.

Examples:

>>> domain = Domain.from_lat_lon(
...     lat=np.linspace(-10, 10, 5),
...     lon=np.linspace(-10, 10, 5),
... )
>>> values = np.random.rand(5, 5)
>>> da = domain.to_xarray(values, name="example")
>>> isinstance(da, xr.DataArray)
True
>>> da.name
'example'

get_all_spatial_points()

Get all spatial points in the domain.

Returns:

Type Description
ndarray

An array of shape (n_points, 2) containing the latitude and longitude coordinates of all points in the domain.

Examples:

>>> points = domain.get_all_spatial_points()
>>> points
array([[ 0. , -0.1],
       [ 0. ,  0. ],
       [ 0. ,  0.1],
       ...
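
For a dense grid, the (n_points, 2) layout amounts to the Cartesian product of the latitude and longitude axes. The helper below is a hypothetical sketch; the ordering (latitude varying slowest) is inferred from the sample output above and is not confirmed by the library.

```python
from itertools import product


def all_spatial_points(lats, lons):
    """Return [lat, lon] rows for every grid node, latitude varying slowest."""
    return [[lat, lon] for lat, lon in product(lats, lons)]
```

For `lats=[0.0, 0.1]` and `lons=[-0.1, 0.0, 0.1]` this produces six rows, the first three sharing latitude 0.0.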

📈 Interactive Plotting

climatrix.plot.core.Plot

show(port=5000, debug=False)

Reconstructors

climatrix.reconstruct.base.BaseReconstructor

Bases: ABC

Base class for all dataset reconstruction methods.

Attributes:

Name Type Description
dataset BaseClimatrixDataset

The dataset to be reconstructed.

target_domain Domain

The target domain for the reconstruction.

__init_subclass__(**kwargs)

Register subclasses automatically.

get(method) classmethod

Get a reconstruction class by method name.

Parameters:

Name Type Description Default
method str

The reconstruction method name (e.g., 'idw', 'ok', 'sinet', 'siren').

required

Returns:

Type Description
type[BaseReconstructor]

The reconstruction class.

Raises:

Type Description
ValueError

If the method is not supported.

Notes

The method parameter should reflect the NAME class attribute of the selected reconstructor class.
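
Automatic registration through `__init_subclass__` combined with lookup by a NAME attribute can be sketched like this. All class names are hypothetical, and the registry dictionary and the lower-casing of the method name are assumptions about the design.

```python
class ReconstructorSketch:
    """Hypothetical base class that keeps a name -> class registry."""

    NAME = None
    _registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register every subclass that declares a NAME.
        if cls.NAME is not None:
            ReconstructorSketch._registry[cls.NAME] = cls

    @classmethod
    def get(cls, method):
        try:
            return cls._registry[method.lower().strip()]
        except KeyError:
            raise ValueError(
                f"Unknown method {method!r}; available: {cls.get_available_methods()}"
            ) from None

    @classmethod
    def get_available_methods(cls):
        return sorted(cls._registry)


class IDWSketch(ReconstructorSketch):
    NAME = "idw"


class KrigingSketch(ReconstructorSketch):
    NAME = "ok"
```

With this pattern a new reconstructor becomes available by name as soon as its class is defined, without editing a central list.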

get_available_methods() classmethod

Get a list of available reconstruction methods.

Returns:

Type Description
list[str]

List of method names (e.g., 'idw', 'ok', 'sinet', 'siren').

get_hparams() classmethod

Get hyperparameter definitions from Hyperparameter descriptors.

Returns:

Type Description
dict[str, dict[str, any]]

Dictionary mapping parameter names to their definitions. Each parameter definition contains:

  • 'type': the parameter type
  • 'bounds': tuple of (min, max) for numeric parameters (if defined)
  • 'values': list of valid values for categorical parameters (if defined)
  • 'default': default value (if defined)
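
A descriptor-based layout that yields a dictionary of this shape might look like the sketch below. The names HyperparameterSketch and ModelSketch are hypothetical, and treating bounds as a closed interval is an assumption.

```python
class HyperparameterSketch:
    """Hypothetical descriptor that validates and documents one hyperparameter."""

    def __init__(self, param_type, bounds=None, values=None, default=None):
        self.param_type = param_type
        self.bounds = bounds
        self.values = values
        self.default = default

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name, self.default)

    def __set__(self, instance, value):
        # Bounds are treated as inclusive here (an assumption).
        if self.bounds is not None and not (self.bounds[0] <= value <= self.bounds[1]):
            raise ValueError(f"{self.name}={value} is outside {self.bounds}")
        if self.values is not None and value not in self.values:
            raise ValueError(f"{self.name}={value!r} is not one of {self.values}")
        instance.__dict__[self.name] = value

    def describe(self):
        # Only keys that are actually defined end up in the definition.
        definition = {"type": self.param_type}
        for key in ("bounds", "values", "default"):
            if getattr(self, key) is not None:
                definition[key] = getattr(self, key)
        return definition


class ModelSketch:
    power = HyperparameterSketch(float, bounds=(1e-10, 5.0), default=2.0)
    kind = HyperparameterSketch(str, values=["linear", "power"], default="linear")

    @classmethod
    def get_hparams(cls):
        return {
            name: attr.describe()
            for name, attr in vars(cls).items()
            if isinstance(attr, HyperparameterSketch)
        }
```

Keeping the definition next to the attribute means validation and the exported search space cannot drift apart.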

reconstruct() abstractmethod

Reconstruct the dataset using the specified method.

This is an abstract method that must be implemented by subclasses.

The data are reconstructed for the target domain, passed in the initializer.

Returns:

Type Description
BaseClimatrixDataset

The reconstructed dataset.

update_bounds(bounds=None, values=None) classmethod

Update the bounds of hyperparameters in the class.

If a bound is given as a tuple, it represents a range (min, max). If it is given as a list, it represents a set of valid values.

Parameters:

Name Type Description Default
**bounds dict[str, tuple]

Keyword arguments where keys are hyperparameter names and values are tuples defining new bounds.

None

climatrix.reconstruct.idw.IDWReconstructor

Bases: BaseReconstructor

Inverse Distance Weighting Reconstructor

This class performs spatial interpolation using inverse distance weighting, where the influence of each known data point on the interpolated value decreases with distance according to a power function.

Parameters:

Name Type Description Default
dataset BaseClimatrixDataset

The input dataset to reconstruct.

required
target_domain Domain

The target domain for reconstruction.

required
power float

The power to raise the distance to (default is 2.0). Controls the rate of decrease of influence with distance. Type: float, bounds: , default: 2.0

None
k int

The number of nearest neighbors to consider (default is 5). Type: int, bounds: (1, ...), default: 5

None
k_min int

The minimum number of nearest neighbors required; if fewer than k_min neighbors are available, NaN values are assigned (default is 2). Type: int, bounds: (1, ...), default: 2

None

Raises:

Type Description
NotImplementedError

If the input dataset is dynamic, as IDW reconstruction is not yet supported for dynamic datasets.

ValueError

If k_min is greater than k or if k is less than 1.

Notes

Hyperparameters for optimization:

  • power: float in (1e-10, 5.0), default=2.0
  • k: int in (1, 50), default=5
  • k_min: int in (1, 40), default=2

reconstruct()

Perform Inverse Distance Weighting (IDW) reconstruction.

This method reconstructs the sparse dataset using IDW, taking into account the specified number of nearest neighbors and the power to which distances are raised. The reconstructed data is returned as a dense dataset, either static or dynamic based on the input dataset.

Returns:

Type Description
BaseClimatrixDataset

The reconstructed dataset on the target domain.

Notes
  • If fewer than self.k_min neighbors are available, NaN values are assigned to the corresponding points in the output.
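
The roles of power, k, and k_min can be seen in a single-point IDW estimate. This is a minimal sketch with a hypothetical helper: it uses planar Euclidean distance and a brute-force neighbor search, whereas a real implementation would more likely use a spatial index.

```python
import math


def idw_point(target, points, values, power=2.0, k=5, k_min=2):
    """Estimate the value at `target` from the k nearest known points."""
    # Sort known points by Euclidean distance and keep the k nearest.
    nearest = sorted((math.dist(target, p), v) for p, v in zip(points, values))[:k]
    if len(nearest) < k_min:
        return math.nan  # too few neighbors to produce an estimate
    # A coincident point has zero distance: return its value directly.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / d**power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
```

A larger power makes the nearest observations dominate, while power close to zero approaches a plain average of the k neighbors.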

climatrix.reconstruct.kriging.OrdinaryKrigingReconstructor

Bases: BaseReconstructor

Reconstruct a sparse dataset using Ordinary Kriging.

This class performs spatial interpolation using ordinary kriging, a geostatistical method that provides optimal linear unbiased estimation by modeling spatial correlation through variograms.
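
For orientation, the textbook form of ordinary kriging is given below. The notation is an assumption introduced here (observations \(z(x_i)\), weights \(\lambda_i\), variogram \(\gamma\), Lagrange multiplier \(\mu\)); it describes the general method, not necessarily how the library or its backend arranges the computation.

```latex
\hat{z}(x_0) = \sum_{i=1}^{n} \lambda_i \, z(x_i),
\qquad
\sum_{i=1}^{n} \lambda_i = 1,
\qquad
\sum_{j=1}^{n} \lambda_j \, \gamma(x_i - x_j) + \mu = \gamma(x_i - x_0),
\quad i = 1, \dots, n .
```

The constraint that the weights sum to one makes the estimate unbiased for an unknown constant mean, and the variogram model (for example "linear" or "spherical") determines \(\gamma\).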

Parameters:

Name Type Description Default
dataset SparseDataset

The sparse dataset to reconstruct.

required
target_domain Domain

The target domain for reconstruction.

required
backend Literal['vectorized', 'loop'] | None

The backend to use for kriging (default is None).

None
nlags int | None

Number of lags for variogram computation (default is 6). Type: int, bounds: (0, ...), default: 6

None
anisotropy_scaling float | None

Anisotropy scaling factor (default is 1e-6). Type: float, bounds: , default: 1e-6

None
coordinates_type str | None

Type of coordinate system (default is "euclidean"). Type: str, values: ["euclidean", "geographic"], default: "euclidean"

None
variogram_model str | None

Variogram model to use (default is "linear"). Type: str, values: ["linear", "power", "gaussian", "spherical", "exponential"], default: "linear"

None
pseudo_inv bool

Whether to use pseudo-inverse for matrix operations (default is False).

False

Attributes:

Name Type Description
dataset SparseDataset

The sparse dataset to reconstruct.

domain Domain

The target domain for reconstruction.

pykrige_kwargs dict

Additional keyword arguments to pass to pykrige.

backend Literal['vectorized', 'loop'] | None

The backend to use for kriging.

_MAX_VECTORIZED_SIZE ClassVar[int]

The maximum size for vectorized kriging. If the dataset is larger than this size, loop kriging will be used (if backend was not specified)

Notes

Hyperparameters for optimization:

  • nlags: int in (4, 20), default=6
  • anisotropy_scaling: float in (1e-6, 5.0), default=1e-6
  • coordinates_type: str in ["euclidean", "geographic"], default="euclidean"
  • variogram_model: str in ["linear", "power", "gaussian", "spherical", "exponential"], default="linear"

reconstruct()

Perform Ordinary Kriging reconstruction of the dataset.

Returns:

Type Description
BaseClimatrixDataset

The dataset reconstructed on the target domain.

Notes
  • The backend is chosen based on the size of the dataset. If the dataset is larger than the maximum size, the loop backend is used.

climatrix.reconstruct.siren.siren.SIRENReconstructor

Bases: BaseReconstructor

A reconstructor that uses SIREN to reconstruct fields.

SIREN (Sinusoidal Representation Networks) uses sinusoidal activation functions to learn continuous implicit neural representations of spatial fields from sparse observations.
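
A single sinusoidal layer computes sin(omega * (W x + b)). The dependency-free sketch below shows two such layers chained together, using the uniform initialisation ranges proposed in the original SIREN paper. The helper name is hypothetical, biases are set to zero for simplicity, and none of this is taken from the climatrix model code.

```python
import math
import random


def make_sine_layer(in_features, out_features, omega, is_first, rng):
    """Build one layer computing sin(omega * (W x + b)) with SIREN-style init."""
    if is_first:
        bound = 1.0 / in_features
    else:
        bound = math.sqrt(6.0 / in_features) / omega
    weights = [
        [rng.uniform(-bound, bound) for _ in range(in_features)]
        for _ in range(out_features)
    ]
    biases = [0.0] * out_features  # zero bias is a simplification

    def forward(x):
        return [
            math.sin(omega * (sum(w * xi for w, xi in zip(row, x)) + b))
            for row, b in zip(weights, biases)
        ]

    return forward


rng = random.Random(0)
first = make_sine_layer(2, 8, omega=30.0, is_first=True, rng=rng)
hidden = make_sine_layer(8, 8, omega=30.0, is_first=False, rng=rng)
features = hidden(first([0.25, -0.5]))  # 8 activations for one (lon, lat) input
```

The omega arguments correspond to the omega_0 and omega_hidden parameters listed below: larger values let the network represent higher spatial frequencies.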

Parameters:

Name Type Description Default
dataset BaseClimatrixDataset

Source dataset to reconstruct from.

required
target_domain Domain

Target domain to reconstruct onto.

required
on_surface_points int

Number of points to sample on the surface for training.

1024
hidden_features int

Number of features in each hidden layer.

256
hidden_layers int

Number of hidden layers in the SIREN model.

4
omega_0 float

Frequency multiplier for the first layer.

30.0
omega_hidden float

Frequency multiplier for hidden layers.

30.0
lr float

Learning rate for the optimizer. Type: float, bounds: , default: 1e-3

1e-4
batch_size int

Batch size for training. Type: int, bounds: , default: 256

256
num_epochs int

Number of epochs to train for. Type: int, bounds: , default: 5_000

100
hidden_dim int

Hidden layer dimensions. Type: int, bounds: , default: 256

256
num_layers int

Number of hidden layers. Type: int, bounds: , default: 4

4
num_workers int

Number of worker processes for the dataloader.

0
device str

Device to run the model on ("cuda" or "cpu").

"cuda"
gradient_clipping_value float or None

Value for gradient clipping (None to disable). Type: float, bounds: , default: 1.0

None
checkpoint str or PathLike or Path or None

Path to save/load model checkpoint from.

None
sdf_loss_weight float

Weight for the SDF constraint loss.

3000.0
inter_loss_weight float

Weight for the interpolation consistency loss.

100.0
normal_loss_weight float

Weight for the surface normal loss.

100.0
grad_loss_weight float

Weight for the gradient regularization loss.

50.0

Raises:

Type Description
NotImplementedError

If trying to use SIREN with a dynamic dataset.

Notes

Hyperparameters for optimization:

  • lr: float in (1e-5, 1e-2), default=1e-3
  • batch_size: int in (64, 1024), default=256
  • num_epochs: int in (100, 10_000), default=5_000
  • hidden_dim: int in (128, 512), default=256
  • num_layers: int in (3, 8), default=4
  • gradient_clipping_value: float in (0.1, 10.0), default=1.0

configure_optimizer(model)

Configure the optimizer for the model.

Parameters:

Name Type Description Default
model Module

The model to optimize.

required

Returns:

Type Description
Optimizer

Configured Adam optimizer.

init_model()

Initialize the 3D SIREN model.

Returns:

Type Description
Module

Initialized SIREN model on the appropriate device.

reconstruct()

Train (if necessary) and use a SIREN model to reconstruct the field.

This method is the main entry point for using the SIREN reconstructor. It will train a new model if no checkpoint was loaded, and then use the model to reconstruct the field on the target domain.

Returns:

Type Description
BaseClimatrixDataset

A dataset containing the reconstructed field.

Raises:

Type Description
ImportError

If required dependencies are not installed.

⚖️ Evaluation

climatrix.comparison.Comparison

Class for comparing two datasets (dense or sparse).

For sparse domains, uses nearest neighbor matching with optional distance thresholds to find corresponding observations.

Attributes:

Name Type Description
predicted_dataset BaseClimatrixDataset

The predicted/source dataset.

true_dataset BaseClimatrixDataset

The true/target dataset.

diff BaseClimatrixDataset

The difference between the predicted and true datasets.

distance_threshold float, optional

Maximum distance for point correspondence in sparse domains.

Parameters:

Name Type Description Default
predicted_dataset BaseClimatrixDataset

The predicted/source dataset.

required
true_dataset BaseClimatrixDataset

The true/target dataset.

required
map_nan_from_source bool

If True, the NaN values from the source dataset will be mapped to the target dataset. If False, the NaN values from the target dataset will be used. Default is None, which means False for sparse datasets and True for dense datasets.

None
distance_threshold float

For sparse domains, maximum distance threshold for considering points as corresponding. If None, closest points are always matched. Only used when both datasets have sparse domains.

None
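
The nearest-neighbor matching described for sparse domains can be sketched as below. This is a minimal brute-force illustration, not the library's implementation: the function name `match_nearest`, the `(lat, lon)` tuple layout, and the use of plain Euclidean distance in degrees are all assumptions.

```python
import math


def match_nearest(pred_points, true_points, distance_threshold=None):
    """Pair each predicted point with its nearest true point.

    Points are (lat, lon) tuples. Distance is plain Euclidean distance in
    degrees, which is an assumption made for this sketch.
    Returns a list of (pred_index, true_index) pairs.
    """
    pairs = []
    for i, p in enumerate(pred_points):
        # Brute-force search for the closest true point.
        j, dist = min(
            ((k, math.dist(p, t)) for k, t in enumerate(true_points)),
            key=lambda item: item[1],
        )
        # With no threshold, the closest point is always accepted.
        if distance_threshold is None or dist <= distance_threshold:
            pairs.append((i, j))
    return pairs


pred = [(50.0, 10.0), (52.0, 14.0)]
true = [(50.1, 10.0), (60.0, 20.0)]

all_pairs = match_nearest(pred, true)
close_pairs = match_nearest(pred, true, distance_threshold=0.5)
```

Without a threshold, every predicted point gets a partner, however far away. With a threshold, distant points are dropped from the comparison instead of being paired with an unrelated observation.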

plot_diff(title=None, target=None, show=False, ax=None, **kwargs)

Plot the difference between the source and target datasets.

Parameters:

Name Type Description Default
title str

Title of the plot. If not provided, the name of the dataset will be used. If the dataset has no name, "Climatrix Dataset" will be used.

None
target str, os.PathLike, Path, or None

Path to save the plot. If not provided, the plot will not be saved.

None
show bool

Whether to show the plot. Default is False.

False
ax Axes

Axes to plot on. If not provided, a new figure and axes will be created.

None
**kwargs dict

Additional keyword arguments to pass to the plotting function.

  • figsize: tuple, optional Size of the figure. Default is (12, 6).
  • vmin: float, optional Minimum value for the color scale. Default is None.
  • vmax: float, optional Maximum value for the color scale. Default is None.
  • cmap: str, optional Colormap to use for the plot. Default is "seismic".
  • size: int, optional Size of the points for sparse datasets. Default is 10.
{}

Returns:

Type Description
Axes

The matplotlib axes containing the plot of the difference.

plot_signed_diff_hist(ax=None, n_bins=50, limits=None, label=None, alpha=1.0)

Plot the histogram of signed difference between datasets.

The signed difference is a dataset where positive values represent areas where the source dataset is larger than the target dataset and negative values represent areas where the source dataset is smaller than the target dataset.

Parameters:

Name Type Description Default
ax Axes

The matplotlib axes on which to plot the histogram. If None, a new set of axes will be created.

None
n_bins int

The number of bins to use in the histogram (default is 50).

50
limits tuple[float]

The limits of values to include in the histogram (default is None).

None

Returns:

Type Description
Axes

The matplotlib axes containing the plot of the signed difference.
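
To make the sign convention and the roles of `n_bins` and `limits` concrete, here is a small pure-Python sketch. The helper name `signed_diff_histogram` is hypothetical, and treating values outside `limits` as excluded is an assumption about the method's behavior.

```python
def signed_diff_histogram(predicted, true, n_bins=50, limits=None):
    """Count signed differences (predicted - true) in equal-width bins."""
    # Positive where the predicted/source value is larger than the true/target value.
    diffs = [p - t for p, t in zip(predicted, true)]
    lo, hi = limits if limits is not None else (min(diffs), max(diffs))
    if hi == lo:
        hi = lo + 1.0  # avoid zero-width bins when all differences are equal
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for d in diffs:
        if lo <= d <= hi:  # values outside the limits are skipped (assumption)
            idx = min(int((d - lo) / width), n_bins - 1)
            counts[idx] += 1
    return diffs, counts


diffs, counts = signed_diff_histogram(
    predicted=[1.0, 2.0, 3.0, 5.0],
    true=[2.0, 2.0, 2.0, 2.0],
    n_bins=4,
    limits=(-2.0, 2.0),
)
```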

compute_rmse()

Compute the RMSE between the source and target datasets.

Returns:

Type Description
float

The RMSE between the source and target datasets.

compute_mae()

Compute the MAE between the source and target datasets.

Returns:

Type Description
float

The mean absolute error between the source and target datasets.

compute_r2()

Compute the R^2 between the source and target datasets.

Returns:

Type Description
float

The R^2 between the source and target datasets.

compute_max_abs_error()

Compute the maximum absolute error between datasets.

Returns:

Type Description
float

The maximum absolute error between the source and target datasets.
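
For reference, the four metrics above follow their standard definitions, sketched here in plain Python on two short lists. This is an illustration of the formulas only: the helper name `compute_metrics` is hypothetical, and NaN handling and point matching, which the class performs, are left out.

```python
import math


def compute_metrics(predicted, true):
    """Standard RMSE, MAE, R^2 and maximum absolute error for paired values."""
    diffs = [p - t for p, t in zip(predicted, true)]
    n = len(diffs)
    ss_res = sum(d * d for d in diffs)            # residual sum of squares
    mean_true = sum(true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(d) for d in diffs) / n,
        # R^2 is undefined when the true values are constant (ss_tot == 0).
        "r2": 1.0 - ss_res / ss_tot,
        "max_abs_error": max(abs(d) for d in diffs),
    }


metrics = compute_metrics(predicted=[2.0, 4.0, 7.0], true=[1.0, 4.0, 5.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```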

compute_report()

save_report(target_dir)

Save a report of the comparison between passed datasets.

This method creates a directory at the specified path and saves a report of the comparison between the source and target datasets in it. The report includes plots of the difference and signed difference between the datasets, as well as a CSV file with metrics such as the RMSE, MAE, and maximum absolute error.

Parameters:

Name Type Description Default
target_dir str | PathLike | Path

The path to the directory where the report should be saved.

required
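
The directory-plus-CSV part of the report can be pictured with the standard library alone. This sketch is not the library's code: the helper name `save_metrics_csv`, the file name `metrics.csv`, the column layout, and the metric values are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def save_metrics_csv(metrics, target_dir):
    """Create target_dir if needed and write one metric per CSV row."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    csv_path = target_dir / "metrics.csv"  # hypothetical file name
    with csv_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["metric", "value"])
        for name, value in metrics.items():
            writer.writerow([name, value])
    return csv_path


# Write into a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = save_metrics_csv({"RMSE": 1.29, "MAE": 1.0}, Path(tmp) / "report")
    rows = path.read_text().splitlines()
```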

🔧 Hyperparameter Optimization

Climatrix provides automated hyperparameter optimization for all reconstruction methods using Bayesian optimization.

Installation

To use hyperparameter optimization, install climatrix with the optimization extras:

pip install climatrix[optim]

This installs the required bayesian-optimization package.

HParamFinder

climatrix.optim.HParamFinder

Bayesian hyperparameter optimization for reconstruction methods.

This class uses Bayesian optimization to find optimal hyperparameters for various reconstruction methods.

Parameters:

Name Type Description Default
method str

Reconstruction method to optimize.

required
train_dset BaseClimatrixDataset

Training dataset used for optimization.

required
val_dset BaseClimatrixDataset

Validation dataset used for optimization.

required
metric str

Evaluation metric to optimize. Default is "mae". Supported metrics: "mae", "mse", "rmse".

'mae'
exclude str or Collection[str]

Parameter(s) to exclude from optimization.

None
include str or Collection[str]

Parameter(s) to include in optimization. If specified, only these parameters will be optimized.

None
n_iters int

Total number of optimization iterations. Default is 100.

100
bounds dict

Custom parameter bounds. Overrides default bounds for the method.

None
random_seed int

Random seed for reproducible optimization. Default is 42.

42
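
How `include`, `exclude`, and `bounds` might interact can be sketched as a filter over a dictionary of default bounds. The helper `select_bounds` is hypothetical, and the order of operations (override first, then include, then exclude) is an assumption; the default bounds are the SIREN ranges listed in this reference.

```python
# SIREN default bounds as listed in the reconstructor notes.
siren_bounds = {
    "lr": (1e-5, 1e-2),
    "batch_size": (64, 1024),
    "num_epochs": (100, 10_000),
    "hidden_dim": (128, 512),
    "num_layers": (3, 8),
    "gradient_clipping_value": (0.1, 10.0),
}


def select_bounds(default_bounds, include=None, exclude=None, bounds=None):
    """Return the bounds of the parameters that would be optimized."""
    # Custom bounds override the method defaults.
    merged = {**default_bounds, **(bounds or {})}
    # Accept a single name as well as a collection of names.
    if isinstance(include, str):
        include = [include]
    if isinstance(exclude, str):
        exclude = [exclude]
    if include is not None:
        merged = {k: v for k, v in merged.items() if k in include}
    if exclude is not None:
        merged = {k: v for k, v in merged.items() if k not in exclude}
    return merged


only_two = select_bounds(siren_bounds, include=["lr", "batch_size"])
without_lr = select_bounds(siren_bounds, exclude="lr")
narrow_lr = select_bounds(siren_bounds, include="lr", bounds={"lr": (1e-4, 1e-3)})
```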

Attributes:

Name Type Description
train_dset BaseClimatrixDataset

Training dataset.

val_dset BaseClimatrixDataset

Validation dataset.

metric MetricType

Evaluation metric.

method str

Reconstruction method.

bounds dict

Parameter bounds for optimization.

n_iter int

Number of optimization iterations.

random_seed int

Random seed for optimization.

verbose int

Verbosity level for logging (0 - silent, 1 - info, 2 - debug).

n_startup_trials int

Number of startup trials for the optimizer.

n_warmup_steps int

Number of warmup steps before starting optimization.

result dict

Dictionary containing optimization results:

  • 'best_params': Best hyperparameters found (with correct types)
  • 'best_score': Best score achieved (negative metric value)
  • 'metric_name': Name of the optimized metric
  • 'method': Reconstruction method used
  • 'n_trials': Total number of trials performed

optimize()

Run Bayesian optimization to find optimal hyperparameters.

Returns:

Type Description
dict[str, Any]

Dictionary containing:

  • 'best_params': Best hyperparameters found (with correct types)
  • 'best_score': Best score achieved (negative metric value)
  • 'history': Optimization history
  • 'metric_name': Name of the optimized metric
  • 'method': Reconstruction method used
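
Because 'best_score' is reported as the negative metric value, it needs to be negated to recover the error itself. The snippet below works on a hand-written dictionary with the documented keys; the values, including the method name "siren", are made up for illustration and do not come from a real optimization run.

```python
# Hypothetical result with the documented keys (values are made up).
result = {
    "best_params": {"lr": 0.001, "batch_size": 256},
    "best_score": -0.42,  # negative metric value, as documented
    "history": [],
    "metric_name": "mae",
    "method": "siren",  # assumed method name, for illustration only
}

# Flip the sign to get the actual error (lower is better).
best_metric = -result["best_score"]
summary = f"{result['method']}: best {result['metric_name']} = {best_metric:.2f}"
print(summary)
```

The entries in 'best_params' are described as already having the correct types, so they should be usable as reconstructor arguments without further casting.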

Supported Methods and Parameters

The hyperparameter optimizer supports all reconstruction methods available in Climatrix. For detailed information about each method's hyperparameters, including their types, bounds, and default values, see the Reference documentation for each reconstruction class: