API Reference
Welcome to the climatrix API reference. Below you'll find details on key modules, classes, and methods, with examples and usage tips to help you integrate it smoothly into your climate data workflows.
Abstract
The main module climatrix provides tools that extend xarray datasets with climate-oriented subsetting, sampling, and reconstruction. It is accessible via the cm accessor.
The library contains a few public classes:
Class name | Description |
---|---|
AxisType | Enumerator class for the types of spatio-temporal axes |
Axis | Class managing spatio-temporal axes |
BaseClimatrixDataset | Base class for managing xarray data |
Domain | Base class for domain-specific operations |
SparseDomain | Subclass of Domain aimed at managing sparse representations |
DenseDomain | Subclass of Domain aimed at managing dense representations |
Axes
climatrix.dataset.axis.AxisType
Bases: StrEnum
Enum for axis types.
Attributes:
Name | Type | Description |
---|---|---|
LATITUDE | str | Latitude axis type. |
LONGITUDE | str | Longitude axis type. |
TIME | str | Time axis type. |
VERTICAL | str | Vertical axis type. |
POINT | str | Point axis type. |
get(value)
classmethod
Get the AxisType given by value.
If value is an instance of AxisType, return it as is.
If value is a string, return the corresponding AxisType.
If value is neither an instance of AxisType nor a string, raise a ValueError.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
value | str or AxisType | The axis type | required |
Returns:
Type | Description |
---|---|
AxisType | The axis type. |
Raises:
Type | Description |
---|---|
ValueError | If value is neither an AxisType nor a string. |
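The lookup rules described for get can be sketched with a stand-in enum. This is a minimal illustration rather than climatrix's implementation: the class name `AxisTypeSketch`, its member values, and the case-insensitive string matching are all assumptions made for the example.

```python
from enum import Enum


class AxisTypeSketch(str, Enum):
    """Stand-in for AxisType; member values are assumed, not taken from climatrix."""

    LATITUDE = "latitude"
    LONGITUDE = "longitude"
    TIME = "time"
    VERTICAL = "vertical"
    POINT = "point"

    @classmethod
    def get(cls, value):
        # An existing member is returned unchanged.
        if isinstance(value, cls):
            return value
        # A string is resolved to the matching member
        # (case-insensitive here, which is an assumption of this sketch).
        if isinstance(value, str):
            try:
                return cls(value.lower())
            except ValueError:
                raise ValueError(f"Unknown axis type: {value!r}") from None
        # Anything else is rejected.
        raise ValueError(
            f"Expected str or AxisTypeSketch, got {type(value).__name__}"
        )
```

Checking for an existing member first matters in this sketch because the members are themselves strings.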
climatrix.dataset.axis.Axis
Base class for axis types.
Attributes:
Name | Type | Description |
---|---|---|
type | ClassVar[AxisType] | The type of the axis. |
dtype | ClassVar[dtype] | The data type of the axis values. |
is_dimension | bool | Whether the axis is a dimension or not. |
name | str | The name of the axis. |
values | ndarray | The values of the axis. |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | The name of the axis. | required |
values | ndarray | The values of the axis. | required |
is_dimension | bool | Whether the axis is a dimension or not (default is True). | True |
Examples:
Axis is a factory class for all axis types. To create an axis (by matching the name), use:
>>> axis = Axis(name="latitude", values=np.array([1, 2, 3]))
To create a Latitude axis explicitly, use:
>>> axis = Latitude(name="latitude", values=np.array([1, 2, 3]))
>>> axis = Latitude(
... name="latitude",
... values=np.array([1, 2, 3]),
... is_dimension=True)
Notes
- The Axis class is a factory class for all axis types.
- If the given axis has an "unusual" name, you need to create it explicitly using the corresponding class (e.g. Latitude).
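The factory behaviour described in the notes can be pictured roughly as follows. This is a sketch under stated assumptions, not the library's code: the `*Sketch` class names and the regular expressions used for name matching are hypothetical, and only the documented `matches` classmethod and `size` property are mirrored.

```python
import re


class AxisSketch:
    """Stand-in for Axis: calling the base class dispatches on the name."""

    _pattern = None  # each subclass supplies a name pattern (assumed)

    def __new__(cls, name, values, is_dimension=True):
        if cls is AxisSketch:
            # Factory path: pick the first subclass whose pattern matches.
            for sub in AxisSketch.__subclasses__():
                if sub.matches(name):
                    return super().__new__(sub)
            raise ValueError(f"No axis type matches name {name!r}")
        # Explicit path: the caller chose the subclass, so no matching is done.
        return super().__new__(cls)

    def __init__(self, name, values, is_dimension=True):
        self.name = name
        self.values = list(values)
        self.is_dimension = is_dimension

    @classmethod
    def matches(cls, name):
        return cls._pattern is not None and bool(cls._pattern.fullmatch(name))

    @property
    def size(self):
        return len(self.values)


class LatitudeSketch(AxisSketch):
    _pattern = re.compile(r"lat(itude)?", re.IGNORECASE)  # hypothetical pattern


class LongitudeSketch(AxisSketch):
    _pattern = re.compile(r"lon(gitude)?", re.IGNORECASE)  # hypothetical pattern


axis = AxisSketch(name="latitude", values=[1, 2, 3])    # resolved by name
odd = LatitudeSketch(name="y_coord", values=[1, 2, 3])  # unusual name, explicit class
```

In this sketch an unusual name such as "y_coord" is rejected by the factory but accepted when the subclass is named explicitly, which is the situation the second note describes.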
size
property
Get the size of the axis.
Returns:
Type | Description |
---|---|
int
|
The size of the axis. |
matches(name)
classmethod
Check if the axis matches the given name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name
|
str
|
The name to check. |
required |
Returns:
Type | Description |
---|---|
bool
|
True if the axis matches the name, False otherwise. |
climatrix.dataset.axis.Latitude
Bases: Axis
Latitude axis.
Attributes:
Name | Type | Description |
---|---|---|
name |
str
|
The name of the latitude axis. |
is_dimension |
bool
|
Whether the axis is a dimension or not. |
climatrix.dataset.axis.Longitude
Bases: Axis
Longitude axis.
Attributes:
Name | Type | Description |
---|---|---|
name |
str
|
The name of the longitude axis. |
is_dimension |
bool
|
Whether the axis is a dimension or not. |
climatrix.dataset.axis.Time
Bases: Axis
Time axis.
Attributes:
Name | Type | Description |
---|---|---|
name |
str
|
The name of the time axis. |
is_dimension |
bool
|
Whether the axis is a dimension or not. |
__eq__(other)
Check if two axes are equal.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
other
|
object
|
The other object to compare with. |
required |
Returns:
Type | Description |
---|---|
bool
|
True if the axes are equal, False otherwise. |
climatrix.dataset.axis.Point
Bases: Axis
Point axis.
Attributes:
Name | Type | Description |
---|---|---|
name |
str
|
The name of the point axis. |
is_dimension |
bool
|
Whether the axis is a dimension or not. |
climatrix.dataset.axis.Vertical
Bases: Axis
Vertical axis.
Attributes:
Name | Type | Description |
---|---|---|
name |
str
|
The name of the vertical axis. |
is_dimension |
bool
|
Whether the axis is a dimension or not. |
Data
climatrix.dataset.base.BaseClimatrixDataset
Base class for Climatrix workflows.
This class provides a set of methods for manipulating xarray datasets. It is designed to be used as an xarray accessor, allowing you to call its methods directly on xarray datasets.
The class supports basic arithmetic operations, including: addition, subtraction, multiplication, and division.
Attributes:
Name | Type | Description |
---|---|---|
da |
DataArray
|
The underlying |
domain |
Domain
|
The domain object representing the spatial
and temporal dimensions of the dataset.
See |
domain = Domain(xarray_obj)
instance-attribute
subset(north=None, south=None, west=None, east=None)
Subset data with the specified bounding box.
If an argument is omitted, no bound is applied in that direction. For example, if north is omitted, the maximum latitude of the dataset is used.
If both north and south are provided, the dataset is subsetted to the area between these two latitudes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
north | float | North latitude of the bounding box. | None |
south | float | South latitude of the bounding box. | None |
west | float | West longitude of the bounding box. | None |
east | float | East longitude of the bounding box. | None |
Returns:
Type | Description |
---|---|
Self
|
The subsetted dataset. |
Raises:
Type | Description |
---|---|
LongitudeConventionMismatch
|
|
Examples:
>>> import xarray as xr
>>> import climatrix as cm
>>> globe_dset = xr.open_dataset("path/to/dataset.nc")
>>> globe_dset
<xarray.Dataset>
Dimensions: (time: 1, latitude: 180, longitude: 360)
Coordinates:
* time (time) datetime64[ns] 2020-01-01
* latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
* longitude (longitude) float64 0.0 1.0 2.0 ... 357.0 358.0 359.0
Data variables:
temperature (time, latitude, longitude) float64 ...
>>> dset2 = globe_dset.cm.subset(
... north=10.0,
... south=5.0,
... west=20.0,
... east=25.0,
... )
>>> dset2 = globe_dset.cm.subset(
... north=10.0,
... south=5.0,
... west=-50.0,
... east=25.0,
... )
LongitudeConventionMismatch: The dataset is in positive-only convention
(longitude goes from 0 to 360) while you are
requesting negative values (longitude goes from -180 to 180).
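The bounding-box rules and the convention check shown in the example can be sketched in plain Python. This is an illustration under assumptions, not the library's implementation: the function `subset_bounds`, the stand-in exception class, and the rule used to detect a positive-only dataset are hypothetical.

```python
class LongitudeConventionMismatchSketch(ValueError):
    """Stand-in for the LongitudeConventionMismatch error named above."""


def subset_bounds(lats, lons, north=None, south=None, west=None, east=None):
    """Return the latitudes and longitudes that fall inside the bounding box."""
    # Assumed rule: longitudes above 180 imply the positive-only (0..360) convention.
    positive_only = min(lons) >= 0 and max(lons) > 180
    if positive_only and any(b is not None and b < 0 for b in (west, east)):
        raise LongitudeConventionMismatchSketch(
            "dataset uses 0..360 longitudes but a negative bound was requested"
        )
    # A missing bound falls back to the dataset's own extreme in that direction.
    north = max(lats) if north is None else north
    south = min(lats) if south is None else south
    west = min(lons) if west is None else west
    east = max(lons) if east is None else east
    kept_lats = [v for v in lats if south <= v <= north]
    kept_lons = [v for v in lons if west <= v <= east]
    return kept_lats, kept_lons


lats = [float(v) for v in range(-90, 90)]
lons = [float(v) for v in range(0, 360)]
box_lats, box_lons = subset_bounds(
    lats, lons, north=10.0, south=5.0, west=20.0, east=25.0
)
```

Calling `subset_bounds(lats, lons, west=-50.0, east=25.0)` on this positive-only grid raises the stand-in exception, mirroring the second call in the example above.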
to_signed_longitude()
Convert the dataset to signed longitude convention.
The longitude values are converted to be in the range (-180 to 180 degrees).
Examples:
>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc").cm
>>> dset.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180, longitude: 360)>
...
Dimensions: (time: 1, latitude: 180, longitude: 360)
Coordinates:
* time (time) datetime64[ns] 2020-01-01
* latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
* longitude (longitude) float64 0.0 1.0 2.0 ... 357.0 358.0 359.0
Data variables:
temperature (time, latitude, longitude) float64 ...
>>> dset2 = dset.to_signed_longitude()
>>> dset2.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180, longitude: 360)>
...
Dimensions: (time: 1, latitude: 180, longitude: 360)
Coordinates:
* time (time) datetime64[ns] 2020-01-01
* latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
* longitude (longitude) float64 -180.0 -179.0 -178.0 ... 177.0 178.0 179.0
References
[1] Mancini, M., Walczak, J., Stojiljkovic, M., geokube: A Python library for geospatial data processing, 2024, https://doi.org/10.5281/zenodo.10597965, https://github.com/CMCC-Foundation/geokube
to_positive_longitude()
Convert the dataset to positive longitude convention.
The longitude values are converted to be in the range (0 to 360 degrees).
Examples:
>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc").cm
>>> dset.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180,
longitude: 360)>
...
Dimensions: (time: 1, latitude: 180, longitude: 360)
Coordinates:
* time (time) datetime64[ns] 2020-01-01
* latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
* longitude (longitude) float64 -180.0 ... 178.0 179.0
Data variables:
temperature (time, latitude, longitude) float64 ...
>>> dset2 = dset.to_positive_longitude()
>>> dset2.da
<xarray.DataArray 'temperature' (time: 1, latitude: 180,
longitude: 360)>
...
Dimensions: (time: 1, latitude: 180, longitude: 360)
Coordinates:
* time (time) datetime64[ns] 2020-01-01
* latitude (latitude) float64 -90.0 -89.0 -88.0 ... 88.0 89.0
* longitude (longitude) float64 0.0 1.0 ... 357.0 358.0 359.0
Data variables:
temperature (time, latitude, longitude) float64 ...
References
[1] Mancini, M., Walczak, J., Stojiljkovic, M., geokube: A Python library for geospatial data processing, 2024, https://doi.org/10.5281/zenodo.10597965, https://github.com/CMCC-Foundation/geokube
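The arithmetic behind the two longitude conventions can be written down in a few lines. This is a sketch of the usual modular formulas, not necessarily what climatrix or geokube does internally; the function names are hypothetical, and the half-open ranges are chosen to agree with the example outputs above (-180.0 to 179.0 and 0.0 to 359.0).

```python
def to_signed(lon):
    """Map a longitude in degrees to the half-open range [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


def to_positive(lon):
    """Map a longitude in degrees to the half-open range [0, 360)."""
    return lon % 360.0


positive = [float(v) for v in range(0, 360)]
# After conversion the values are no longer ascending, so they are re-sorted.
signed = sorted(to_signed(v) for v in positive)
round_trip = sorted(to_positive(v) for v in signed)
```

The re-sorting step matters in practice: any data values along the longitude axis have to be reordered together with the coordinates.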
squeeze()
Squeeze the dataset to remove dimensions of size 1.
Returns:
Type | Description |
---|---|
Self
|
The squeezed dataset. |
profile_along_axes(*axes)
Generate profiles along the specified axes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*axes
|
AxisType | str
|
The axes along which to generate profiles. |
()
|
Yields:
Type | Description |
---|---|
BaseClimatrixDataset
|
A dataset containing the profile along the specified axes. |
mask_nan(source)
Apply NaN values from another dataset to the current one.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source
|
BaseClimatrixDataset
|
Dataset whose NaN values will be applied to the current one. |
required |
Returns:
Type | Description |
---|---|
BaseClimatrixDataset
|
A new dataset with NaN values applied. |
Raises:
Type | Description |
---|---|
TypeError
|
If the |
ValueError
|
If the domain of the |
DomainMismatchError
|
If the domains of the |
Examples:
>>> import xarray as xr
>>> import climatrix as cm
>>> dset1 = xr.open_dataset("path/to/dataset1.nc").cm
>>> dset2 = xr.open_dataset("path/to/dataset2.nc").cm
>>> dset1.mask_nan(dset2)
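What "applying NaN values from another dataset" means can be shown on flat lists. This is only a sketch: the helper name `mask_nan_sketch` is hypothetical, and the length check stands in for the domain comparison that the documented method performs.

```python
import math


def mask_nan_sketch(values, source):
    """Copy NaN positions from `source` onto `values` (flat lists of equal length)."""
    if len(values) != len(source):
        # Stand-in for the documented domain check.
        raise ValueError("values and source must have the same length")
    return [math.nan if math.isnan(s) else v for v, s in zip(values, source)]


masked = mask_nan_sketch([1.0, 2.0, 3.0], [0.0, math.nan, 5.0])
```

Only the positions of the NaN values in `source` are used; its other values never reach the result.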
time(time)
Select data at a specific time or times.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
time
|
datetime, np.datetime64, slice, list, or np.ndarray
|
Time or times to be selected. |
required |
Returns:
Type | Description |
---|---|
Self
|
The dataset with the selected time or times. |
Examples:
>>> from datetime import datetime
>>> import numpy as np
>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")
Selecting by datetime object:
>>> dset.cm.time(datetime(2020, 1, 1))
Selecting by np.datetime64 object:
>>> dset.cm.time(np.datetime64("2020-01-01"))
Selecting by str (wrapped in a slice):
>>> dset.cm.time(slice("2020-01-01"))
Selecting by list of any of the above:
>>> dset.cm.time([datetime(2020, 1, 1), np.datetime64("2020-01-02")])
Selecting by slice object:
>>> dset.cm.time(slice(datetime(2020, 1, 1), datetime(2020, 1, 2)))
itime(time)
Select time value by index.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
time
|
int, list[int], np.ndarray, or slice
|
Time index or indices to be selected. |
required |
Returns:
Type | Description |
---|---|
Self
|
The dataset with the selected time or times. |
Examples:
>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")
Selecting by int object:
>>> dset.cm.itime(0)
Selecting by a list of ints:
>>> dset.cm.itime([0, 1])
Selecting by slice object:
>>> dset.cm.itime(slice(0, 2))
sample_uniform(portion=None, number=None, nan='ignore')
Sample the dataset using a uniform distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
portion
|
float
|
Portion of the dataset to be sampled. |
None
|
number
|
int
|
Number of points to be sampled. |
None
|
nan
|
SamplingNaNPolicy | str
|
Policy for handling NaN values. |
'ignore'
|
Notes
Exactly one of portion or number must be provided; they cannot be passed together.
Warns:
Type | Description |
---|---|
TooLargeSamplePortionWarning
|
If the portion exceeds 1.0 or number of points exceeds the number of spatial points in the Domain |
Raises:
Type | Description |
---|---|
ValueError
|
If the dataset contains NaN values and |
Examples:
>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")
>>> sparse_dset = dset.cm.sample_uniform(portion=0.1)
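The relationship between portion and number can be illustrated with a small index sampler. This is a sketch under assumptions rather than the library's algorithm: `sample_uniform_indices` and its `seed` parameter are hypothetical, NaN handling is left out, and an oversized request is simply clamped here (the documented method issues TooLargeSamplePortionWarning in that situation).

```python
import random


def sample_uniform_indices(n_points, portion=None, number=None, seed=None):
    """Pick distinct point indices uniformly at random."""
    # Exactly one of the two sizing arguments must be given.
    if (portion is None) == (number is None):
        raise ValueError("provide exactly one of portion or number")
    if number is None:
        number = round(portion * n_points)
    # Clamp oversized requests to the number of available points.
    number = min(number, n_points)
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_points), number))


idx = sample_uniform_indices(100, portion=0.1, seed=0)
```

With 100 spatial points, `portion=0.1` and `number=10` request the same sample size.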
sample_normal(portion=None, number=None, center_point=None, sigma=10.0, nan='ignore')
Sample the dataset using a normal distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
portion
|
float
|
Portion of the dataset to be sampled. |
None
|
number
|
int
|
Number of points to be sampled. |
None
|
center_point
|
tuple[Longitude, Latitude]
|
Center point for the normal distribution. |
None
|
sigma
|
float
|
Standard deviation for the normal distribution. |
10.0
|
nan
|
SamplingNaNPolicy | str
|
Policy for handling NaN values. |
'ignore'
|
Notes
Exactly one of portion or number must be provided; they cannot be passed together.
Warns:
Type | Description |
---|---|
TooLargeSamplePortionWarning
|
If the portion exceeds 1.0 or number of points exceeds the number of spatial points in the Domain |
Raises:
Type | Description |
---|---|
ValueError
|
If the dataset contains NaN values and |
Examples:
>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc")
>>> sparse_dset = dset.cm.sample_normal(
... number=1_000,
... center_point=(10.0, 20.0),
... sigma=5.0,
... )
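One way to picture sampling "using a normal distribution" around center_point is to weight every point by a Gaussian of its distance to the center and draw without replacement. The sketch below does this with an exponential race (each point gets an arrival time E/w and the earliest arrivals are kept). It is an assumption-laden illustration, not climatrix's algorithm: the function name, the `seed` parameter, and the use of plain Euclidean distance in degrees are all hypothetical.

```python
import math
import random


def sample_normal_indices(points, number, center_point, sigma=10.0, seed=None):
    """Pick `number` distinct indices, favouring points near `center_point`.

    `points` is a list of (lon, lat) pairs.
    """
    rng = random.Random(seed)
    c_lon, c_lat = center_point
    keys = []
    for idx, (lon, lat) in enumerate(points):
        d2 = (lon - c_lon) ** 2 + (lat - c_lat) ** 2
        # Weight w = exp(-d2 / (2 sigma^2)); arrival time is E / w with
        # E ~ Exp(1). Working with log(E / w) avoids underflow for far points.
        e = max(rng.expovariate(1.0), 1e-300)
        keys.append((math.log(e) + d2 / (2.0 * sigma**2), idx))
    # The earliest arrivals form a weighted sample without replacement.
    return [idx for _, idx in sorted(keys)[:number]]


points = [(float(lon), float(lat)) for lon in range(40) for lat in range(40)]
chosen = sample_normal_indices(
    points, number=50, center_point=(10.0, 20.0), sigma=5.0, seed=1
)
```

A smaller sigma concentrates the sample more tightly around the center point.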
reconstruct(target, *, method, **recon_kwargs)
Reconstruct the dataset to a target domain.
If the target domain is sparse, the reconstruction is sparse too; if the target domain is dense, the reconstruction is dense. The reconstruction uses the method given in the method argument.
The method can be one of the following: Inverse Distance Weighting (idw) or Ordinary Kriging (ok).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | Domain | The target domain to reconstruct the dataset to. | required |
method | ReconstructionType or str | The method to use for reconstruction. Can be one of the following: 'idw', 'ok'. | required |
recon_kwargs | dict | Additional keyword arguments to pass to the reconstruction method. | {} |
See Also
Returns:
Type | Description |
---|---|
Self
|
The reconstructed dataset. |
plot(title=None, target=None, show=True, **kwargs)
Plot the dataset on a map.
The dataset is plotted using Cartopy and Matplotlib.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
title
|
str
|
Title of the plot. If not provided, the name of the dataset will be used. If the dataset has no name, "Climatrix Dataset" will be used. |
None
|
target
|
str, os.PathLike, Path, or None
|
Path to save the plot. If not provided, the plot will not be saved. |
None
|
show
|
bool
|
Whether to show the plot. Default is True. |
True
|
**kwargs
|
dict
|
Additional keyword arguments to pass to the plotting function.
|
{}
|
Returns:
Type | Description |
---|---|
Axes
|
The axes object containing the plot. |
Raises:
Type | Description |
---|---|
NotImplementedError
|
If the dataset is dynamic (contains time dimension with more than one value). |
transpose(*axes)
Transpose the dataset along the specified dimensions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*axes
|
AxisType or str
|
The axes along which to transpose the dataset. |
()
|
Returns:
Type | Description |
---|---|
Self
|
The transposed dataset. |
Examples:
>>> import xarray as xr
>>> import climatrix as cm
>>> dset = xr.open_dataset("path/to/dataset.nc").cm
>>> dset2 = dset.transpose("longitude", "latitude")
Domain
climatrix.dataset.domain.Domain
Base class for domain objects.
Attributes:
Name | Type | Description |
---|---|---|
is_sparse |
ClassVar[bool]
|
Indicates if the domain is sparse or dense. |
_axes |
dict[AxisType, Axis]
|
Mapping of |
dims
property
Get the dimensions of the dataset.
Returns:
Type | Description |
---|---|
tuple[AxisType, ...]
|
A tuple of |
Notes
The dimensions are determined by the axes that are marked as dimensions in the domain. For example, if the underlying dataset has shape (5, 10, 20), there are 3 dimensional axes.
latitude
property
Latitude axis
longitude
property
Longitude axis
time
property
Time axis
point
property
Point axis
vertical
property
Vertical axis
is_dynamic
property
If the domain is dynamic.
is_sparse
class-attribute
size
property
Domain size.
all_axes_types
property
All axis types in the domain.
from_lat_lon(lat=slice(-90, 90, _DEFAULT_LAT_RESOLUTION), lon=slice(-180, 180, _DEFAULT_LON_RESOLUTION), kind='dense')
classmethod
Create a domain from latitude and longitude coordinates.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
lat
|
slice or ndarray
|
Latitude coordinates. If a slice is provided, it will be converted to a numpy array using the specified step. |
slice(-90, 90, _DEFAULT_LAT_RESOLUTION)
|
lon
|
slice or ndarray
|
Longitude coordinates. If a slice is provided, it will be converted to a numpy array using the specified step. |
slice(-180, 180, _DEFAULT_LON_RESOLUTION)
|
kind
|
str
|
Type of domain to create. Can be either "dense" or "sparse". Default is "dense". |
'dense'
|
Returns:
Type | Description |
---|---|
Domain
|
An instance of the Domain class with the specified latitude and longitude coordinates. |
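How a slice such as slice(-90, 90, step) might be expanded into coordinate values can be sketched as follows. The half-open behaviour (the stop value is excluded, as with numpy.arange) is an assumption of this example; the reference above does not state whether the endpoint is included. The helper name `slice_to_values` is hypothetical.

```python
import math


def slice_to_values(s):
    """Expand slice(start, stop, step) into a list of numbers, excluding `stop`."""
    if s.start is None or s.stop is None or s.step is None:
        raise ValueError("start, stop and step are all required")
    count = max(0, math.ceil((s.stop - s.start) / s.step))
    # Multiplying by the index avoids accumulating rounding error.
    return [s.start + i * s.step for i in range(count)]


coarse_lat = slice_to_values(slice(-90, 90, 45))
fine_lon = slice_to_values(slice(-180, 180, 0.5))
```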
get_size(axis)
Get the size of the specified axis.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
axis
|
AxisType
|
The axis for which to get the size. |
required |
Returns:
Type | Description |
---|---|
int
|
The size of the specified axis. |
has_axis(axis)
Check if the specified axis exists in the domain.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
axis
|
AxisType
|
The axis type to check. |
required |
Returns:
Type | Description |
---|---|
bool
|
True if the axis exists, False otherwise. |
get_axis(axis)
get_all_spatial_points()
abstractmethod
to_xarray(values, name=None)
abstractmethod
climatrix.dataset.domain.SparseDomain
Bases: Domain
Sparse domain class.
Supports operations on sparse spatial domain.
to_xarray(values, name=None)
Convert domain to sparse xarray.DataArray.
The method applies values and (optionally) name to create a new xarray.DataArray object based on the domain.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
values
|
ndarray
|
The values to be assigned to the DataArray variable. |
required |
name
|
str
|
The name of the DataArray variable. |
None
|
Returns:
Type | Description |
---|---|
DataArray
|
The xarray.DataArray single variable object. |
Raises:
Type | Description |
---|---|
ValueError
|
If the shape of |
Examples:
>>> domain = Domain.from_lat_lon()
>>> values = np.random.rand(5, 5)
>>> da = domain.to_xarray(values, name="example")
>>> isinstance(da, xr.DataArray)
True
>>> da.name
'example'
get_all_spatial_points()
Get all spatial points in the domain.
Returns:
Type | Description |
---|---|
ndarray
|
An array of shape (n_points, 2) containing the latitude and longitude coordinates of all points in the domain. |
Examples:
>>> points = domain.get_all_spatial_points()
>>> points
array([[ 0. , -0.1],
[ 0. , 0. ],
[ 0. , 0.1],
...
climatrix.dataset.domain.DenseDomain
Bases: Domain
Dense domain class.
Supports operations on dense spatial domain.
to_xarray(values, name=None)
Convert domain to dense xarray.DataArray.
The method applies values and (optionally) name to create a new xarray.DataArray object based on the domain.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
values
|
ndarray
|
The values to be assigned to the DataArray variable. |
required |
name
|
str
|
The name of the DataArray variable. |
None
|
Returns:
Type | Description |
---|---|
DataArray
|
The xarray.DataArray single variable object. |
Raises:
Type | Description |
---|---|
ValueError
|
If the shape of |
Examples:
>>> domain = Domain.from_lat_lon()
>>> values = np.random.rand(5, 5)
>>> da = domain.to_xarray(values, name="example")
>>> isinstance(da, xr.DataArray)
True
>>> da.name
'example'
get_all_spatial_points()
Get all spatial points in the domain.
Returns:
Type | Description |
---|---|
ndarray
|
An array of shape (n_points, 2) containing the latitude and longitude coordinates of all points in the domain. |
Examples:
>>> points = domain.get_all_spatial_points()
>>> points
array([[ 0. , -0.1],
[ 0. , 0. ],
[ 0. , 0.1],
...
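The (n_points, 2) layout of latitude and longitude pairs can be illustrated for both kinds of domain. This is a sketch under assumptions: the row ordering (latitude varying slowest for a dense grid) is inferred from the example output above, and the pairing rule for a sparse domain is assumed rather than taken from the library.

```python
from itertools import product

lats = [0.0, 0.1]
lons = [-0.1, 0.0, 0.1]

# Dense (gridded) domain: every latitude is paired with every longitude.
dense_points = [[lat, lon] for lat, lon in product(lats, lons)]

# Sparse (scattered) domain: the i-th latitude belongs to the i-th longitude.
station_lats = [0.0, 0.1, 0.2]
station_lons = [-0.1, 0.0, 0.1]
sparse_points = [[lat, lon] for lat, lon in zip(station_lats, station_lons)]
```

A dense domain with 2 latitudes and 3 longitudes therefore yields 6 points, while a sparse domain with 3 stations yields 3.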
Reconstructors
climatrix.reconstruct.base.BaseReconstructor
Bases: ABC
Base class for all dataset reconstruction methods.
Attributes:
Name | Type | Description |
---|---|---|
dataset |
BaseClimatrixDataset
|
The dataset to be reconstructed. |
target_domain |
Domain
|
The target domain for the reconstruction. |
reconstruct()
abstractmethod
Reconstruct the dataset using the specified method.
This is an abstract method that must be implemented by subclasses.
The data are reconstructed for the target domain, passed in the initializer.
Returns:
Type | Description |
---|---|
BaseClimatrixDataset
|
The reconstructed dataset. |
climatrix.reconstruct.idw.IDWReconstructor
Bases: BaseReconstructor
Inverse Distance Weighting Reconstructor
Attributes:
Name | Type | Description |
---|---|---|
k | int | The number of nearest neighbors to consider. |
k_min | int | The minimum number of nearest neighbors required; where fewer than k_min neighbors are available, NaN values are assigned. |
power | int | The power to raise the distance to. |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | BaseClimatrixDataset | The input dataset to reconstruct. | required |
target_domain | Domain | The target domain for reconstruction. | required |
power | int | The power to raise the distance to (default is 2). | 2 |
k | int | The number of nearest neighbors to consider (default is 5). | 5 |
k_min | int | The minimum number of nearest neighbors required; where fewer than k_min neighbors are available, NaN values are assigned (default is 2). | 2 |
Raises:
Type | Description |
---|---|
NotImplementedError
|
If the input dataset is dynamic, as IDW reconstruction is not yet supported for dynamic datasets. |
ValueError
|
If k_min is greater than k or if k is less than 1. |
reconstruct()
Perform Inverse Distance Weighting (IDW) reconstruction.
This method reconstructs the sparse dataset using IDW, taking into account the specified number of nearest neighbors and the power to which distances are raised. The reconstructed data is returned as a dense dataset, either static or dynamic based on the input dataset.
Returns:
Type | Description |
---|---|
BaseClimatrixDataset
|
The reconstructed dataset on the target domain. |
Notes
- If fewer than self.k_min neighbors are available, NaN values are assigned to the corresponding points in the output.
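The roles of k, k_min, and power can be seen in a single-point IDW estimate. This is a textbook sketch rather than the reconstructor's code: the function `idw_predict`, the use of plain Euclidean distance in degrees, and the rule that only sources with finite values count as available neighbors are assumptions made for the illustration.

```python
import math


def idw_predict(sources, target, k=5, k_min=2, power=2):
    """Estimate the value at `target` from (lat, lon, value) source triples."""
    if k < 1 or k_min > k:
        raise ValueError("require k >= 1 and k_min <= k")
    # Keep only sources that carry a finite value, nearest first.
    usable = [
        (math.dist((lat, lon), target), value)
        for lat, lon, value in sources
        if not math.isnan(value)
    ]
    neighbours = sorted(usable)[:k]
    # Too few neighbours: the output point is left as NaN.
    if len(neighbours) < k_min:
        return math.nan
    # A source sitting exactly on the target wins outright (avoids 1 / 0).
    for dist, value in neighbours:
        if dist == 0.0:
            return value
    weights = [1.0 / dist**power for dist, _ in neighbours]
    total = sum(w * v for w, (_, v) in zip(weights, neighbours))
    return total / sum(weights)


sources = [(0.0, 0.0, 1.0), (0.0, 2.0, 3.0)]
midpoint = idw_predict(sources, target=(0.0, 1.0))
```

A larger power makes the nearest sources dominate the estimate, while a larger k smooths it over more neighbors.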
climatrix.reconstruct.kriging.OrdinaryKrigingReconstructor
Bases: BaseReconstructor
Reconstruct a sparse dataset using Ordinary Kriging.
Attributes:
Name | Type | Description |
---|---|---|
dataset |
SparseDataset
|
The sparse dataset to reconstruct. |
domain |
Domain
|
The target domain for reconstruction. |
pykrige_kwargs |
dict
|
Additional keyword arguments to pass to pykrige. |
backend |
Literal['vectorized', 'loop'] | None
|
The backend to use for kriging. |
_MAX_VECTORIZED_SIZE |
ClassVar[int]
|
The maximum size for vectorized kriging.
If the dataset is larger than this size, loop kriging
will be used (if |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset
|
SparseDataset
|
The sparse dataset to reconstruct. |
required |
target_domain
|
Domain
|
The target domain for reconstruction. |
required |
backend
|
Literal['vectorized', 'loop'] | None
|
The backend to use for kriging (default is None). |
None
|
pykrige_kwargs
|
dict
|
Additional keyword arguments to pass to pykrige. |
{}
|
reconstruct()
Perform Ordinary Kriging reconstruction of the dataset.
Returns:
Type | Description |
---|---|
BaseClimatrixDataset
|
The dataset reconstructed on the target domain. |
Notes
- The backend is chosen based on the size of the dataset. If the dataset is larger than the maximum size, the loop backend is used.
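The size-based backend choice in the note can be sketched as a small helper. The threshold value and the rule that an explicitly requested backend is respected are assumptions of this example; the actual value of _MAX_VECTORIZED_SIZE is not given in this reference.

```python
MAX_VECTORIZED_SIZE = 1_000_000  # hypothetical threshold, not climatrix's value


def choose_backend(n_points, backend=None):
    """Return 'vectorized' or 'loop' for a dataset with `n_points` points."""
    # Assumption: an explicitly requested backend is used as given.
    if backend is not None:
        return backend
    # Otherwise fall back to the loop backend for large datasets.
    return "vectorized" if n_points <= MAX_VECTORIZED_SIZE else "loop"
```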
climatrix.reconstruct.siren.siren.SIRENReconstructor
Bases: BaseReconstructor
A reconstructor that uses SIREN to reconstruct fields.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset
|
BaseClimatrixDataset
|
Source dataset to reconstruct from. |
required |
target_domain
|
Domain
|
Target domain to reconstruct onto. |
required |
on_surface_points
|
int
|
Number of points to sample on the surface for training. |
1024
|
hidden_features
|
int
|
Number of features in each hidden layer. |
256
|
hidden_layers
|
int
|
Number of hidden layers in the SIREN model. |
4
|
omega_0
|
float
|
Frequency multiplier for the first layer. |
30.0
|
omega_hidden
|
float
|
Frequency multiplier for hidden layers. |
30.0
|
lr
|
float
|
Learning rate for the optimizer. |
1e-4
|
num_epochs
|
int
|
Number of epochs to train for. |
100
|
num_workers
|
int
|
Number of worker processes for the dataloader. |
0
|
device
|
str
|
Device to run the model on ("cuda" or "cpu"). |
"cuda"
|
gradient_clipping_value
|
float or None
|
Value for gradient clipping (None to disable). |
None
|
checkpoint
|
str or PathLike or Path or None
|
Path to save/load model checkpoint from. |
None
|
sdf_loss_weight
|
float
|
Weight for the SDF constraint loss. |
3000.0
|
inter_loss_weight
|
float
|
Weight for the interpolation consistency loss. |
100.0
|
normal_loss_weight
|
float
|
Weight for the surface normal loss. |
100.0
|
grad_loss_weight
|
float
|
Weight for the gradient regularization loss. |
50.0
|
Raises:
Type | Description |
---|---|
NotImplementedError
|
If trying to use SIREN with a dynamic dataset. |
reconstruct()
Train (if necessary) and use a SIREN model to reconstruct the field.
This method is the main entry point for using the SIREN reconstructor. It will train a new model if no checkpoint was loaded, and then use the model to reconstruct the field on the target domain.
Returns:
Type | Description |
---|---|
BaseClimatrixDataset
|
A dataset containing the reconstructed field. |
Raises:
Type | Description |
---|---|
ImportError
|
If required dependencies are not installed. |
Evaluation
climatrix.comparison.Comparison
Class for comparing two dense datasets.
Attributes:
Name | Type | Description |
---|---|---|
sd |
DenseDataset
|
The source dataset. |
td |
DenseDataset
|
The target dataset. |
diff |
DataArray
|
The difference between the source and target datasets. |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
predicted_dataset
|
DenseDataset
|
The source dataset. |
required |
true_dataset
|
DenseDataset
|
The target dataset. |
required |
map_nan_from_source
|
bool
|
If True, the NaN values from the source dataset will be
mapped to the target dataset. If False, the NaN values
from the target dataset will be used. Default is None,
which means |
None
|
plot_diff(ax=None)
Plot the difference between the source and target datasets.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
ax
|
Axes
|
The matplotlib axes on which to plot the difference. If None, a new set of axes will be created. |
None
|
Returns:
Type | Description |
---|---|
Axes
|
The matplotlib axes containing the plot of the difference. |
plot_signed_diff_hist(ax=None, n_bins=50, limits=None, label=None, alpha=1.0)
Plot the histogram of signed difference between datasets.
The signed difference is a dataset where positive values represent areas where the source dataset is larger than the target dataset and negative values represent areas where the source dataset is smaller than the target dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
ax
|
Axes
|
The matplotlib axes on which to plot the histogram. If None, a new set of axes will be created. |
None
|
n_bins
|
int
|
The number of bins to use in the histogram (default is 50). |
50
|
limits
|
tuple[float]
|
The limits of values to include in the histogram (default is None). |
None
|
Returns:
Type | Description |
---|---|
Axes
|
The matplotlib axes containing the plot of the signed difference. |
compute_rmse()
Compute the RMSE between the source and target datasets.
Returns:
Type | Description |
---|---|
float
|
The RMSE between the source and target datasets. |
compute_mae()
Compute the MAE between the source and target datasets.
Returns:
Type | Description |
---|---|
float
|
The mean absolute error between the source and target datasets. |
compute_r2()
Compute the R^2 between the source and target datasets.
Returns:
Type | Description |
---|---|
float
|
The R^2 between the source and target datasets. |
compute_max_abs_error()
Compute the maximum absolute error between datasets.
Returns:
Type | Description |
---|---|
float
|
The maximum absolute error between the source and target datasets. |
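The four metrics above follow their standard definitions, which can be written out on plain lists. This is a sketch, not the Comparison class itself: the function `compare` is hypothetical, the difference is taken as predicted minus true (matching the signed-difference description above), and NaN handling is left out.

```python
import math


def compare(predicted, true):
    """Return RMSE, MAE, maximum absolute error and R^2 for two equal-length lists."""
    diffs = [p - t for p, t in zip(predicted, true)]
    n = len(diffs)
    mean_true = sum(true) / n
    ss_res = sum(d * d for d in diffs)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(d) for d in diffs) / n,
        "max_abs_error": max(abs(d) for d in diffs),
        "r2": 1.0 - ss_res / ss_tot,
    }


report = compare([1.0, 2.0, 4.0], [1.0, 2.0, 3.0])
```

In this example a single error of 1.0 across three points gives an MAE of 1/3 and an R^2 of 0.5.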
compute_report()
save_report(target_dir)
Save a report of the comparison between passed datasets.
This method creates a directory at the specified path and saves a report of the comparison between the source and target datasets in that directory. The report includes plots of the difference and signed difference between the datasets, as well as a CSV file with metrics such as the RMSE, MAE, and maximum absolute error.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target_dir
|
str | PathLike | Path
|
The path to the directory where the report should be saved. |
required |