xcdat.bounds.BoundsAccessor#
- class xcdat.bounds.BoundsAccessor(dataset)[source]#
An accessor class that provides bounds attributes and methods on xarray Datasets through the .bounds attribute.
Examples
Import BoundsAccessor class:
>>> import xcdat # or from xcdat import bounds
Use BoundsAccessor class:
>>> ds = xcdat.open_dataset("/path/to/file")
>>>
>>> ds.bounds.<attribute>
>>> ds.bounds.<method>
>>> ds.bounds.<property>
- Parameters
dataset (xr.Dataset) – A Dataset object.
Examples
Import:
>>> from xcdat import bounds
Return dictionary of axis and coordinate keys mapped to bounds:
>>> ds.bounds.map
Return list of keys for bounds data variables:
>>> ds.bounds.keys
Add missing coordinate bounds for supported axes in the Dataset:
>>> ds = ds.bounds.add_missing_bounds(axes=["X", "Y", "T"])
Get coordinate bounds if they exist:
>>> lat_bounds = ds.bounds.get_bounds("Y")
>>> lon_bounds = ds.bounds.get_bounds("X")
>>> time_bounds = ds.bounds.get_bounds("T")
Add coordinate bounds for a specific axis if they don’t exist:
>>> ds = ds.bounds.add_bounds("Y")
Methods
__init__(dataset)
add_bounds(axis) – Add bounds for an axis using its coordinates as midpoints.
add_missing_bounds(axes) – Adds missing coordinate bounds for supported axes in the Dataset.
add_time_bounds(method[, freq, ...]) – Add bounds for an axis using its coordinate points.
get_bounds(axis[, var_key]) – Gets coordinate bounds.
Attributes
keys – Returns a list of keys for the bounds data variables in the Dataset.
map – Returns a map of axis and coordinate keys to their bounds.
- _dataset#
- property map#
Returns a map of axis and coordinate keys to their bounds.
The dictionary provides all valid CF-compliant keys for axes and coordinates. For example, latitude includes the keys “lat”, “latitude”, and “Y”.
- Returns
Dict[str, Optional[xr.DataArray]] – Dictionary mapping axis and coordinate keys to their bounds.
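As a rough illustration of the aliasing described above, the sketch below builds a plain dictionary in which several keys point at the same bounds value. The alias table, the function name, and the string stand-ins for bounds are hypothetical; the real property returns xr.DataArray objects (or None), and how it derives its keys is not shown here.

```python
from typing import Dict, List, Optional

# Hypothetical alias table: each axis key and the coordinate names assumed to share it.
ALIASES: Dict[str, List[str]] = {
    "Y": ["lat", "latitude"],
    "X": ["lon", "longitude"],
}


def build_bounds_map(bounds_by_axis: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Map every axis key and each of its aliases to the same bounds value."""
    mapping: Dict[str, Optional[str]] = {}
    for axis, names in ALIASES.items():
        bounds = bounds_by_axis.get(axis)  # None when the axis has no bounds
        for key in [axis, *names]:
            mapping[key] = bounds
    return mapping


# Strings stand in for bounds DataArrays; "X" has no bounds in this toy input.
bounds_map = build_bounds_map({"Y": "lat_bnds"})
print(bounds_map["latitude"])  # lat_bnds
print(bounds_map["lon"])       # None
```

Looking up “lat”, “latitude”, or “Y” yields the same value, and keys for an axis without bounds map to None, matching the Optional in the return type.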
- property keys#
Returns a list of keys for the bounds data variables in the Dataset.
- Returns
List[str] – A list of sorted bounds data variable keys.
- add_missing_bounds(axes)[source]#
Adds missing coordinate bounds for supported axes in the Dataset.
This function loops through the Dataset’s axes and attempts to add bounds to their coordinates if they don’t exist. “X”, “Y”, and “Z” axis bounds are the midpoints between coordinates. “T” axis bounds are based on the time frequency of the coordinates.
An axis must meet the following criteria for bounds to be added; otherwise it is ignored:
Axis is either “X”, “Y”, “T”, or “Z”
Coordinates are a single dimension, not multidimensional
Coordinates have a length > 1 (not singleton)
Bounds must not already exist
Coordinates are mapped to bounds using the “bounds” attr. For example, bounds exist if ds.time.attrs["bounds"] is set to "time_bnds" and ds.time_bnds is present in the dataset.
For the “T” axis, its coordinates must be composed of datetime-like objects (np.datetime64 or cftime).
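The criteria above can be sketched as a small predicate. The Coord dataclass, its fields, and the function names are hypothetical stand-ins for an xarray coordinate variable and are not xCDAT types; the sketch only mirrors the listed rules.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

SUPPORTED_AXES = {"X", "Y", "T", "Z"}


@dataclass
class Coord:
    """Hypothetical stand-in for a coordinate variable."""

    axis: str
    shape: Tuple[int, ...]
    attrs: Dict[str, str] = field(default_factory=dict)
    datetime_like: bool = False


def bounds_exist(coord: Coord, data_vars: Set[str]) -> bool:
    """Bounds exist if the "bounds" attr names a variable present in the dataset."""
    return coord.attrs.get("bounds") in data_vars


def can_add_bounds(coord: Coord, data_vars: Set[str]) -> bool:
    """Return True only if every listed criterion is met."""
    if coord.axis not in SUPPORTED_AXES:
        return False
    if len(coord.shape) != 1:  # multidimensional coordinates
        return False
    if coord.shape[0] <= 1:  # singleton coordinates
        return False
    if bounds_exist(coord, data_vars):
        return False
    if coord.axis == "T" and not coord.datetime_like:
        return False
    return True


data_vars = {"time_bnds", "tas"}
time = Coord("T", (12,), {"bounds": "time_bnds"}, datetime_like=True)
lat = Coord("Y", (90,))
height = Coord("Z", (1,))

print(can_add_bounds(time, data_vars))    # False: bounds already exist
print(can_add_bounds(lat, data_vars))     # True
print(can_add_bounds(height, data_vars))  # False: singleton
```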
- Parameters
axes (List[str]) – List of CF axes that the function should operate on. Options include “X”, “Y”, “T”, or “Z”.
- Returns
xr.Dataset
- get_bounds(axis, var_key=None)[source]#
Gets coordinate bounds.
- Parameters
axis (CFAxisKey) – The CF axis key (“X”, “Y”, “T”, “Z”).
var_key (Optional[str]) – The key of the coordinate or data variable to get axis bounds for. This parameter is useful if you only want the single bounds DataArray related to the axis on the variable (e.g., “tas” has a “lat” dimension and you want “lat_bnds”).
- Returns
Union[xr.Dataset, xr.DataArray] – A Dataset of N bounds variables, or a single bounds variable DataArray.
- Raises
ValueError – If an incorrect axis argument is passed.
KeyError – If bounds were not found for the specific axis.
- add_bounds(axis)[source]#
Add bounds for an axis using its coordinates as midpoints.
This method loops over the axis’s coordinate variables and attempts to add bounds for each of them if they don’t exist. Each coordinate point is the midpoint between its lower and upper bounds.
To add bounds for an axis, its coordinates must meet the following criteria; otherwise an error is raised:
Axis is either “X”, “Y”, “T”, or “Z”
Coordinates are single dimensional, not multidimensional
Coordinates have a length > 1 (not singleton)
Bounds must not already exist
Coordinates are mapped to bounds using the “bounds” attr. For example, bounds exist if ds.time.attrs["bounds"] is set to "time_bnds" and ds.time_bnds is present in the dataset.
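One way to picture coordinates acting as midpoints is the sketch below. It places interior edges halfway between neighboring points and extrapolates the two outer edges by half of the nearest spacing, so that evenly spaced points sit exactly at the midpoint of their bounds. The function name and the edge treatment are assumptions for illustration and may differ from what xCDAT does internally.

```python
from typing import List, Tuple


def midpoint_bounds(points: List[float]) -> List[Tuple[float, float]]:
    """Build (lower, upper) bounds from a 1D list of coordinate points."""
    if len(points) <= 1:
        raise ValueError("Cannot generate bounds for a singleton coordinate.")

    # Interior edges sit halfway between neighboring points.
    edges = [(a + b) / 2 for a, b in zip(points[:-1], points[1:])]
    # Outer edges are extrapolated by half of the nearest spacing (assumption).
    first = points[0] - (points[1] - points[0]) / 2
    last = points[-1] + (points[-1] - points[-2]) / 2
    edges = [first, *edges, last]

    # Pair consecutive edges into (lower, upper) bounds, one pair per point.
    return list(zip(edges[:-1], edges[1:]))


lat = [-60.0, -30.0, 0.0, 30.0, 60.0]
print(midpoint_bounds(lat))
# [(-75.0, -45.0), (-45.0, -15.0), (-15.0, 15.0), (15.0, 45.0), (45.0, 75.0)]
```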
- Parameters
axis (CFAxisKey) – The CF axis key (“X”, “Y”, “T”, “Z”).
- Returns
xr.Dataset – The dataset with bounds added.
- Raises
- add_time_bounds(method, freq=None, daily_subfreq=None, end_of_month=False)[source]#
Add bounds for an axis using its coordinate points.
This method loops over the time axis coordinate variables and attempts to add bounds for each of them if they don’t exist. To add time bounds for the time axis, its coordinates must meet the following criteria:
Coordinates are single dimensional, not multidimensional
Coordinates have a length > 1 (not singleton)
Bounds must not already exist
Coordinates are mapped to bounds using the “bounds” attr. For example, bounds exist if ds.time.attrs["bounds"] is set to "time_bnds" and ds.time_bnds is present in the dataset.
If method="freq", coordinates must be composed of datetime-like objects (np.datetime64 or cftime)
- Parameters
method ({"freq", "midpoint"}) – The method for creating time bounds for time coordinates, either “freq” or “midpoint”.
“freq”: Create time bounds as the start and end of each timestep’s period using either the inferred or specified time frequency (freq parameter). For example, the time bounds will be the start and end of each month for each monthly coordinate point.
“midpoint”: Create time bounds using time coordinates as the midpoint between their upper and lower bounds.
freq ({"year", "month", "day", "hour"}, optional) – If method="freq", this parameter specifies the time frequency for creating time bounds. By default None, which infers the frequency using the time coordinates.
daily_subfreq ({1, 2, 3, 4, 6, 8, 12, 24}, optional) – If freq=="hour", this parameter sets the number of timepoints per day for time bounds, by default None.
daily_subfreq=None infers the daily time frequency from the time coordinates.
daily_subfreq=1 is daily
daily_subfreq=2 is twice daily
daily_subfreq=4 is 6-hourly
daily_subfreq=8 is 3-hourly
daily_subfreq=12 is 2-hourly
daily_subfreq=24 is hourly
end_of_month (bool, optional) – If freq=="month", this flag notes that the timepoint is saved at the end of the monthly interval (see Note), by default False.
Note
Some timepoints are saved at the end of the interval, e.g., Feb. 1 00:00 for the time interval Jan. 1 00:00 - Feb. 1 00:00. Since this method determines the month and year from the time vector, the bounds will be set incorrectly if the timepoint is set to the end of the time interval. For these cases, set end_of_month=True.
- Returns
xr.Dataset – The dataset with time bounds added.
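The “midpoint” method can be pictured with the standard library alone. The sketch below places interior edges halfway between neighboring timestamps and extrapolates the outer edges by half of the nearest spacing. The helper name and the edge handling are assumptions, and plain datetime objects stand in for np.datetime64 or cftime values. With mid-month timestamps the resulting edges generally do not fall on month boundaries, which is the practical difference from the “freq” method.

```python
from datetime import datetime
from typing import List, Tuple


def time_midpoint_bounds(times: List[datetime]) -> List[Tuple[datetime, datetime]]:
    """Build (lower, upper) time bounds, treating each timestamp as a midpoint."""
    if len(times) <= 1:
        raise ValueError("Cannot generate bounds for a singleton coordinate.")

    # Interior edges sit halfway between neighboring timestamps.
    edges = [a + (b - a) / 2 for a, b in zip(times[:-1], times[1:])]
    # Outer edges are extrapolated by half of the nearest spacing (assumption).
    first = times[0] - (times[1] - times[0]) / 2
    last = times[-1] + (times[-1] - times[-2]) / 2
    edges = [first, *edges, last]

    return list(zip(edges[:-1], edges[1:]))


# Monthly data stamped near the middle of each month (2000 is a leap year).
times = [
    datetime(2000, 1, 16, 12),
    datetime(2000, 2, 15, 12),
    datetime(2000, 3, 16, 12),
]
for lower, upper in time_midpoint_bounds(times):
    print(lower, "->", upper)
```

Here the edge between January and February lands on Jan. 31 12:00 rather than Feb. 1 00:00, so month-aligned bounds call for the “freq” method instead.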
- _drop_ancillary_singleton_coords(coord_vars)[source]#
Drop ancillary singleton coordinates from dimension coordinates.
Xarray coordinate variables retain all coordinates from the parent object. This means if singleton coordinates exist, they are attached to dimension coordinates as ancillary coordinates. For example, the “height” singleton coordinate will be attached to “time” coordinates even though “height” is related to the “Z” axis, not the “T” axis. Refer to [1] for more info on this Xarray behavior.
This behavior is undesirable in xCDAT because the add bounds methods loop over the coordinates related to an axis and attempt to add bounds if they don’t exist. If ancillary coordinates are present, “ValueError: Cannot generate bounds for coordinate variable ‘height’ which has a length <= 1 (singleton)” is raised. For the purpose of adding bounds, we temporarily drop any ancillary singletons from dimension coordinates before looping over those coordinates. Ancillary singletons will still be present in the final Dataset object to maintain the Dataset’s integrity.
- Parameters
coord_vars (Union[xr.Dataset, xr.DataArray]) – The dimension coordinate variables with ancillary coordinates (if they exist).
- Returns
Union[xr.Dataset, xr.DataArray] – The dimension coordinate variables with ancillary coordinates dropped (if they exist).
References
- _get_bounds_keys(axis)[source]#
Get bounds keys for an axis’s coordinate variables in the dataset.
This function attempts to map bounds to an axis using cf_xarray and its interpretation of the CF “bounds” attribute.
- Parameters
axis (CFAxisKey) – The CF axis key (“X”, “Y”, “T”, or “Z”).
- Returns
List[str] – The axis bounds key(s).
- _create_time_bounds(time, freq=None, daily_subfreq=None, end_of_month=False)[source]#
Creates time bounds for each timestep of the time coordinate axis.
This method creates time bounds as the start and end of each timestep’s period using either the inferred or specified time frequency (freq parameter). For example, the time bounds will be the start and end of each month for each monthly coordinate point.
- Parameters
time (xr.DataArray) – The temporal coordinate variable for the axis.
freq ({"year", "month", "day", "hour"}, optional) – The time frequency for creating time bounds, by default None (infer the frequency).
daily_subfreq ({1, 2, 3, 4, 6, 8, 12, 24}, optional) – If freq=="hour", this parameter sets the number of timepoints per day for bounds, by default None. If greater than 1, sub-daily bounds are created.
daily_subfreq=None infers the freq from the time coords (default)
daily_subfreq=1 is daily
daily_subfreq=2 is twice daily
daily_subfreq=4 is 6-hourly
daily_subfreq=8 is 3-hourly
daily_subfreq=12 is 2-hourly
daily_subfreq=24 is hourly
end_of_month (bool, optional) – If freq=="month", this flag notes that the timepoint is saved at the end of the monthly interval (see Note), by default False.
- Returns
xr.DataArray – A DataArray storing bounds for the time axis.
- Raises
ValueError – If coordinates are a singleton.
TypeError – If time coordinates are not composed of datetime-like objects.
Note
Some timepoints are saved at the end of the interval, e.g., Feb. 1 00:00 for the time interval Jan. 1 00:00 - Feb. 1 00:00. Since this function determines the month and year from the time vector, the bounds will be set incorrectly if the timepoint is set to the end of the time interval. For these cases, set end_of_month=True.
- _create_yearly_time_bounds(timesteps, obj_type)[source]#
Creates time bounds for each timestep with the start and end of the year.
Bounds for each timestep correspond to Jan. 1 00:00:00 of the year of the timestep and Jan. 1 00:00:00 of the subsequent year.
- Parameters
timesteps (np.ndarray) – An array of timesteps, represented as either cftime.datetime or pd.Timestamp (cast from np.datetime64[ns] to support pandas time/date components).
obj_type (Union[cftime.datetime, pd.Timestamp]) – The object type for time bounds based on the dtype of time_values.
- Returns
List[Union[cftime.datetime, pd.Timestamp]] – A list of time bound values.
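The yearly rule described above (Jan. 1 00:00:00 of the timestep’s year through Jan. 1 00:00:00 of the next year) can be sketched with the standard library. The function name is hypothetical, plain datetime objects stand in for cftime.datetime or pd.Timestamp, and the obj_type argument is omitted.

```python
from datetime import datetime
from typing import List, Tuple


def yearly_bounds(timesteps: List[datetime]) -> List[Tuple[datetime, datetime]]:
    """Bound each timestep by Jan. 1 of its year and Jan. 1 of the next year."""
    return [(datetime(t.year, 1, 1), datetime(t.year + 1, 1, 1)) for t in timesteps]


timesteps = [datetime(2000, 7, 1, 12), datetime(2001, 7, 1, 12)]
for lower, upper in yearly_bounds(timesteps):
    print(lower, "->", upper)
# 2000-01-01 00:00:00 -> 2001-01-01 00:00:00
# 2001-01-01 00:00:00 -> 2002-01-01 00:00:00
```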
- _create_monthly_time_bounds(timesteps, obj_type, end_of_month=False)[source]#
Creates time bounds for each timestep with the start and end of the month.
Bounds for each timestep correspond to 00:00:00 on the first of the month and 00:00:00 on the first of the subsequent month.
- Parameters
timesteps (np.ndarray) – An array of timesteps, represented as either cftime.datetime or pd.Timestamp (cast from np.datetime64[ns] to support pandas time/date components).
obj_type (Union[cftime.datetime, pd.Timestamp]) – The object type for time bounds based on the dtype of time_values.
end_of_month (bool, optional) – Flag to note that the timepoint is saved at the end of the monthly interval (see Note), by default False.
- Returns
List[Union[cftime.datetime, pd.Timestamp]] – A list of time bound values.
Note
Some timepoints are saved at the end of the interval, e.g., Feb. 1 00:00 for the time interval Jan. 1 00:00 - Feb. 1 00:00. Since this function determines the month and year from the time vector, the bounds will be set incorrectly if the timepoint is set to the end of the time interval. For these cases, set end_of_month=True.
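The effect of end_of_month can be illustrated with a small standard-library sketch. When the flag is set, the sketch steps each timestamp back by one day before reading its month and year, so that Feb. 1 00:00 is attributed to January. That one-day shift is an assumption chosen for illustration, not necessarily the adjustment xCDAT applies; the function names are hypothetical and plain datetime objects stand in for cftime.datetime or pd.Timestamp.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def first_of_next_month(t: datetime) -> datetime:
    """Return 00:00:00 on the first day of the month after t."""
    year, month = (t.year + 1, 1) if t.month == 12 else (t.year, t.month + 1)
    return datetime(year, month, 1)


def monthly_bounds(
    timesteps: List[datetime], end_of_month: bool = False
) -> List[Tuple[datetime, datetime]]:
    """Bound each timestep by the first of its month and the first of the next."""
    bounds = []
    for t in timesteps:
        if end_of_month:
            # Assumption: shift back one day so an end-of-interval stamp
            # (e.g. Feb. 1 00:00) is attributed to the month it closes.
            t = t - timedelta(days=1)
        lower = datetime(t.year, t.month, 1)
        bounds.append((lower, first_of_next_month(t)))
    return bounds


stamp = [datetime(2000, 2, 1)]
print(monthly_bounds(stamp))                     # Feb. 1 to Mar. 1
print(monthly_bounds(stamp, end_of_month=True))  # Jan. 1 to Feb. 1
```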
- _add_months_to_timestep(timestep, obj_type, delta)[source]#
Adds delta month(s) to a timestep.
The delta value can be positive or negative (for subtraction). Refer to [4] for logic.
- Parameters
timestep (
Union[cftime.datime
,pd.Timestamp]
) – A timestep represented ascftime.datetime
orpd.Timestamp
.obj_type (
Union[cftime.datetime
,pd.Timestamp]
) – The object type for time bounds based on the dtype oftimestep
.delta (
int
) – Integer months to be added to times (can be positive or negative)
- Returns
Union[cftime.datetime, pd.Timestamp]
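A minimal sketch of month arithmetic with a positive or negative delta is shown below. It is not the referenced logic; it is one common approach that counts months from year 0 and lets divmod handle year rollover in both directions. The function name is hypothetical, plain datetime objects stand in for cftime.datetime or pd.Timestamp, and clamping the day to the length of the target month is a choice made here.

```python
import calendar
from datetime import datetime


def add_months(timestep: datetime, delta: int) -> datetime:
    """Return timestep shifted by delta months (delta may be negative)."""
    # Total months since year 0; divmod splits it back into year and month index.
    total = timestep.year * 12 + (timestep.month - 1) + delta
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Assumption: clamp the day so that e.g. Jan. 31 + 1 month stays valid.
    day = min(timestep.day, calendar.monthrange(year, month)[1])
    return timestep.replace(year=year, month=month, day=day)


print(add_months(datetime(2000, 12, 1), 1))   # 2001-01-01 00:00:00
print(add_months(datetime(2000, 1, 1), -1))   # 1999-12-01 00:00:00
print(add_months(datetime(2000, 1, 31), 1))   # 2000-02-29 00:00:00
```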
References
- _create_daily_time_bounds(timesteps, obj_type, freq=1)[source]#
Creates time bounds for each timestep with the start and end of the day.
Bounds for each timestep correspond to 00:00:00 on the current day and 00:00:00 on the subsequent day.
If time steps are sub-daily, then the bounds will begin at 00:00 and end at 00:00 of the following day. For example, for 3-hourly data, the bounds would be:
[
    ["01/01/2000 00:00", "01/01/2000 03:00"],
    ["01/01/2000 03:00", "01/01/2000 06:00"],
    ...
    ["01/01/2000 21:00", "02/01/2000 00:00"],
]
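The 3-hourly example above can be reproduced with a short standard-library sketch, where freq is the number of timepoints per day as in the signature. Snapping each timestep down to the start of the window that contains it is an assumption made for illustration; the function name is hypothetical and plain datetime objects stand in for cftime.datetime or pd.Timestamp.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

VALID_FREQS = (1, 2, 3, 4, 6, 8, 12, 24)


def daily_bounds(timesteps: List[datetime], freq: int = 1) -> List[Tuple[datetime, datetime]]:
    """Create daily or sub-daily bounds with freq windows per day."""
    if freq not in VALID_FREQS:
        raise ValueError("Incorrect freq argument. Should be 1, 2, 3, 4, 6, 8, 12, or 24.")

    step = timedelta(hours=24 // freq)  # e.g. freq=8 gives 3-hour windows
    bounds = []
    for t in timesteps:
        midnight = datetime(t.year, t.month, t.day)
        # Assumption: snap down to the start of the window containing t.
        lower = midnight + ((t - midnight) // step) * step
        bounds.append((lower, lower + step))
    return bounds


three_hourly = [datetime(2000, 1, 1, hour) for hour in range(0, 24, 3)]
for lower, upper in daily_bounds(three_hourly, freq=8):
    print(lower, "->", upper)
```

The last pair runs from 21:00 on Jan. 1 to 00:00 on Jan. 2, matching the final row of the example.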
- Parameters
timesteps (np.ndarray) – An array of timesteps, represented as either cftime.datetime or pd.Timestamp (cast from np.datetime64[ns] to support pandas time/date components).
obj_type (Union[cftime.datetime, pd.Timestamp]) – The object type for time bounds based on the dtype of time_values.
freq ({1, 2, 3, 4, 6, 8, 12, 24}, optional) – Number of timepoints per day, by default 1. If greater than 1, sub-daily bounds are created.
freq=1 is daily (default)
freq=2 is twice daily
freq=4 is 6-hourly
freq=8 is 3-hourly
freq=12 is 2-hourly
freq=24 is hourly
- Returns
List[Union[cftime.datetime, pd.Timestamp]] – A list of time bound values.
- Raises
ValueError – If an incorrect freq argument is passed. Should be 1, 2, 3, 4, 6, 8, 12, or 24.
Notes
This function is intended to reproduce CDAT’s setAxisTimeBoundsDaily method [5].
References