Adds missing coordinate bounds for supported axes in the Dataset.
This function loops through the Dataset’s axes and attempts to add
bounds to their coordinates if they don’t exist. “X”, “Y”, and “Z” axis
bounds are the midpoints between coordinates. “T” axis bounds are based
on the time frequency of the coordinates.
An axis must meet the following criteria for bounds to be added;
otherwise, it is ignored:
Axis is either “X”, “Y”, “T”, or “Z”
Coordinates are one-dimensional, not multidimensional
Coordinates have a length > 1 (not singleton)
Bounds must not already exist
Coordinates are mapped to bounds using the “bounds” attr. For
example, bounds exist if ds.time.attrs["bounds"] is set to
"time_bnds" and ds.time_bnds is present in the dataset.
For the “T” axis, its coordinates must be composed of datetime-like
objects (np.datetime64 or cftime). This method is designed to
operate on time axes that have constant temporal resolution with
annual, monthly, daily, or sub-daily time frequencies. Alternate
frequencies (e.g., pentad) are not supported.
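The criteria above can be sketched as a single predicate. This is a minimal illustration and not xCDAT's implementation: the function name `can_add_bounds`, its argument names, and the plain tuple/dict stand-ins for a coordinate's shape, its attrs, and the dataset's variable names are all hypothetical. The datetime-like requirement for the “T” axis is left out for brevity.

```python
from typing import Dict, Tuple

SUPPORTED_AXES = ("X", "Y", "T", "Z")


def can_add_bounds(
    axis: str,
    coord_shape: Tuple[int, ...],
    coord_attrs: Dict[str, str],
    dataset_vars: Tuple[str, ...],
) -> bool:
    """Return True if a coordinate meets the listed criteria (hypothetical helper)."""
    # Criterion 1: the axis must be one of the supported CF axes.
    if axis not in SUPPORTED_AXES:
        return False
    # Criterion 2: the coordinate must be one-dimensional.
    if len(coord_shape) != 1:
        return False
    # Criterion 3: the coordinate must have more than one point.
    if coord_shape[0] <= 1:
        return False
    # Criterion 4: bounds must not already exist. Bounds exist when the
    # "bounds" attr names a variable that is present in the dataset.
    bounds_key = coord_attrs.get("bounds")
    if bounds_key is not None and bounds_key in dataset_vars:
        return False
    return True


# A 12-step time coordinate without bounds is eligible.
print(can_add_bounds("T", (12,), {}, ("time", "tas")))
# The same coordinate is skipped once "time_bnds" is mapped and present.
print(can_add_bounds("T", (12,), {"bounds": "time_bnds"}, ("time", "time_bnds")))
```

Returning a boolean mirrors the "ignored" behavior described here, where ineligible axes are skipped rather than causing an error.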
Parameters:
axes (List[CFAxesKey]|Tuple[CFAxisKey, ]) – List of CF axes that the function should operate on, by default
(“X”, “Y”, “T”). Options include “X”, “Y”, “T”, or “Z”.
var_key (Optional[str]) – The key of the coordinate or data variable to get axis bounds for.
This parameter is useful if you only want the single bounds
DataArray related to the axis on the variable (e.g., “tas” has
a “lat” dimension and you want “lat_bnds”).
Returns:
Union[xr.Dataset, xr.DataArray] – A Dataset of N bounds variables, or a single bounds variable
DataArray.
Raises:
ValueError – If an incorrect axis argument is passed.
KeyError – If bounds were not found for the specific axis.
Add bounds for an axis using its coordinates as midpoints.
This method loops over the axis’s coordinate variables and attempts to
add bounds for each of them if they don’t exist. Each coordinate point
is the midpoint between its lower and upper bounds.
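A minimal sketch of the midpoint idea, using plain Python lists rather than xarray objects. The function name `midpoint_bounds` is hypothetical, and the handling of the two outermost edges (reusing the first and last spacing) is an assumption made for illustration rather than a statement about the library's behavior. The sketch also rejects singleton input, since no spacing can be computed from a single point.

```python
from typing import List


def midpoint_bounds(points: List[float]) -> List[List[float]]:
    """Build [lower, upper] pairs for 1-D coordinate points (illustrative only)."""
    if len(points) <= 1:
        raise ValueError("Cannot generate bounds for a singleton coordinate.")

    # Spacing between neighbouring points, padded at both ends by repeating
    # the first and last spacing (an assumption about edge handling).
    diffs = [b - a for a, b in zip(points[:-1], points[1:])]
    diffs = [diffs[0]] + diffs + [diffs[-1]]

    bounds = []
    for i, point in enumerate(points):
        lower = point - 0.5 * diffs[i]      # halfway back to the previous point
        upper = point + 0.5 * diffs[i + 1]  # halfway forward to the next point
        bounds.append([lower, upper])
    return bounds


print(midpoint_bounds([0.0, 1.0, 2.0]))
# [[-0.5, 0.5], [0.5, 1.5], [1.5, 2.5]]
```

For evenly spaced coordinates each point sits exactly at the center of its cell; for uneven spacing, the cell edges fall midway between neighbouring points.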
To add bounds for an axis, its coordinates must meet the following
criteria; otherwise, an error is raised:
Axis is either “X”, “Y”, “T”, or “Z”
Coordinates are one-dimensional, not multidimensional
Coordinates have a length > 1 (not singleton)
Bounds must not already exist
Coordinates are mapped to bounds using the “bounds” attr. For
example, bounds exist if ds.time.attrs["bounds"] is set to
"time_bnds" and ds.time_bnds is present in the dataset.
Add bounds for an axis using its coordinate points.
This method is designed to operate on time axes that have constant temporal
resolution with annual, monthly, daily, or sub-daily time frequencies.
Alternate frequencies (e.g., pentad) are not supported. It loops over
the time axis coordinate variables and attempts to add bounds for each
of them if they don’t exist.
To add time bounds for the time axis, its coordinates must meet the
following criteria:
Coordinates are one-dimensional, not multidimensional
Coordinates have a length > 1 (not singleton)
Bounds must not already exist
Coordinates are mapped to bounds using the “bounds” attr. For
example, bounds exist if ds.time.attrs["bounds"] is set to
"time_bnds" and ds.time_bnds is present in the dataset.
If method="freq", coordinates must be composed of datetime-like
objects (np.datetime64 or cftime)
Parameters:
method ({"freq","midpoint"}) – The method for creating time bounds for time coordinates, either
“freq” or “midpoint”.
“freq”: Create time bounds as the start and end of each timestep’s
period using either the inferred or specified time frequency
(freq parameter). For example, the time bounds will be the
start and end of each month for each monthly coordinate point.
“midpoint”: Create time bounds using time coordinates as the
midpoint between their upper and lower bounds.
freq ({"year","month","day","hour"}, optional) – If method="freq", this parameter specifies the time frequency
for creating time bounds. By default None, which infers the
frequency using the time coordinates.
daily_subfreq ({1,2,3,4,6,8,12,24}, optional) – If freq=="hour", this parameter sets the number of timepoints
per day for time bounds, by default None.
daily_subfreq=None infers the daily time frequency from the
time coordinates.
daily_subfreq=1 is daily
daily_subfreq=2 is twice daily
daily_subfreq=4 is 6-hourly
daily_subfreq=8 is 3-hourly
daily_subfreq=12 is 2-hourly
daily_subfreq=24 is hourly
end_of_month (bool, optional) – If freq=="month", this flag notes that the timepoint is saved
at the end of the monthly interval (see Note), by default False.
Note
Some timepoints are saved at the end of the interval, e.g., Feb. 1
00:00 for the time interval Jan. 1 00:00 - Feb. 1 00:00. Since this
method determines the month and year from the time vector, the
bounds will be set incorrectly if the timepoint is set to the end of
the time interval. For these cases, set end_of_month=True.
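Returning to the daily_subfreq values listed above, each one corresponds to an interval length of 24 divided by the number of timepoints per day. The short sketch below makes that arithmetic explicit; the helper name `hours_per_interval` is hypothetical and is not part of the documented API.

```python
VALID_DAILY_SUBFREQS = (1, 2, 3, 4, 6, 8, 12, 24)


def hours_per_interval(daily_subfreq: int) -> int:
    """Length in hours of each bounds interval (hypothetical helper)."""
    if daily_subfreq not in VALID_DAILY_SUBFREQS:
        raise ValueError(f"daily_subfreq must be one of {VALID_DAILY_SUBFREQS}.")
    # Every valid value divides 24 evenly, so integer division is exact.
    return 24 // daily_subfreq


for n in VALID_DAILY_SUBFREQS:
    print(f"daily_subfreq={n}: {hours_per_interval(n)}-hour intervals")
```

The same arithmetic gives 8-hour intervals for daily_subfreq=3 and 4-hour intervals for daily_subfreq=6.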
Drop ancillary singleton coordinates from dimension coordinates.
Xarray coordinate variables retain all coordinates from the parent
object. This means if singleton coordinates exist, they are attached to
dimension coordinates as ancillary coordinates. For example, the
“height” singleton coordinate will be attached to “time” coordinates
even though “height” is related to the “Z” axis, not the “T” axis.
Refer to [1] for more info on this Xarray behavior.
This is an undesirable behavior in xCDAT because the add bounds methods
loop over coordinates related to an axis and attempt to add bounds if
they don’t exist. If ancillary coordinates are present, “ValueError:
Cannot generate bounds for coordinate variable ‘height’ which has a
length <= 1 (singleton)” is raised. For the purpose of adding bounds, we
temporarily drop any ancillary singletons from dimension coordinates
before looping over those coordinates. Ancillary singletons will still
be present in the final Dataset object to maintain the Dataset’s
integrity.
Parameters:
coord_vars (Union[xr.Dataset, xr.DataArray]) – The dimension coordinate variables with ancillary coordinates (if
they exist).
Returns:
Union[xr.Dataset, xr.DataArray] – The dimension coordinate variables with ancillary coordinates
dropped (if they exist).
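The selection logic can be sketched without xarray by treating a coordinate variable as a tuple of dimension names plus a dict of attached coordinates. Both the function name `drop_ancillary_singletons` and this simplified data model are assumptions made for illustration; with real xarray objects, a comparable effect can be had by passing the ancillary names to `drop_vars`.

```python
from typing import Dict, Tuple


def drop_ancillary_singletons(
    dims: Tuple[str, ...], coords: Dict[str, object]
) -> Dict[str, object]:
    """Keep only coordinates that are also dimensions (illustrative only)."""
    # Coordinates attached to the variable that are not among its dimensions
    # are treated as ancillary, e.g. a scalar "height" attached to "time".
    ancillary = set(coords) - set(dims)
    return {name: values for name, values in coords.items() if name not in ancillary}


time_coords = {"time": ["2000-01", "2000-02"], "height": 2.0}
print(drop_ancillary_singletons(("time",), time_coords))
# {'time': ['2000-01', '2000-02']}
```

The input dict is left unmodified, which matches the idea that the ancillary coordinates are only set aside temporarily.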
Creates time bounds for each timestep of the time coordinate axis.
This method creates time bounds as the start and end of each timestep’s
period using either the inferred or specified time frequency (freq
parameter). For example, the time bounds will be the start and end of
each month for each monthly coordinate point.
Parameters:
time (xr.DataArray) – The temporal coordinate variable for the axis.
freq ({"year","month","day","hour"}, optional) – The time frequency for creating time bounds, by default None (infer
the frequency).
daily_subfreq ({1,2,3,4,6,8,12,24}, optional) – If freq=="hour", this parameter sets the number of timepoints
per day for bounds, by default None. If greater than 1, sub-daily
bounds are created.
daily_subfreq=None infers the freq from the time coords (default)
daily_subfreq=1 is daily
daily_subfreq=2 is twice daily
daily_subfreq=4 is 6-hourly
daily_subfreq=8 is 3-hourly
daily_subfreq=12 is 2-hourly
daily_subfreq=24 is hourly
end_of_month (bool, optional) – If freq=="month", this flag notes that the timepoint is saved
at the end of the monthly interval (see Note), by default False.
Returns:
xr.DataArray – A DataArray storing bounds for the time axis.
Raises:
ValueError – If coordinates are a singleton.
TypeError – If time coordinates are not composed of datetime-like objects.
Note
Some timepoints are saved at the end of the interval, e.g., Feb. 1 00:00
for the time interval Jan. 1 00:00 - Feb. 1 00:00. Since this function
determines the month and year from the time vector, the bounds will be set
incorrectly if the timepoint is set to the end of the time interval. For
these cases, set end_of_month=True.
Creates time bounds for each timestep with the start and end of the year.
Bounds for each timestep correspond to Jan. 1 00:00:00 of the year of the
timestep and Jan. 1 00:00:00 of the subsequent year.
Parameters:
timesteps (np.ndarray) – An array of timesteps, represented as either cftime.datetime or
pd.Timestamp (cast from np.datetime64[ns] to support pandas
time/date components).
obj_type (Union[cftime.datetime, pd.Timestamp]) – The object type for time bounds based on the dtype of
time_values.
Returns:
List[Union[cftime.datetime, pd.Timestamp]] – A list of time bound values.
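A minimal sketch of the yearly case, assuming standard-library `datetime` objects in place of cftime.datetime or pd.Timestamp. The function name `yearly_bounds` is hypothetical, and the obj_type argument is omitted because only one object type is used here.

```python
from datetime import datetime
from typing import List


def yearly_bounds(timesteps: List[datetime]) -> List[List[datetime]]:
    """Return [start of year, start of next year] for each timestep (illustrative)."""
    bounds = []
    for step in timesteps:
        lower = datetime(step.year, 1, 1)      # Jan. 1 00:00:00 of this year
        upper = datetime(step.year + 1, 1, 1)  # Jan. 1 00:00:00 of next year
        bounds.append([lower, upper])
    return bounds


for lower, upper in yearly_bounds([datetime(2000, 7, 1, 12), datetime(2001, 7, 1, 12)]):
    print(lower.isoformat(), upper.isoformat())
```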
Creates time bounds for each timestep with the start and end of the month.
Bounds for each timestep correspond to 00:00:00 on the first of the month
and 00:00:00 on the first of the subsequent month.
Parameters:
timesteps (np.ndarray) – An array of timesteps, represented as either cftime.datetime or
pd.Timestamp (cast from np.datetime64[ns] to support pandas
time/date components).
obj_type (Union[cftime.datetime, pd.Timestamp]) – The object type for time bounds based on the dtype of
time_values.
end_of_month (bool, optional) – Flag to note that the timepoint is saved at the end of the monthly
interval (see Note), by default False.
Returns:
List[Union[cftime.datetime, pd.Timestamp]] – A list of time bound values.
Note
Some timepoints are saved at the end of the interval, e.g., Feb. 1 00:00
for the time interval Jan. 1 00:00 - Feb. 1 00:00. Since this function
determines the month and year from the time vector, the bounds will be set
incorrectly if the timepoint is set to the end of the time interval. For
these cases, set end_of_month=True.
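The effect of end_of_month can be illustrated with a short sketch based on standard-library `datetime` objects. The function name `monthly_bounds` is hypothetical, and shifting the timepoint back by one day when end_of_month=True is an assumption chosen for illustration; the offset actually used by the library is not stated here.

```python
from datetime import datetime, timedelta
from typing import List


def monthly_bounds(
    timesteps: List[datetime], end_of_month: bool = False
) -> List[List[datetime]]:
    """Return [first of month, first of next month] per timestep (illustrative)."""
    bounds = []
    for step in timesteps:
        if end_of_month:
            # Assumption: nudge the timepoint back one day so that it falls
            # inside the month it represents before reading year and month.
            step = step - timedelta(days=1)
        lower = datetime(step.year, step.month, 1)
        # First of the following month, rolling over the year after December.
        if step.month == 12:
            upper = datetime(step.year + 1, 1, 1)
        else:
            upper = datetime(step.year, step.month + 1, 1)
        bounds.append([lower, upper])
    return bounds


feb_first = [datetime(2000, 2, 1)]
print(monthly_bounds(feb_first))                     # treated as a February point
print(monthly_bounds(feb_first, end_of_month=True))  # treated as the end of January
```

With the default setting, a Feb. 1 00:00 timepoint is assigned February bounds; with end_of_month=True in this sketch, the same timepoint is assigned Jan. 1 00:00 to Feb. 1 00:00.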
Creates time bounds for each timestep with the start and end of the day.
Bounds for each timestep correspond to 00:00:00 on the current day and
00:00:00 on the subsequent day.
If time steps are sub-daily, then the bounds for each day will begin at
00:00 and end at 00:00 of the following day. For example, for 3-hourly
data, the bounds would be [00:00, 03:00], [03:00, 06:00], and so on through
[21:00, 00:00 of the following day].
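A minimal sketch of the daily and sub-daily cases, assuming standard-library `datetime` objects in place of cftime.datetime or pd.Timestamp. The function name `daily_bounds` is hypothetical, and the approach of flooring each timestep's hour to the start of its interval is an illustration rather than the library's implementation.

```python
from datetime import datetime, timedelta
from typing import List


def daily_bounds(timesteps: List[datetime], freq: int = 1) -> List[List[datetime]]:
    """Return [interval start, interval end] per timestep (illustrative only)."""
    if freq not in (1, 2, 3, 4, 6, 8, 12, 24):
        raise ValueError("freq must be 1, 2, 3, 4, 6, 8, 12, or 24.")

    width = 24 // freq  # interval length in hours, e.g. 3 for freq=8
    bounds = []
    for step in timesteps:
        # Floor the hour to the start of the interval containing the timestep.
        start_hour = (step.hour // width) * width
        lower = datetime(step.year, step.month, step.day, start_hour)
        upper = lower + timedelta(hours=width)  # may roll over to the next day
        bounds.append([lower, upper])
    return bounds


steps = [datetime(2000, 1, 1, 1, 30), datetime(2000, 1, 1, 22, 30)]
for lower, upper in daily_bounds(steps, freq=8):
    print(lower.isoformat(), upper.isoformat())
```

With freq=1 the interval width is 24 hours, so every timestep on a given day receives bounds from 00:00 of that day to 00:00 of the next.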
Parameters:
timesteps (np.ndarray) – An array of timesteps, represented as either cftime.datetime or
pd.Timestamp (cast from np.datetime64[ns] to support pandas
time/date components).
obj_type (Union[cftime.datetime, pd.Timestamp]) – The object type for time bounds based on the dtype of
time_values.
freq ({1,2,3,4,6,8,12,24}, optional) – Number of timepoints per day, by default 1. If greater than 1, sub-daily
bounds are created.
freq=1 is daily (default)
freq=2 is twice daily
freq=4 is 6-hourly
freq=8 is 3-hourly
freq=12 is 2-hourly
freq=24 is hourly
Returns:
List[Union[cftime.datetime, pd.Timestamp]] – A list of time bound values.
Raises:
ValueError – If an incorrect freq argument is passed. Should be 1, 2, 3, 4, 6, 8,
12, or 24.
Notes
This function is intended to reproduce CDAT’s setAxisTimeBoundsDaily
method [5].