Frequently Asked Questions#
What types of datasets does xcdat primarily focus on?#
xcdat supports datasets with structured grids that follow the CF convention, but
will also strive to support datasets with common non-CF compliant metadata (e.g., time
units in “months since …” or “years since …”).
What structured grids does xcdat support?#
xCDAT aims to be a generalizable package that is compatible with structured grids that are CF-compliant (e.g., CMIP6). xCDAT’s horizontal regridder supports grids that are supported by Regrid2 and xESMF (curvilinear and rectilinear).
How does xcdat interpret dataset metadata?#
xcdat leverages cf_xarray to interpret CF attributes on xarray objects. xcdat methods and functions usually accept an axis argument (e.g., ds.temporal.average(data_var="ts", axis="T")). This argument is internally mapped to cf_xarray mapping tables that interpret the CF attributes.
What CF attributes are interpreted using cf_xarray mapping tables?#
Axis names – used to map to dimension coordinates
For example, any variable that has axis: "X" in its attrs will be identified as the “X” axis (typically longitude) coordinate variable by cf_xarray.
Refer to the cf_xarray Axis Names table for more information.
Coordinate names – used to map to dimension coordinates
For example, any variable that has "units": "degrees_north" in its attrs will be identified as the “latitude” coordinate variable by cf_xarray.
Refer to the cf_xarray Coordinate Names table for more information.
Bounds attribute – used to map to bounds data variables
For example, the latitude coordinate variable has bounds: "lat_bnds", which maps its bounds to the lat_bnds data variable.
Refer to the cf_xarray Bounds Variables page for more information.
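The three kinds of lookup above can be illustrated with a small pure-Python sketch. The CRITERIA table, identify() and find_bounds() are hypothetical names invented for this illustration, and the table is a heavily simplified subset; the real criteria live in cf_xarray and are more extensive.

```python
# Hypothetical, heavily simplified criteria table. This is NOT cf_xarray's
# actual table, only an illustration of the attribute-matching idea.
CRITERIA = {
    "X": {"axis": {"X"}},
    "Y": {"axis": {"Y"}},
    "T": {"axis": {"T"}},
    "latitude": {"standard_name": {"latitude"}, "units": {"degrees_north"}},
    "longitude": {"standard_name": {"longitude"}, "units": {"degrees_east"}},
}


def identify(attrs):
    """Return every key whose criteria match at least one attribute."""
    matches = []
    for key, criteria in CRITERIA.items():
        if any(attrs.get(name) in allowed for name, allowed in criteria.items()):
            matches.append(key)
    return matches


def find_bounds(variables, coord_name):
    """Follow a coordinate's "bounds" attribute to the bounds variable name."""
    bounds_name = variables[coord_name].get("bounds")
    return bounds_name if bounds_name in variables else None


# Variables are modeled as plain dicts of attributes for this sketch.
variables = {
    "lat": {"axis": "Y", "units": "degrees_north", "bounds": "lat_bnds"},
    "lon": {"axis": "X", "units": "degrees_east"},
    "lat_bnds": {},
}

print(identify(variables["lat"]))     # axis and coordinate keys for "lat"
print(find_bounds(variables, "lat"))  # name of the bounds variable, if any
```

In this sketch a variable can match both an axis key and a coordinate key, which mirrors why either an axis attribute or a units attribute is enough to locate the coordinate.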
How are bounds generated in xCDAT?#
xCDAT generates bounds by using coordinate points as the midpoint between their lower and upper bounds.
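As a rough illustration of the midpoint idea, here is a minimal pure-Python sketch. It is not xCDAT's implementation: midpoint_bounds() is a hypothetical helper, and the way the two outermost edges are extrapolated (by half the nearest spacing) is an assumption made for this example.

```python
def midpoint_bounds(points):
    """Return (lower, upper) pairs so each point sits between its bounds."""
    if len(points) < 2:
        raise ValueError("need at least two coordinate points")
    # Interior cell edges sit halfway between neighboring points.
    edges = [(a + b) / 2 for a, b in zip(points[:-1], points[1:])]
    # Assumption: outer edges are extrapolated by half the nearest spacing.
    first = points[0] - (points[1] - points[0]) / 2
    last = points[-1] + (points[-1] - points[-2]) / 2
    edges = [first] + edges + [last]
    return list(zip(edges[:-1], edges[1:]))


lat = [-45.0, 0.0, 45.0]
lat_bnds = midpoint_bounds(lat)
print(lat_bnds)  # [(-67.5, -22.5), (-22.5, 22.5), (22.5, 67.5)]
```

For evenly spaced points, as here, every coordinate point is exactly the midpoint of its lower and upper bound.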
Does xCDAT support generating bounds for multiple axis coordinate systems in the same dataset?#
For example, there are two sets of coordinates called “lat” and “latitude” in the dataset.
Yes, xCDAT can generate bounds for axis coordinates if they are “dimension coordinates” (coordinate variables in CF terminology) and have the required CF metadata. “Non-dimension coordinates” (auxiliary coordinate variables in CF terminology) are ignored.
Visit Xarray’s documentation page on Coordinates for more info on “dimension coordinates” vs. “non-dimension coordinates”.
What type of time units are supported?#
The units attribute must be in the CF compliant format
"<units> since <reference_date>". For example,
"days since 1990-01-01".
Supported CF compliant units include day, hour, minute, and second, which are inherited from cftime. Supported non-CF compliant units include year and month, which xcdat is able to parse. The plural forms of these units are also accepted.
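To make the "<units> since <reference_date>" format concrete, the following sketch splits a units string and decodes whole-number month or year offsets. It is not how xcdat decodes time: split_units() and decode() are hypothetical helpers, the standard-library date type (proleptic Gregorian) stands in for cftime objects, and the reference day is assumed to exist in every target month.

```python
from datetime import date


def split_units(units_attr):
    """Split "<units> since <reference_date>" into (unit, reference date)."""
    unit, sep, ref = units_attr.partition(" since ")
    if not sep:
        raise ValueError(
            f"expected '<units> since <reference_date>', got {units_attr!r}"
        )
    unit = unit.strip().lower()
    # Accept plural forms such as "months" or "years".
    if unit.endswith("s"):
        unit = unit[:-1]
    return unit, date.fromisoformat(ref.strip())


def decode(offsets, units_attr):
    """Decode whole-number month or year offsets into dates."""
    unit, ref = split_units(units_attr)
    months_per_unit = {"month": 1, "year": 12}
    if unit not in months_per_unit:
        raise ValueError(f"this sketch only handles months and years, not {unit!r}")
    decoded = []
    for offset in offsets:
        # Count months from year 0, then split back into (year, month).
        total = ref.year * 12 + (ref.month - 1) + offset * months_per_unit[unit]
        year, month0 = divmod(total, 12)
        decoded.append(date(year, month0 + 1, ref.day))
    return decoded


times = decode([0, 1, 12], "months since 2000-01-01")
print(times)  # 2000-01-01, 2000-02-01, 2001-01-01
```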
What type of calendars are supported?#
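xcdat supports the same CF convention calendars as xarray (based on cftime).
Supported calendars include: standard, gregorian, proleptic_gregorian, noleap, 365_day, 360_day, julian, all_leap, and 366_day.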
Why does xcdat decode time coordinates as cftime objects instead of datetime64[ns]?#
One unfortunate limitation of using
datetime64[ns] is that it limits the native
representation of dates to those that fall between the years 1678 and 2262. This affects
climate modeling datasets that have time coordinates outside of this range.
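The 1678 to 2262 range follows from datetime64[ns] storing time as a signed 64-bit count of nanoseconds relative to 1970-01-01. A rough check of that arithmetic, assuming an average Gregorian year of 365.2425 days (the variable names are illustrative only):

```python
# Nanoseconds in an average Gregorian year (assumed to be 365.2425 days).
NS_PER_YEAR = 365.2425 * 24 * 60 * 60 * 1_000_000_000

# A signed 64-bit integer can hold roughly 2**63 nanoseconds on either
# side of the 1970-01-01 epoch.
span_years = 2**63 / NS_PER_YEAR

earliest = 1970 - span_years
latest = 1970 + span_years

print(f"about {span_years:.1f} years on each side of 1970")
print(f"representable range: roughly {earliest:.1f} to {latest:.1f}")
```

The span works out to a little over 292 years in each direction, which is why dates outside late 1677 to early 2262 cannot be represented at nanosecond precision.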
As a workaround,
xarray uses the
cftime library when decoding/encoding
datetimes for non-standard calendars or for dates before year 1678 or after year 2262.
xcdat opted to decode time coordinates exclusively with
cftime because it
has no timestamp range limitations, simplifies implementation, and the output object
type is deterministic.
xcdat aims to implement generalized functionality. This means that functionality
intended to handle data quality issues is out of scope, especially for limited cases.
If data quality issues are present,
xcdat might not be able to open
the datasets. Examples of data quality issues include conflicting floating point values
between files or non-CF compliant attributes that are not common.
A few workarounds include:
Configuring open_mfdataset() keyword arguments based on your needs.
Writing a custom preprocess() function to feed into open_mfdataset(). This function preprocesses each dataset file individually before the files are joined into a single Dataset object.
How do I open a multi-file dataset with values that conflict?#
In xarray, the default setting for checking compatibility across a multi-file dataset is compat='no_conflicts'. If conflicting values exist between files, xarray raises
MergeError: conflicting values for variable <VARIABLE NAME> on objects to be combined.
You can skip this check by specifying compat="override".
If you still intend to work with these datasets and recognize the source of the issue (e.g., minor floating point differences), follow the instructions below. Please understand the potential implications before proceeding!
>>> xcdat.open_mfdataset("path/to/files/*.nc", compat="override", join="override")
compat="override": skip comparing and pick variable from first dataset
join="override": if indexes are of same size, rewrite indexes to be those of the first object with that dimension. Indexes for the same dimension must have the same size in all objects.
For more information, visit this page: https://xarray.pydata.org/en/stable/generated/xarray.open_mfdataset.html#xarray-open-mfdataset