xcdat.open_mfdataset

xcdat.open_mfdataset(paths, data_var=None, add_bounds=['X', 'Y'], decode_times=True, center_times=False, lon_orient=None, data_vars='minimal', preprocess=None, **kwargs)

Wraps xarray.open_mfdataset() with post-processing options.

Deprecated since version v0.6.0: Boolean values (True/False) for add_bounds are deprecated. Pass either a list of axes (e.g., ["X", "Y"]) or None instead.

Parameters:
  • paths (str | NestedSequence[str | os.PathLike]) – Paths to dataset files. Paths can be given as strings or as pathlib.Path objects. Supported options include:

    • Directory path (e.g., "path/to/files"), which is converted to a string glob of *.nc files

    • String glob (e.g., "path/to/files/*.nc"), which is expanded to a 1-dimensional list of file paths

    • File path to dataset (e.g., "path/to/files/file1.nc")

    • List of file paths (e.g., ["path/to/files/file1.nc", ...]). If concatenation along more than one dimension is desired, then paths must be a nested list-of-lists (see [2] xarray.combine_nested for details).

    • File path to an XML file with a directory attribute (e.g., "path/to/files"). If directory is set to a blank string (""), the current directory (".") is substituted. This option is intended to support the CDAT CDML dialect of XML files, but it can work with any XML file that has the directory attribute. Refer to [4] for more information on CDML. NOTE: This feature is deprecated in v0.6.0 and will be removed in the subsequent release. CDAT (including cdms2/CDML) is in maintenance-only mode and marked for end-of-life by the end of 2023.

  • add_bounds (List[CFAxisKey] | None | bool) – List of CF axes to try to add bounds for (if missing), by default ["X", "Y"]. Set to None to skip adding missing bounds. Note that many xCDAT features require bounds.

  • data_var (Optional[str], optional) – The key of the data variable to keep in the Dataset, by default None.

  • decode_times (bool, optional) – If True, attempt to decode times encoded in the standard NetCDF datetime format into cftime.datetime objects. Otherwise, leave them encoded as numbers, by default True. This keyword may not be supported by all backends.

  • center_times (bool, optional) – If True, attempt to center time coordinates using the midpoint between their upper and lower bounds. Otherwise, use the provided time coordinates, by default False.

  • lon_orient (Optional[Tuple[float, float]], optional) – The orientation to use for the Dataset’s longitude axis (if it exists), by default None. Supported options include:

    • None: use the current orientation (if the longitude axis exists)

    • (-180, 180): represents [-180, 180) in math notation

    • (0, 360): represents [0, 360) in math notation

  • data_vars ({"minimal", "different", "all" or list of str}, optional) –

    These data variables will be concatenated together:

    • "minimal": Only data variables in which the dimension already appears are included (the default).

    • "different": Data variables which are not equal (ignoring attributes) across all datasets are also concatenated (as well as all for which the dimension already appears). Beware: this option may load the data payload of data variables into memory if they are not already loaded.

    • "all": All data variables will be concatenated.

    • list of str: The listed data variables will be concatenated, in addition to the "minimal" data variables.

    The default, "minimal", keeps the concatenation dimension off variables that do not already have it. For example, the time dimension will not be added to non-time data variables such as "lat_bnds" or "lon_bnds". data_vars="minimal" is required for some xCDAT functions, including spatial averaging, where a reduction is performed using the lat/lon bounds.

  • preprocess (Optional[Callable], optional) – If provided, call this function on each dataset prior to concatenation. The name of the file from which each dataset was loaded is available in ds.encoding["source"].

  • kwargs (Dict[str, Any]) – Additional arguments passed on to xarray.open_mfdataset. Refer to the [3] xarray docs for accepted keyword arguments.
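The path forms accepted by paths can be pictured as a small normalization step. The following is a minimal sketch using only the standard library; resolve_paths is a hypothetical helper written for illustration, not xCDAT's own code. It assumes glob results are sorted, and it deliberately leaves out nested lists and XML files.

```python
import glob
import os
import tempfile
from pathlib import Path
from typing import List, Sequence, Union

PathLike = Union[str, os.PathLike]


def resolve_paths(paths: Union[PathLike, Sequence[PathLike]]) -> List[str]:
    """Hypothetical helper: flatten the accepted `paths` forms into file paths.

    Illustrates the documented behavior only. Nested lists and XML files
    are not handled here.
    """
    # A flat list of file paths is passed through as strings.
    if not isinstance(paths, (str, os.PathLike)):
        return [os.fspath(p) for p in paths]

    path = os.fspath(paths)

    # A directory path becomes a string glob of the *.nc files inside it.
    if os.path.isdir(path):
        path = os.path.join(path, "*.nc")

    # A string glob expands to a 1-dimensional list of file paths
    # (sorted here, which is an assumption of this sketch).
    if any(ch in path for ch in "*?["):
        return sorted(glob.glob(path))

    # Anything else is treated as a single dataset file.
    return [path]


# Usage: a throwaway directory with two empty .nc files and one other file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.nc", "a.nc", "notes.txt"):
        Path(tmp, name).touch()
    from_dir = resolve_paths(tmp)
    from_glob = resolve_paths(os.path.join(tmp, "*.nc"))

names_from_dir = [os.path.basename(p) for p in from_dir]
```

In this sketch, the directory form and the equivalent glob form resolve to the same list, and non-NetCDF files in the directory are ignored.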

Returns:

xr.Dataset – The Dataset combined from the input files, with the selected post-processing options applied.
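Two of the post-processing options, center_times and lon_orient, come down to simple arithmetic on coordinate values. The sketch below shows that arithmetic on single values. The helper names center_time and orient_longitude are hypothetical, plain datetime stands in for cftime.datetime, and xCDAT itself works on whole coordinate arrays and may differ in detail.

```python
from datetime import datetime
from typing import Tuple


def center_time(lower: datetime, upper: datetime) -> datetime:
    """Midpoint of a time cell, given its lower and upper bounds."""
    return lower + (upper - lower) / 2


def orient_longitude(lon: float, lon_orient: Tuple[float, float]) -> float:
    """Map a longitude in degrees onto the half-open range [lo, hi)."""
    lo, hi = lon_orient
    if (lo, hi) not in [(-180, 180), (0, 360)]:
        raise ValueError("lon_orient must be (-180, 180) or (0, 360)")
    # Python's % yields a value in [0, 360) for a positive modulus, so the
    # shifted result always lands in [lo, lo + 360).
    return (lon - lo) % 360 + lo


# A January cell is centered on noon of 16 January.
january_mid = center_time(datetime(2000, 1, 1), datetime(2000, 2, 1))

# 190 degrees east is the same meridian as 170 degrees west.
west = orient_longitude(190, (-180, 180))
```

Because both ranges are half-open, 180 maps to -180 under (-180, 180) and 360 maps to 0 under (0, 360).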

Notes

xarray.open_mfdataset opens the file with read-only access. When you modify values of a Dataset, even one linked to files on disk, only the in-memory copy you are manipulating in xarray is modified: the original file on disk is never touched.

The CDAT "Climate Data Markup Language" (CDML) is a deprecated dialect of XML with a defined set of attributes. CDML is still used by current and former users of CDAT. To help CDML users adopt xCDAT in their workflows, xCDAT can parse XML/CDML files for the directory attribute to generate a glob or list of file paths. Refer to [4] for more information on CDML. NOTE: This feature is deprecated in v0.6.0 and will be removed in the subsequent release. CDAT (including cdms2/CDML) is in maintenance-only mode and marked for end-of-life by the end of 2023.
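The XML handling described above can be sketched with the standard library's ElementTree. This is an illustration under stated assumptions, not xCDAT's implementation: the helper names directory_from_xml and glob_from_xml are hypothetical, the directory attribute is assumed to sit on the root element, and the sample XML strings are made up for the example.

```python
import xml.etree.ElementTree as ET


def directory_from_xml(xml_text: str) -> str:
    """Hypothetical helper: read the root element's `directory` attribute."""
    root = ET.fromstring(xml_text)
    directory = root.attrib.get("directory")
    if directory is None:
        raise KeyError("the XML root element has no 'directory' attribute")
    # A blank string stands for the current directory.
    return directory if directory != "" else "."


def glob_from_xml(xml_text: str) -> str:
    """Build a *.nc string glob from the XML `directory` attribute."""
    return f"{directory_from_xml(xml_text)}/*.nc"


# Made-up sample documents; real CDML files carry many more attributes.
explicit = glob_from_xml('<dataset directory="path/to/files"/>')
blank = glob_from_xml('<dataset directory=""/>')
```

The resulting string glob can then be treated like any other glob passed as paths.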

References