xcdat.open_mfdataset

xcdat.open_mfdataset(paths, data_var=None, add_bounds=('X', 'Y'), decode_times=True, center_times=False, lon_orient=None, data_vars='minimal', preprocess=None, *, compat='no_conflicts', join='outer', **kwargs)

Wraps xarray.open_mfdataset() with post-processing options.

Parameters:
  • paths (str | NestedSequence[str | os.PathLike]) – Paths to dataset files. Paths can be given as strings or as pathlib.Path objects. Supported options include:

    • Directory path (e.g., "path/to/files"), which is converted to a string glob of *.nc files

    • String glob (e.g., "path/to/files/*.nc"), which is expanded to a 1-dimensional list of file paths

    • File path to dataset (e.g., "path/to/files/file1.nc")

    • List of file paths (e.g., ["path/to/files/file1.nc", ...]). If concatenation along more than one dimension is desired, then paths must be a nested list-of-lists (see [2] xarray.combine_nested for details).
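As a rough illustration of the path options above, the following sketch resolves each kind of input to a flat list of file paths using only the standard library. The helper name `expand_paths` is hypothetical and is not part of xCDAT or xarray; sorting the glob results and skipping nested lists are simplifying assumptions of this sketch.

```python
import os
import tempfile
from glob import glob
from pathlib import Path


def expand_paths(paths):
    """Hypothetical helper: resolve `paths` to a flat list of file paths."""
    if isinstance(paths, (str, os.PathLike)):
        text = os.fspath(paths)
        if os.path.isdir(text):
            # Directory path: convert it to a string glob of its *.nc files.
            text = os.path.join(text, "*.nc")
        if any(ch in text for ch in "*?["):
            # String glob: expand to a sorted 1-dimensional list of paths.
            return sorted(glob(text))
        # Single file path.
        return [text]
    # Flat list of file paths (nested lists are not handled in this sketch).
    return [os.fspath(p) for p in paths]


with tempfile.TemporaryDirectory() as tmp:
    # Create two empty "datasets" and one unrelated file.
    for name in ("file1.nc", "file2.nc", "notes.txt"):
        Path(tmp, name).touch()

    from_dir = [Path(p).name for p in expand_paths(tmp)]
    from_glob = [Path(p).name for p in expand_paths(os.path.join(tmp, "file*.nc"))]

from_file = expand_paths("path/to/files/file1.nc")
print(from_dir)   # ['file1.nc', 'file2.nc']
print(from_glob)  # ['file1.nc', 'file2.nc']
print(from_file)  # ['path/to/files/file1.nc']
```

Note that the directory form picks up only `*.nc` files, so `notes.txt` is ignored.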

  • add_bounds (list[CFAxisKey] | tuple[CFAxisKey, ] | None) – CF axes for which to add bounds if they are missing, by default ("X", "Y"). Set to None to skip adding missing bounds. Note that many xCDAT features require bounds.

  • data_var (str | None, optional) – The key of the data variable to keep in the Dataset, by default None.

  • decode_times (bool, optional) – If True, attempt to decode times encoded in the standard NetCDF datetime format into cftime.datetime objects. Otherwise, leave them encoded as numbers. Not all backends support this keyword. By default True.

  • center_times (bool, optional) – If True, attempt to center each time coordinate on the midpoint between its lower and upper bounds. Otherwise, use the provided time coordinates. By default False.
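The midpoint calculation behind time centering can be sketched as follows. This is a minimal illustration using the standard library's `datetime` rather than the cftime objects that xCDAT decodes to, and the function name `center_time` and the January bounds are hypothetical.

```python
from datetime import datetime


def center_time(lower, upper):
    """Return the midpoint between a lower and an upper time bound."""
    return lower + (upper - lower) / 2


# Hypothetical monthly bounds covering January 2000.
lower = datetime(2000, 1, 1)
upper = datetime(2000, 2, 1)

centered = center_time(lower, upper)
print(centered)  # 2000-01-16 12:00:00
```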

  • lon_orient (tuple[float, float] | None, optional) – The orientation to use for the Dataset’s longitude axis (if it exists), by default None. Supported options include:

    • None: use the current orientation (if the longitude axis exists)

    • (-180, 180): represents [-180, 180) in math notation

    • (0, 360): represents [0, 360) in math notation
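To make the two half-open intervals concrete, here is a small sketch that maps individual longitude values into either orientation. The function `orient_longitude` is hypothetical and only illustrates the arithmetic; it does not reproduce any other work xCDAT may do on the axis or its bounds.

```python
def orient_longitude(lon, orient):
    """Map a longitude in degrees into the half-open interval [lower, lower + 360)."""
    lower, _upper = orient
    return (lon - lower) % 360 + lower


# Values on a [0, 360) axis converted to [-180, 180).
to_180 = [orient_longitude(v, (-180, 180)) for v in [0.0, 90.0, 180.0, 270.0, 359.0]]
print(to_180)  # [0.0, 90.0, -180.0, -90.0, -1.0]

# Values on a [-180, 180) axis converted to [0, 360).
to_360 = [orient_longitude(v, (0, 360)) for v in [-180.0, -90.0, 0.0, 90.0, 179.0]]
print(to_360)  # [180.0, 270.0, 0.0, 90.0, 179.0]
```

Because the upper end of each interval is excluded, 180 maps to -180 in the first orientation.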

  • data_vars ({"minimal", "different", "all" or list of str}, optional) –

    These data variables will be concatenated together:

    • “minimal” (default): Only data variables in which the dimension already appears are included.

    • “different”: Data variables that are not equal (ignoring attributes) across all datasets are also concatenated, in addition to all variables in which the dimension already appears. Beware: this option may load the data payload of data variables into memory if they are not already loaded.

    • “all”: All data variables will be concatenated.

    • list of str: The listed data variables will be concatenated, in addition to the “minimal” data variables.

    data_vars defaults to "minimal", so only data variables in which the concatenation dimension already appears are concatenated. For example, the time dimension is not added to non-time data variables such as "lat_bnds" or "lon_bnds". Some xCDAT functions require data_vars="minimal", including spatial averaging, where a reduction is performed using the lat/lon bounds.
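The selection rule can be sketched without xarray by describing each variable only by its dimension names. The helper `vars_to_concat` and the variable layout are hypothetical; the "different" option is left out because it depends on comparing values across files.

```python
def vars_to_concat(var_dims, concat_dim, data_vars="minimal"):
    """Return the names of the variables that would be concatenated."""
    if data_vars == "all":
        return sorted(var_dims)
    # "minimal": only variables that already have the concatenation dimension.
    minimal = {name for name, dims in var_dims.items() if concat_dim in dims}
    if data_vars == "minimal":
        return sorted(minimal)
    if isinstance(data_vars, str):
        raise ValueError(f"option not covered by this sketch: {data_vars!r}")
    # List of str: the listed variables in addition to the "minimal" ones.
    return sorted(minimal | set(data_vars))


# Hypothetical dataset layout: variable name -> dimension names.
var_dims = {
    "tas": ("time", "lat", "lon"),
    "time_bnds": ("time", "bnds"),
    "lat_bnds": ("lat", "bnds"),
    "lon_bnds": ("lon", "bnds"),
}

print(vars_to_concat(var_dims, "time"))                # ['tas', 'time_bnds']
print(vars_to_concat(var_dims, "time", "all"))         # ['lat_bnds', 'lon_bnds', 'tas', 'time_bnds']
print(vars_to_concat(var_dims, "time", ["lat_bnds"]))  # ['lat_bnds', 'tas', 'time_bnds']
```

With "minimal", `lat_bnds` and `lon_bnds` keep their original two dimensions and do not gain a time dimension.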

  • preprocess (Callable | None, optional) – If provided, call this function on each dataset prior to concatenation. The name of the file from which each dataset was loaded is available in ds.encoding["source"].
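A preprocess callable might look like the sketch below, which copies the source file name into the dataset's attributes. The function name `tag_source_file` and the attribute key `source_file` are hypothetical, and a `SimpleNamespace` stands in for a real dataset so that the example runs without xarray installed.

```python
import os
from types import SimpleNamespace


def tag_source_file(ds):
    """Record the originating file name in the dataset's attributes."""
    ds.attrs["source_file"] = os.path.basename(ds.encoding["source"])
    return ds


# Stand-in for a dataset: only the .encoding and .attrs mappings are needed here.
fake_ds = SimpleNamespace(encoding={"source": "path/to/files/file1.nc"}, attrs={})

tagged = tag_source_file(fake_ds)
print(tagged.attrs["source_file"])  # file1.nc
```

Such a function would be passed as `preprocess=tag_source_file`. How per-file attributes are merged in the combined result depends on xarray's attribute-combining settings, so treat this only as an illustration of the callable's shape.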

  • compat ({"no_conflicts", "broadcast_equals", "override", "equals", "identical"}, optional) – String indicating how to compare variables of the same name for potential conflicts when merging. Defaults to "no_conflicts" to preserve legacy Xarray behavior ("override" is the new Xarray default as of v2025.08.0). Options include:

    • “no_conflicts” (default): only values which are not null in both datasets must be equal. The returned dataset then contains the combination of all non-null values

    • “broadcast_equals”: all values must be equal when variables are broadcast against each other to ensure common dimensions

    • “equals”: all values and dimensions must be the same

    • “identical”: all values, dimensions and attributes must be the same

    • “override”: skip comparing and pick variable from first dataset. This is the new Xarray default behavior.
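The difference between some of these modes can be sketched on plain lists, with `None` standing in for a null value. The helper `merge_values` is hypothetical and covers only "no_conflicts", "equals", and "override"; it is not how xarray implements merging.

```python
def merge_values(first, second, compat="no_conflicts"):
    """Merge two equal-length value lists; None stands in for a null value."""
    if compat == "override":
        # Skip the comparison and take the variable from the first dataset.
        return list(first)
    if compat == "equals":
        if list(first) != list(second):
            raise ValueError("values are not equal")
        return list(first)
    if compat == "no_conflicts":
        merged = []
        for a, b in zip(first, second):
            # Only positions that are non-null in both inputs must agree.
            if a is not None and b is not None and a != b:
                raise ValueError(f"conflicting values: {a!r} != {b!r}")
            merged.append(a if a is not None else b)
        return merged
    raise ValueError(f"mode not covered by this sketch: {compat!r}")


first = [1.0, None, 3.0]
second = [1.0, 2.0, None]

print(merge_values(first, second, "no_conflicts"))  # [1.0, 2.0, 3.0]
print(merge_values(first, second, "override"))      # [1.0, None, 3.0]
```

With "equals", the same two inputs raise `ValueError` because they differ at the null positions.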

  • join ({"outer", "exact", "left", "right", "inner", "override"}, optional) – String indicating how to combine differing indexes (excluding concat_dim) in objects. Defaults to "outer" to preserve legacy Xarray behavior ("exact" is the new Xarray default as of v2025.08.0). Options include:

    • “outer” (default): use the union of object indexes

    • “inner”: use the intersection of object indexes

    • “left”: use indexes from the first object with each dimension

    • “right”: use indexes from the last object with each dimension

    • “exact”: instead of aligning, raise ValueError when indexes to be aligned are not equal. This is the new Xarray default behavior.

    • “override”: if indexes are of same size, rewrite indexes to be those of the first object with that dimension. Indexes for the same dimension must have the same size in all objects.
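The join options can be sketched on plain lists of index labels. The helper `join_indexes` is hypothetical, and returning the union and intersection in sorted order is an assumption of this sketch rather than a statement about xarray's internals.

```python
def join_indexes(indexes, join="outer"):
    """Combine a list of index-label lists according to the join mode."""
    first, last = indexes[0], indexes[-1]
    if join == "outer":
        return sorted(set().union(*indexes))
    if join == "inner":
        return sorted(set(first).intersection(*indexes[1:]))
    if join == "left":
        return list(first)
    if join == "right":
        return list(last)
    if join == "exact":
        # Do not align; fail if any index differs from the first.
        if any(list(ix) != list(first) for ix in indexes):
            raise ValueError("indexes to be aligned are not equal")
        return list(first)
    if join == "override":
        # Sizes must match; the first object's labels are reused.
        if any(len(ix) != len(first) for ix in indexes):
            raise ValueError("indexes must have the same size")
        return list(first)
    raise ValueError(f"unknown join: {join!r}")


a = [0, 1, 2]
b = [1, 2, 3]

print(join_indexes([a, b], "outer"))     # [0, 1, 2, 3]
print(join_indexes([a, b], "inner"))     # [1, 2]
print(join_indexes([a, b], "left"))      # [0, 1, 2]
print(join_indexes([a, b], "right"))     # [1, 2, 3]
print(join_indexes([a, b], "override"))  # [0, 1, 2]
```

With "exact", the same inputs raise `ValueError` because the two indexes are not equal.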

  • **kwargs (dict[str, Any]) – Additional arguments passed on to xarray.open_mfdataset. Refer to the [3] xarray docs for accepted keyword arguments.

Returns:

xr.Dataset – The Dataset combined from the input files, with the requested post-processing applied.
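A call that combines several of the parameters above might look like the following sketch. The wrapper name `open_surface_temperature` and the variable key "tas" are hypothetical, and the glob is the placeholder path used in the parameter descriptions; the keyword arguments are taken from the signature at the top of this page.

```python
def open_surface_temperature(paths="path/to/files/*.nc"):
    """Open a multi-file dataset with xCDAT post-processing applied."""
    # Imported inside the function so that defining it does not require xCDAT.
    import xcdat

    return xcdat.open_mfdataset(
        paths,
        data_var="tas",          # hypothetical key of the variable to keep
        add_bounds=("X", "Y"),   # add missing lat/lon bounds (the default)
        center_times=True,       # center time coordinates on their bounds
        lon_orient=(0, 360),     # orient longitude to [0, 360)
        compat="no_conflicts",   # the documented defaults, stated explicitly
        join="outer",
    )
```

Calling `open_surface_temperature()` requires xCDAT to be installed and the files to exist.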

Notes

xarray.open_mfdataset opens files with read-only access. When you modify values of a Dataset, even one linked to files on disk, only the in-memory copy you are manipulating in xarray is modified: the original file on disk is never touched.

References