{ "cells": [ { "cell_type": "markdown", "id": "c94bbff9", "metadata": {}, "source": [ "# General Dataset Utilities\n", "\n", "Authors:\n", "\n", "* [Tom Vo](https://github.com/tomvothecoder/)\n", "* [Stephen Po-Chedley](https://github.com/pochedls/)\n", "\n", "\n", "Date: 05/26/22" ] }, { "cell_type": "markdown", "id": "ef67fe43", "metadata": {}, "source": [ "## Overview\n", "\n", "This notebook demonstrates the use of general utility methods available in `xcdat`, including\n", "the reorientation of the longitude axis, centering of time coordinates using time bounds, and\n", "adding and getting bounds." ] }, { "cell_type": "code", "execution_count": 1, "id": "0b4e8461", "metadata": {}, "outputs": [], "source": [ "import xcdat" ] }, { "cell_type": "markdown", "id": "6078eb43", "metadata": {}, "source": [ "## Open a dataset\n", "\n", "Datasets can be opened and read using `open_dataset()` or `open_mfdataset()` (multi-file).\n", "\n", "Related APIs:\n", "\n", "* [xcdat.open_dataset()](../generated/xcdat.open_dataset.rst)\n", "* [xcdat.open_mfdataset()](../generated/xcdat.open_mfdataset.rst)" ] }, { "cell_type": "code", "execution_count": 2, "id": "e83b0a2b", "metadata": {}, "outputs": [], "source": [ "dataset_links = [\n", " \"https://esgf-data2.llnl.gov/thredds/dodsC/user_pub_work/E3SM/1_0/amip_1850_aeroF/1deg_atm_60-30km_ocean/atmos/180x360/time-series/mon/ens2/v3/TS_187001_189412.nc\",\n", " \"https://esgf-data2.llnl.gov/thredds/dodsC/user_pub_work/E3SM/1_0/amip_1850_aeroF/1deg_atm_60-30km_ocean/atmos/180x360/time-series/mon/ens2/v3/TS_189501_191912.nc\",\n", "]\n" ] }, { "cell_type": "code", "execution_count": 3, "id": "e027623a", "metadata": {}, "outputs": [], "source": [ "# NOTE: Opening a multi-file dataset will result in data variables to be dask\n", "# arrays.\n", "ds = xcdat.open_mfdataset(dataset_links)" ] }, { "cell_type": "code", "execution_count": 4, "id": "37392c81", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset>\n",
       "Dimensions:    (lat: 180, lon: 360, nbnd: 2, time: 600)\n",
       "Coordinates:\n",
       "  * lat        (lat) float64 -89.5 -88.5 -87.5 -86.5 ... 86.5 87.5 88.5 89.5\n",
       "  * lon        (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5\n",
       "  * time       (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00\n",
       "Dimensions without coordinates: nbnd\n",
       "Data variables:\n",
       "    lat_bnds   (lat, nbnd) float64 dask.array<chunksize=(180, 2), meta=np.ndarray>\n",
       "    lon_bnds   (lon, nbnd) float64 dask.array<chunksize=(360, 2), meta=np.ndarray>\n",
       "    gw         (lat) float64 dask.array<chunksize=(180,), meta=np.ndarray>\n",
       "    time_bnds  (time, nbnd) object dask.array<chunksize=(300, 2), meta=np.ndarray>\n",
       "    area       (lat, lon) float64 dask.array<chunksize=(180, 360), meta=np.ndarray>\n",
       "    TS         (time, lat, lon) float32 dask.array<chunksize=(300, 180, 360), meta=np.ndarray>\n",
       "Attributes: (12/21)\n",
       "    ne:                              30\n",
       "    np:                              4\n",
       "    Conventions:                     CF-1.0\n",
       "    source:                          CAM\n",
       "    case:                            20180622.DECKv1b_A2_1850aeroF.ne30_oEC.e...\n",
       "    title:                           UNSET\n",
       "    ...                              ...\n",
       "    remap_script:                    ncremap\n",
       "    remap_hostname:                  acme1\n",
       "    remap_version:                   4.9.6\n",
       "    map_file:                        /export/zender1/data/maps/map_ne30np4_to...\n",
       "    input_file:                      /p/user_pub/e3sm/baldwin32/workshop/amip...\n",
       "    DODS_EXTRA.Unlimited_Dimension:  time
" ], "text/plain": [ "\n", "Dimensions: (lat: 180, lon: 360, nbnd: 2, time: 600)\n", "Coordinates:\n", " * lat (lat) float64 -89.5 -88.5 -87.5 -86.5 ... 86.5 87.5 88.5 89.5\n", " * lon (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5\n", " * time (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00\n", "Dimensions without coordinates: nbnd\n", "Data variables:\n", " lat_bnds (lat, nbnd) float64 dask.array\n", " lon_bnds (lon, nbnd) float64 dask.array\n", " gw (lat) float64 dask.array\n", " time_bnds (time, nbnd) object dask.array\n", " area (lat, lon) float64 dask.array\n", " TS (time, lat, lon) float32 dask.array\n", "Attributes: (12/21)\n", " ne: 30\n", " np: 4\n", " Conventions: CF-1.0\n", " source: CAM\n", " case: 20180622.DECKv1b_A2_1850aeroF.ne30_oEC.e...\n", " title: UNSET\n", " ... ...\n", " remap_script: ncremap\n", " remap_hostname: acme1\n", " remap_version: 4.9.6\n", " map_file: /export/zender1/data/maps/map_ne30np4_to...\n", " input_file: /p/user_pub/e3sm/baldwin32/workshop/amip...\n", " DODS_EXTRA.Unlimited_Dimension: time" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds" ] }, { "cell_type": "markdown", "id": "600415d8", "metadata": {}, "source": [ "## Reorient the longitude axis\n", "\n", "Longitude can be represented from 0 to 360 E or as 180 W to 180 E. ``xcdat`` allows you to convert between these axes systems.\n", "\n", "* Related API: [xcdat.swap_lon_axis()](../generated/xcdat.swap_lon_axis.rst)\n", "* Alternative solution: ``xcdat.open_mfdataset(dataset_links, lon_orient=(-180, 180))``\n" ] }, { "cell_type": "code", "execution_count": 5, "id": "9f0e4c8c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'lon' (lon: 360)>\n",
       "array([  0.5,   1.5,   2.5, ..., 357.5, 358.5, 359.5])\n",
       "Coordinates:\n",
       "  * lon      (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5\n",
       "Attributes:\n",
       "    long_name:      Longitude of Grid Cell Centers\n",
       "    standard_name:  longitude\n",
       "    units:          degrees_east\n",
       "    axis:           X\n",
       "    valid_min:      0.0\n",
       "    valid_max:      360.0\n",
       "    bounds:         lon_bnds
" ], "text/plain": [ "\n", "array([ 0.5, 1.5, 2.5, ..., 357.5, 358.5, 359.5])\n", "Coordinates:\n", " * lon (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5\n", "Attributes:\n", " long_name: Longitude of Grid Cell Centers\n", " standard_name: longitude\n", " units: degrees_east\n", " axis: X\n", " valid_min: 0.0\n", " valid_max: 360.0\n", " bounds: lon_bnds" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds.lon" ] }, { "cell_type": "code", "execution_count": 6, "id": "178b80f6", "metadata": {}, "outputs": [], "source": [ "ds2 = xcdat.swap_lon_axis(ds, to=(-180,180))" ] }, { "cell_type": "code", "execution_count": 7, "id": "6c39a178", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'lon' (lon: 360)>\n",
       "array([-179.5, -178.5, -177.5, ...,  177.5,  178.5,  179.5])\n",
       "Coordinates:\n",
       "  * lon      (lon) float64 -179.5 -178.5 -177.5 -176.5 ... 177.5 178.5 179.5\n",
       "Attributes:\n",
       "    long_name:      Longitude of Grid Cell Centers\n",
       "    standard_name:  longitude\n",
       "    units:          degrees_east\n",
       "    axis:           X\n",
       "    valid_min:      0.0\n",
       "    valid_max:      360.0\n",
       "    bounds:         lon_bnds
" ], "text/plain": [ "\n", "array([-179.5, -178.5, -177.5, ..., 177.5, 178.5, 179.5])\n", "Coordinates:\n", " * lon (lon) float64 -179.5 -178.5 -177.5 -176.5 ... 177.5 178.5 179.5\n", "Attributes:\n", " long_name: Longitude of Grid Cell Centers\n", " standard_name: longitude\n", " units: degrees_east\n", " axis: X\n", " valid_min: 0.0\n", " valid_max: 360.0\n", " bounds: lon_bnds" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds2.lon" ] }, { "cell_type": "markdown", "id": "2deb2d28", "metadata": {}, "source": [ "## Center the time coordinates\n", "\n", "A given point of time often represents some time period (e.g., a monthly average). In this situation, data providers sometimes record the time as the beginning, middle, or end of the period. ``center_times()`` places the time coordinate in the center of the time interval (using time bounds to determine the center of the period).\n", "\n", "* Related API: [xcdat.center_times()](../generated/xcdat.center_times.rst)\n", "* Alternative solution: ``xcdat.open_mfdataset(dataset_links, center_times=True)``" ] }, { "cell_type": "markdown", "id": "b6ae5740", "metadata": {}, "source": [ "The time bounds used for centering time coordinates:" ] }, { "cell_type": "code", "execution_count": 8, "id": "2ca935db", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[cftime.DatetimeNoLeap(1870, 1, 1, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1870, 2, 1, 0, 0, 0, 0, has_year_zero=True)],\n", " [cftime.DatetimeNoLeap(1870, 2, 1, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1870, 3, 1, 0, 0, 0, 0, has_year_zero=True)],\n", " [cftime.DatetimeNoLeap(1870, 3, 1, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1870, 4, 1, 0, 0, 0, 0, has_year_zero=True)],\n", " ...,\n", " [cftime.DatetimeNoLeap(1919, 10, 1, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1919, 11, 1, 0, 0, 0, 0, has_year_zero=True)],\n", " [cftime.DatetimeNoLeap(1919, 11, 1, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1919, 12, 1, 0, 0, 0, 0, has_year_zero=True)],\n", " [cftime.DatetimeNoLeap(1919, 12, 1, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1920, 1, 1, 0, 0, 0, 0, has_year_zero=True)]],\n", " dtype=object)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# We access the values with .values because it is a dask array.\n", "ds.time_bnds.values" ] }, { "cell_type": "markdown", "id": "bcf8a44d", "metadata": {}, "source": [ "Before centering time coordinates:" ] }, { "cell_type": "code", "execution_count": 9, "id": "db10a5e0", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'time' (time: 600)>\n",
       "array([cftime.DatetimeNoLeap(1870, 2, 1, 0, 0, 0, 0, has_year_zero=True),\n",
       "       cftime.DatetimeNoLeap(1870, 3, 1, 0, 0, 0, 0, has_year_zero=True),\n",
       "       cftime.DatetimeNoLeap(1870, 4, 1, 0, 0, 0, 0, has_year_zero=True), ...,\n",
       "       cftime.DatetimeNoLeap(1919, 11, 1, 0, 0, 0, 0, has_year_zero=True),\n",
       "       cftime.DatetimeNoLeap(1919, 12, 1, 0, 0, 0, 0, has_year_zero=True),\n",
       "       cftime.DatetimeNoLeap(1920, 1, 1, 0, 0, 0, 0, has_year_zero=True)],\n",
       "      dtype=object)\n",
       "Coordinates:\n",
       "  * time     (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00\n",
       "Attributes:\n",
       "    long_name:     time\n",
       "    bounds:        time_bnds\n",
       "    cell_methods:  time: mean
" ], "text/plain": [ "\n", "array([cftime.DatetimeNoLeap(1870, 2, 1, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1870, 3, 1, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1870, 4, 1, 0, 0, 0, 0, has_year_zero=True), ...,\n", " cftime.DatetimeNoLeap(1919, 11, 1, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1919, 12, 1, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1920, 1, 1, 0, 0, 0, 0, has_year_zero=True)],\n", " dtype=object)\n", "Coordinates:\n", " * time (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00\n", "Attributes:\n", " long_name: time\n", " bounds: time_bnds\n", " cell_methods: time: mean" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds.time" ] }, { "cell_type": "code", "execution_count": 10, "id": "700783d7", "metadata": {}, "outputs": [], "source": [ "ds3 = xcdat.center_times(ds)" ] }, { "cell_type": "markdown", "id": "66db7abf", "metadata": {}, "source": [ "After centering time coordinates:" ] }, { "cell_type": "code", "execution_count": 11, "id": "9a0a0843", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'time' (time: 600)>\n",
       "array([cftime.DatetimeNoLeap(1870, 1, 16, 12, 0, 0, 0, has_year_zero=True),\n",
       "       cftime.DatetimeNoLeap(1870, 2, 15, 0, 0, 0, 0, has_year_zero=True),\n",
       "       cftime.DatetimeNoLeap(1870, 3, 16, 12, 0, 0, 0, has_year_zero=True),\n",
       "       ...,\n",
       "       cftime.DatetimeNoLeap(1919, 10, 16, 12, 0, 0, 0, has_year_zero=True),\n",
       "       cftime.DatetimeNoLeap(1919, 11, 16, 0, 0, 0, 0, has_year_zero=True),\n",
       "       cftime.DatetimeNoLeap(1919, 12, 16, 12, 0, 0, 0, has_year_zero=True)],\n",
       "      dtype=object)\n",
       "Coordinates:\n",
       "  * time     (time) object 1870-01-16 12:00:00 ... 1919-12-16 12:00:00\n",
       "Attributes:\n",
       "    long_name:     time\n",
       "    bounds:        time_bnds\n",
       "    cell_methods:  time: mean
" ], "text/plain": [ "\n", "array([cftime.DatetimeNoLeap(1870, 1, 16, 12, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1870, 2, 15, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1870, 3, 16, 12, 0, 0, 0, has_year_zero=True),\n", " ...,\n", " cftime.DatetimeNoLeap(1919, 10, 16, 12, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1919, 11, 16, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1919, 12, 16, 12, 0, 0, 0, has_year_zero=True)],\n", " dtype=object)\n", "Coordinates:\n", " * time (time) object 1870-01-16 12:00:00 ... 1919-12-16 12:00:00\n", "Attributes:\n", " long_name: time\n", " bounds: time_bnds\n", " cell_methods: time: mean" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds3.time" ] }, { "cell_type": "markdown", "id": "8a261070", "metadata": {}, "source": [ "## Add bounds\n", "\n", "Bounds are critical to many ``xcdat`` operations. For example, they are used in determining the weights in spatial or temporal averages and in regridding operations. ``add_bounds()`` will attempt to produce bounds if they do not exist in the original dataset.\n", "\n", "* Related API: [xarray.Dataset.bounds.add_bounds()](../generated/xarray.Dataset.bounds.add_bounds.rst)\n", "* Alternative solution: ``xcdat.open_mfdataset(dataset_links, add_bounds=True)``\n", " * (Assuming the file doesn't already have bounds for your desired axis/axes)" ] }, { "cell_type": "code", "execution_count": 12, "id": "59bf9f86", "metadata": {}, "outputs": [], "source": [ "# We are dropping the existing bounds to demonstrate adding bounds.\n", "ds4 = ds.drop_vars(\"time_bnds\")" ] }, { "cell_type": "code", "execution_count": 13, "id": "fb2b09dd", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\"Bounds were not found for the coordinate variable 'time'. They must be added (Dataset.bounds.add_bounds).\"\n" ] } ], "source": [ "try:\n", " ds4.bounds.get_bounds(\"T\")\n", "except KeyError as e:\n", " print(e)" ] }, { "cell_type": "code", "execution_count": 14, "id": "c4b55a89", "metadata": {}, "outputs": [], "source": [ "# A `width` kwarg can be specified, which is width of the bounds relative to \n", "# the position of the nearest points. The default value is 0.5.\n", "ds4 = ds4.bounds.add_bounds(\"T\", width=0.5)" ] }, { "cell_type": "code", "execution_count": 15, "id": "0b38ef37", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'time_bnds' (time: 600, bnds: 2)>\n",
       "array([[cftime.DatetimeNoLeap(1870, 1, 18, 0, 0, 0, 0, has_year_zero=True),\n",
       "        cftime.DatetimeNoLeap(1870, 2, 15, 0, 0, 0, 0, has_year_zero=True)],\n",
       "       [cftime.DatetimeNoLeap(1870, 2, 15, 0, 0, 0, 0, has_year_zero=True),\n",
       "        cftime.DatetimeNoLeap(1870, 3, 16, 12, 0, 0, 0, has_year_zero=True)],\n",
       "       [cftime.DatetimeNoLeap(1870, 3, 16, 12, 0, 0, 0, has_year_zero=True),\n",
       "        cftime.DatetimeNoLeap(1870, 4, 16, 0, 0, 0, 0, has_year_zero=True)],\n",
       "       ...,\n",
       "       [cftime.DatetimeNoLeap(1919, 10, 16, 12, 0, 0, 0, has_year_zero=True),\n",
       "        cftime.DatetimeNoLeap(1919, 11, 16, 0, 0, 0, 0, has_year_zero=True)],\n",
       "       [cftime.DatetimeNoLeap(1919, 11, 16, 0, 0, 0, 0, has_year_zero=True),\n",
       "        cftime.DatetimeNoLeap(1919, 12, 16, 12, 0, 0, 0, has_year_zero=True)],\n",
       "       [cftime.DatetimeNoLeap(1919, 12, 16, 12, 0, 0, 0, has_year_zero=True),\n",
       "        cftime.DatetimeNoLeap(1920, 1, 16, 12, 0, 0, 0, has_year_zero=True)]],\n",
       "      dtype=object)\n",
       "Coordinates:\n",
       "  * time     (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00\n",
       "Dimensions without coordinates: bnds\n",
       "Attributes:\n",
       "    xcdat_bounds:  True
" ], "text/plain": [ "\n", "array([[cftime.DatetimeNoLeap(1870, 1, 18, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1870, 2, 15, 0, 0, 0, 0, has_year_zero=True)],\n", " [cftime.DatetimeNoLeap(1870, 2, 15, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1870, 3, 16, 12, 0, 0, 0, has_year_zero=True)],\n", " [cftime.DatetimeNoLeap(1870, 3, 16, 12, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1870, 4, 16, 0, 0, 0, 0, has_year_zero=True)],\n", " ...,\n", " [cftime.DatetimeNoLeap(1919, 10, 16, 12, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1919, 11, 16, 0, 0, 0, 0, has_year_zero=True)],\n", " [cftime.DatetimeNoLeap(1919, 11, 16, 0, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1919, 12, 16, 12, 0, 0, 0, has_year_zero=True)],\n", " [cftime.DatetimeNoLeap(1919, 12, 16, 12, 0, 0, 0, has_year_zero=True),\n", " cftime.DatetimeNoLeap(1920, 1, 16, 12, 0, 0, 0, has_year_zero=True)]],\n", " dtype=object)\n", "Coordinates:\n", " * time (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00\n", "Dimensions without coordinates: bnds\n", "Attributes:\n", " xcdat_bounds: True" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds4.bounds.get_bounds(\"T\")" ] }, { "cell_type": "markdown", "id": "9c863354", "metadata": {}, "source": [ "## Add missing bounds for all axes supported by xcdat (X, Y, T, Z)\n", "\n", "* Related API: [xarray.Dataset.bounds.add_missing_bounds()](../generated/xarray.Dataset.bounds.add_missing_bounds.rst)" ] }, { "cell_type": "code", "execution_count": 16, "id": "c386886e", "metadata": {}, "outputs": [], "source": [ "# We drop the dataset axes bounds to demonstrate generating missing bounds.\n", "ds5 = ds.drop_vars([\"time_bnds\", \"lat_bnds\", \"lon_bnds\"])" ] }, { "cell_type": "code", "execution_count": 17, "id": "8c8114a5", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset>\n",
       "Dimensions:  (lat: 180, lon: 360, time: 600)\n",
       "Coordinates:\n",
       "  * lat      (lat) float64 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5\n",
       "  * lon      (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5\n",
       "  * time     (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00\n",
       "Data variables:\n",
       "    gw       (lat) float64 dask.array<chunksize=(180,), meta=np.ndarray>\n",
       "    area     (lat, lon) float64 dask.array<chunksize=(180, 360), meta=np.ndarray>\n",
       "    TS       (time, lat, lon) float32 dask.array<chunksize=(300, 180, 360), meta=np.ndarray>\n",
       "Attributes: (12/21)\n",
       "    ne:                              30\n",
       "    np:                              4\n",
       "    Conventions:                     CF-1.0\n",
       "    source:                          CAM\n",
       "    case:                            20180622.DECKv1b_A2_1850aeroF.ne30_oEC.e...\n",
       "    title:                           UNSET\n",
       "    ...                              ...\n",
       "    remap_script:                    ncremap\n",
       "    remap_hostname:                  acme1\n",
       "    remap_version:                   4.9.6\n",
       "    map_file:                        /export/zender1/data/maps/map_ne30np4_to...\n",
       "    input_file:                      /p/user_pub/e3sm/baldwin32/workshop/amip...\n",
       "    DODS_EXTRA.Unlimited_Dimension:  time
" ], "text/plain": [ "\n", "Dimensions: (lat: 180, lon: 360, time: 600)\n", "Coordinates:\n", " * lat (lat) float64 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5\n", " * lon (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5\n", " * time (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00\n", "Data variables:\n", " gw (lat) float64 dask.array\n", " area (lat, lon) float64 dask.array\n", " TS (time, lat, lon) float32 dask.array\n", "Attributes: (12/21)\n", " ne: 30\n", " np: 4\n", " Conventions: CF-1.0\n", " source: CAM\n", " case: 20180622.DECKv1b_A2_1850aeroF.ne30_oEC.e...\n", " title: UNSET\n", " ... ...\n", " remap_script: ncremap\n", " remap_hostname: acme1\n", " remap_version: 4.9.6\n", " map_file: /export/zender1/data/maps/map_ne30np4_to...\n", " input_file: /p/user_pub/e3sm/baldwin32/workshop/amip...\n", " DODS_EXTRA.Unlimited_Dimension: time" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds5" ] }, { "cell_type": "code", "execution_count": 18, "id": "229a1e4a", "metadata": {}, "outputs": [], "source": [ "ds5 = ds5.bounds.add_missing_bounds(width=0.5)" ] }, { "cell_type": "code", "execution_count": 19, "id": "9789b2c4", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset>\n",
       "Dimensions:    (lat: 180, lon: 360, time: 600, bnds: 2)\n",
       "Coordinates:\n",
       "  * lat        (lat) float64 -89.5 -88.5 -87.5 -86.5 ... 86.5 87.5 88.5 89.5\n",
       "  * lon        (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5\n",
       "  * time       (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00\n",
       "Dimensions without coordinates: bnds\n",
       "Data variables:\n",
       "    gw         (lat) float64 dask.array<chunksize=(180,), meta=np.ndarray>\n",
       "    area       (lat, lon) float64 dask.array<chunksize=(180, 360), meta=np.ndarray>\n",
       "    TS         (time, lat, lon) float32 dask.array<chunksize=(300, 180, 360), meta=np.ndarray>\n",
       "    lon_bnds   (lon, bnds) float64 0.0 1.0 1.0 2.0 ... 358.0 359.0 359.0 360.0\n",
       "    lat_bnds   (lat, bnds) float64 -90.0 -89.0 -89.0 -88.0 ... 89.0 89.0 90.0\n",
       "    time_bnds  (time, bnds) object 1870-01-18 00:00:00 ... 1920-01-16 12:00:00\n",
       "Attributes: (12/21)\n",
       "    ne:                              30\n",
       "    np:                              4\n",
       "    Conventions:                     CF-1.0\n",
       "    source:                          CAM\n",
       "    case:                            20180622.DECKv1b_A2_1850aeroF.ne30_oEC.e...\n",
       "    title:                           UNSET\n",
       "    ...                              ...\n",
       "    remap_script:                    ncremap\n",
       "    remap_hostname:                  acme1\n",
       "    remap_version:                   4.9.6\n",
       "    map_file:                        /export/zender1/data/maps/map_ne30np4_to...\n",
       "    input_file:                      /p/user_pub/e3sm/baldwin32/workshop/amip...\n",
       "    DODS_EXTRA.Unlimited_Dimension:  time
" ], "text/plain": [ "\n", "Dimensions: (lat: 180, lon: 360, time: 600, bnds: 2)\n", "Coordinates:\n", " * lat (lat) float64 -89.5 -88.5 -87.5 -86.5 ... 86.5 87.5 88.5 89.5\n", " * lon (lon) float64 0.5 1.5 2.5 3.5 4.5 ... 356.5 357.5 358.5 359.5\n", " * time (time) object 1870-02-01 00:00:00 ... 1920-01-01 00:00:00\n", "Dimensions without coordinates: bnds\n", "Data variables:\n", " gw (lat) float64 dask.array\n", " area (lat, lon) float64 dask.array\n", " TS (time, lat, lon) float32 dask.array\n", " lon_bnds (lon, bnds) float64 0.0 1.0 1.0 2.0 ... 358.0 359.0 359.0 360.0\n", " lat_bnds (lat, bnds) float64 -90.0 -89.0 -89.0 -88.0 ... 89.0 89.0 90.0\n", " time_bnds (time, bnds) object 1870-01-18 00:00:00 ... 1920-01-16 12:00:00\n", "Attributes: (12/21)\n", " ne: 30\n", " np: 4\n", " Conventions: CF-1.0\n", " source: CAM\n", " case: 20180622.DECKv1b_A2_1850aeroF.ne30_oEC.e...\n", " title: UNSET\n", " ... ...\n", " remap_script: ncremap\n", " remap_hostname: acme1\n", " remap_version: 4.9.6\n", " map_file: /export/zender1/data/maps/map_ne30np4_to...\n", " input_file: /p/user_pub/e3sm/baldwin32/workshop/amip...\n", " DODS_EXTRA.Unlimited_Dimension: time" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds5" ] }, { "attachments": {}, "cell_type": "markdown", "id": "2ab105f4", "metadata": {}, "source": [ "## Get the dimension coordinates for an axis.\n", "\n", "In `xarray`, you can get a dimension coordinates by directly referencing its name (e.g., `ds.lat`). `xcdat` provides an alternative way to get dimension coordinates agnostically by simply passing the CF axis key to applicable APIs.\n", "\n", "* Related API: [xcdat.get_dim_coords()](../generated/xcdat.get_dim_coords.rst)\n", "\n", "Helpful knowledge:\n", "\n", "* This API uses ``cf_xarray`` to interpret CF axis names and coordinate names in the xarray object attributes. Refer to [Metadata Interpretation](../faqs.rst) for more information.\n", "\n", "Xarray documentation on coordinates ([source](https://docs.xarray.dev/en/stable/user-guide/data-structures.html#coordinates)):\n", "\n", "* There are two types of coordinates in xarray:\n", "\n", " * **dimension coordinates** are one dimensional coordinates with a name equal to their sole dimension (marked by * when printing a dataset or data array). They are used for label based indexing and alignment, like the index found on a pandas DataFrame or Series. Indeed, these “dimension” coordinates use a pandas.Index internally to store their values.\n", "\n", " * **non-dimension coordinates** are variables that contain coordinate data, but are not a dimension coordinate. They can be multidimensional (see Working with Multidimensional Coordinates), and there is no relationship between the name of a non-dimension coordinate and the name(s) of its dimension(s). Non-dimension coordinates can be useful for indexing or plotting; otherwise, xarray does not make any direct use of the values associated with them. They are not used for alignment or automatic indexing, nor are they required to match when doing arithmetic (see Coordinates).\n", "\n", "* Xarray’s terminology differs from the [CF terminology](https://cfconventions.org/cf-conventions/v1.6.0/cf-conventions.html#terminology), where the “dimension coordinates” are called “coordinate variables”, and the “non-dimension coordinates” are called “auxiliary coordinate variables” (see [GH1295](https://github.com/pydata/xarray/issues/1295) for more details).\n", "\n", "\n", "\n" ] }, { "attachments": {}, "cell_type": "markdown", "id": "15b5441d", "metadata": {}, "source": [ "\n", "### 1. `axis` attr" ] }, { "cell_type": "code", "execution_count": 20, "id": "5255323c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Y'" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds.lat.attrs[\"axis\"]" ] }, { "cell_type": "markdown", "id": "08505d5c", "metadata": {}, "source": [ "### 2. `standard_name` attr" ] }, { "cell_type": "code", "execution_count": 21, "id": "bf3d99a1", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'latitude'" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds.lat.attrs[\"standard_name\"]" ] }, { "cell_type": "code", "execution_count": 22, "id": "06fd086a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"lat\" in ds.dims" ] }, { "cell_type": "code", "execution_count": 24, "id": "5d485fa1", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.DataArray 'lat' (lat: 180)>\n",
       "array([-89.5, -88.5, -87.5, -86.5, -85.5, -84.5, -83.5, -82.5, -81.5, -80.5,\n",
       "       -79.5, -78.5, -77.5, -76.5, -75.5, -74.5, -73.5, -72.5, -71.5, -70.5,\n",
       "       -69.5, -68.5, -67.5, -66.5, -65.5, -64.5, -63.5, -62.5, -61.5, -60.5,\n",
       "       -59.5, -58.5, -57.5, -56.5, -55.5, -54.5, -53.5, -52.5, -51.5, -50.5,\n",
       "       -49.5, -48.5, -47.5, -46.5, -45.5, -44.5, -43.5, -42.5, -41.5, -40.5,\n",
       "       -39.5, -38.5, -37.5, -36.5, -35.5, -34.5, -33.5, -32.5, -31.5, -30.5,\n",
       "       -29.5, -28.5, -27.5, -26.5, -25.5, -24.5, -23.5, -22.5, -21.5, -20.5,\n",
       "       -19.5, -18.5, -17.5, -16.5, -15.5, -14.5, -13.5, -12.5, -11.5, -10.5,\n",
       "        -9.5,  -8.5,  -7.5,  -6.5,  -5.5,  -4.5,  -3.5,  -2.5,  -1.5,  -0.5,\n",
       "         0.5,   1.5,   2.5,   3.5,   4.5,   5.5,   6.5,   7.5,   8.5,   9.5,\n",
       "        10.5,  11.5,  12.5,  13.5,  14.5,  15.5,  16.5,  17.5,  18.5,  19.5,\n",
       "        20.5,  21.5,  22.5,  23.5,  24.5,  25.5,  26.5,  27.5,  28.5,  29.5,\n",
       "        30.5,  31.5,  32.5,  33.5,  34.5,  35.5,  36.5,  37.5,  38.5,  39.5,\n",
       "        40.5,  41.5,  42.5,  43.5,  44.5,  45.5,  46.5,  47.5,  48.5,  49.5,\n",
       "        50.5,  51.5,  52.5,  53.5,  54.5,  55.5,  56.5,  57.5,  58.5,  59.5,\n",
       "        60.5,  61.5,  62.5,  63.5,  64.5,  65.5,  66.5,  67.5,  68.5,  69.5,\n",
       "        70.5,  71.5,  72.5,  73.5,  74.5,  75.5,  76.5,  77.5,  78.5,  79.5,\n",
       "        80.5,  81.5,  82.5,  83.5,  84.5,  85.5,  86.5,  87.5,  88.5,  89.5])\n",
       "Coordinates:\n",
       "  * lat      (lat) float64 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5\n",
       "Attributes:\n",
       "    long_name:      Latitude of Grid Cell Centers\n",
       "    standard_name:  latitude\n",
       "    units:          degrees_north\n",
       "    axis:           Y\n",
       "    valid_min:      -90.0\n",
       "    valid_max:      90.0\n",
       "    bounds:         lat_bnds
" ], "text/plain": [ "\n", "array([-89.5, -88.5, -87.5, -86.5, -85.5, -84.5, -83.5, -82.5, -81.5, -80.5,\n", " -79.5, -78.5, -77.5, -76.5, -75.5, -74.5, -73.5, -72.5, -71.5, -70.5,\n", " -69.5, -68.5, -67.5, -66.5, -65.5, -64.5, -63.5, -62.5, -61.5, -60.5,\n", " -59.5, -58.5, -57.5, -56.5, -55.5, -54.5, -53.5, -52.5, -51.5, -50.5,\n", " -49.5, -48.5, -47.5, -46.5, -45.5, -44.5, -43.5, -42.5, -41.5, -40.5,\n", " -39.5, -38.5, -37.5, -36.5, -35.5, -34.5, -33.5, -32.5, -31.5, -30.5,\n", " -29.5, -28.5, -27.5, -26.5, -25.5, -24.5, -23.5, -22.5, -21.5, -20.5,\n", " -19.5, -18.5, -17.5, -16.5, -15.5, -14.5, -13.5, -12.5, -11.5, -10.5,\n", " -9.5, -8.5, -7.5, -6.5, -5.5, -4.5, -3.5, -2.5, -1.5, -0.5,\n", " 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5,\n", " 10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5, 19.5,\n", " 20.5, 21.5, 22.5, 23.5, 24.5, 25.5, 26.5, 27.5, 28.5, 29.5,\n", " 30.5, 31.5, 32.5, 33.5, 34.5, 35.5, 36.5, 37.5, 38.5, 39.5,\n", " 40.5, 41.5, 42.5, 43.5, 44.5, 45.5, 46.5, 47.5, 48.5, 49.5,\n", " 50.5, 51.5, 52.5, 53.5, 54.5, 55.5, 56.5, 57.5, 58.5, 59.5,\n", " 60.5, 61.5, 62.5, 63.5, 64.5, 65.5, 66.5, 67.5, 68.5, 69.5,\n", " 70.5, 71.5, 72.5, 73.5, 74.5, 75.5, 76.5, 77.5, 78.5, 79.5,\n", " 80.5, 81.5, 82.5, 83.5, 84.5, 85.5, 86.5, 87.5, 88.5, 89.5])\n", "Coordinates:\n", " * lat (lat) float64 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5\n", "Attributes:\n", " long_name: Latitude of Grid Cell Centers\n", " standard_name: latitude\n", " units: degrees_north\n", " axis: Y\n", " valid_min: -90.0\n", " valid_max: 90.0\n", " bounds: lat_bnds" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xcdat.get_axis_coord(ds, axis=\"Y\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.12 ('xcdat_dev')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" }, "vscode": { "interpreter": { "hash": "937205ea97caa5d37934716e0a0c6b9e4ffcdaf6e0f20ca1ff82361fb532cef2" } } }, "nbformat": 4, "nbformat_minor": 5 }