{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Calculate Time Averages from Time Series Data\n", "=============================================\n", "\n", "Author: [Tom Vo](https://github.com/tomvothecoder/)\n", "\n", "Date: 05/27/22\n", "\n", "Last Edited: 08/17/22 (v0.3.1)\n", "\n", "Related APIs:\n", "\n", "* [xarray.Dataset.temporal.average()](../generated/xarray.Dataset.temporal.average.rst)\n", "* [xarray.Dataset.temporal.group_average()](../generated/xarray.Dataset.temporal.group_average.rst)\n", "\n", "The data used in this example can be found through the [Earth System Grid Federation (ESGF) search portal](https://aims2.llnl.gov/metagrid/search)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "\n", "Suppose we have netCDF4 files for air temperature data (`tas`) with monthly, daily, and 3hr frequencies.\n", "\n", "We want to calculate averages using these files with the time dimension removed (a single time snapshot), and averages by time group (yearly, seasonal, and daily)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2018-11-28T20:51:35.958210Z", "start_time": "2018-11-28T20:51:35.936966Z" } }, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import xcdat\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Calculate averages with the time dimension removed (single snapshot)\n", "\n", "Related API: [xarray.Dataset.temporal.average()](../generated/xarray.Dataset.temporal.average.rst)\n", "\n", "Helpful knowledge:\n", "\n", "* The frequency for the time interval is inferred before calculating weights.\n", " * The frequency is inferred by calculating the minimum delta between time coordinates and using the conditional logic below. 
This frequency is used to calculate weights.\n", "\n", "    ```python\n", "    if min_delta < pd.Timedelta(days=1):\n", "        return \"hour\"\n", "    elif min_delta >= pd.Timedelta(days=1) and min_delta < pd.Timedelta(days=28):\n", "        return \"day\"\n", "    elif min_delta >= pd.Timedelta(days=28) and min_delta < pd.Timedelta(days=365):\n", "        return \"month\"\n", "    else:\n", "        return \"year\"\n", "    ```\n", "* Masked (missing) data is automatically handled.\n", "  * Masked (missing) data points are excluded when averages are calculated, which is equivalent to giving them a weight of 0." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Open the ``Dataset``\n", "\n", "In this example, we calculate the time-weighted average with the time dimension removed (single snapshot) for monthly `tas` data.\n", "\n", "We use xarray's OPeNDAP support to read a netCDF4 dataset file directly from its source. The data is not loaded over the network until we perform operations on it (e.g., temperature unit adjustment).\n", "\n", "*More information on xarray's OPeNDAP support can be found [here](https://docs.xarray.dev/en/stable/user-guide/io.html#opendap).*" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset>\n",
"Dimensions: (time: 1980, bnds: 2, lat: 145, lon: 192)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00\n",
" * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0\n",
" * lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1\n",
" height float64 2.0\n",
"Dimensions without coordinates: bnds\n",
"Data variables:\n",
" time_bnds (time, bnds) datetime64[ns] 1850-01-01 1850-02-01 ... 2015-01-01\n",
" lat_bnds (lat, bnds) float64 -90.0 -89.38 -89.38 ... 89.38 89.38 90.0\n",
" lon_bnds (lon, bnds) float64 -0.9375 0.9375 0.9375 ... 357.2 357.2 359.1\n",
" tas (time, lat, lon) float32 -27.19 -27.19 -27.19 ... -25.29 -25.29\n",
"Attributes: (12/49)\n",
" Conventions: CF-1.7 CMIP-6.2\n",
" activity_id: CMIP\n",
" branch_method: standard\n",
" branch_time_in_child: 0.0\n",
" branch_time_in_parent: 87658.0\n",
" creation_date: 2020-06-05T04:06:11Z\n",
" ... ...\n",
" version: v20200605\n",
" license: CMIP6 model data produced by CSIRO is li...\n",
" cmor_version: 3.4.0\n",
" _NCProperties: version=2,netcdf=4.6.2,hdf5=1.10.5\n",
" tracking_id: hdl:21.14100/af78ae5e-f3a6-4e99-8cfe-5f2...\n",
    DODS_EXTRA.Unlimited_Dimension: time\n",
"<xarray.DataArray 'tas' (lat: 145, lon: 192)>\n",
"array([[-48.01481628, -48.01481628, -48.01481628, ..., -48.01481628,\n",
" -48.01481628, -48.01481628],\n",
" [-44.94085363, -44.97948214, -45.01815398, ..., -44.82408252,\n",
" -44.86273067, -44.9009281 ],\n",
" [-44.11875274, -44.23060624, -44.33960158, ..., -43.76766492,\n",
" -43.88593717, -44.00303006],\n",
" ...,\n",
" [-18.21076615, -18.17513373, -18.13957458, ..., -18.32720478,\n",
" -18.28428828, -18.2486193 ],\n",
" [-18.50778243, -18.49301854, -18.47902819, ..., -18.55410851,\n",
" -18.5406963 , -18.52413098],\n",
" [-19.07366375, -19.07366375, -19.07366375, ..., -19.07366375,\n",
" -19.07366375, -19.07366375]])\n",
"Coordinates:\n",
" * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0\n",
" * lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1\n",
" height float64 2.0\n",
"Attributes:\n",
" operation: temporal_avg\n",
" mode: average\n",
" freq: month\n",
    weighted: True\n",
"<xarray.Dataset>\n",
"Dimensions: (time: 1980, bnds: 2, lat: 145, lon: 192)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00\n",
" * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0\n",
" * lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1\n",
" height float64 2.0\n",
"Dimensions without coordinates: bnds\n",
"Data variables:\n",
" time_bnds (time, bnds) datetime64[ns] 1850-01-01 1850-02-01 ... 2015-01-01\n",
" lat_bnds (lat, bnds) float64 -90.0 -89.38 -89.38 ... 89.38 89.38 90.0\n",
" lon_bnds (lon, bnds) float64 -0.9375 0.9375 0.9375 ... 357.2 357.2 359.1\n",
" tas (time, lat, lon) float32 -27.19 -27.19 -27.19 ... -25.29 -25.29\n",
"Attributes: (12/49)\n",
" Conventions: CF-1.7 CMIP-6.2\n",
" activity_id: CMIP\n",
" branch_method: standard\n",
" branch_time_in_child: 0.0\n",
" branch_time_in_parent: 87658.0\n",
" creation_date: 2020-06-05T04:06:11Z\n",
" ... ...\n",
" version: v20200605\n",
" license: CMIP6 model data produced by CSIRO is li...\n",
" cmor_version: 3.4.0\n",
" _NCProperties: version=2,netcdf=4.6.2,hdf5=1.10.5\n",
" tracking_id: hdl:21.14100/af78ae5e-f3a6-4e99-8cfe-5f2...\n",
    DODS_EXTRA.Unlimited_Dimension: time\n",
"<xarray.DataArray 'tas' (time: 165, lat: 145, lon: 192)>\n",
"array([[[-48.755733, -48.755733, -48.755733, ..., -48.755733,\n",
" -48.755733, -48.755733],\n",
" [-45.652065, -45.693024, -45.73506 , ..., -45.52128 ,\n",
" -45.563866, -45.60669 ],\n",
" [-44.775234, -44.905838, -45.03297 , ..., -44.37118 ,\n",
" -44.50631 , -44.640503],\n",
" ...,\n",
" [-20.505976, -20.481321, -20.454565, ..., -20.588959,\n",
" -20.557522, -20.530872],\n",
" [-20.797592, -20.784252, -20.775455, ..., -20.83268 ,\n",
" -20.823357, -20.807684],\n",
" [-21.201149, -21.201149, -21.201149, ..., -21.201149,\n",
" -21.201149, -21.201149]],\n",
"\n",
" [[-48.95255 , -48.95255 , -48.95255 , ..., -48.95255 ,\n",
" -48.95255 , -48.95255 ],\n",
" [-45.83191 , -45.864902, -45.89875 , ..., -45.73217 ,\n",
" -45.76544 , -45.798595],\n",
" [-44.935368, -45.037956, -45.13801 , ..., -44.61143 ,\n",
" -44.71986 , -44.829372],\n",
"...\n",
" [-14.916271, -14.899261, -14.88381 , ..., -14.99543 ,\n",
" -14.965137, -14.938532],\n",
" [-15.405922, -15.396681, -15.385955, ..., -15.432463,\n",
" -15.426056, -15.413568],\n",
" [-15.945 , -15.945 , -15.945 , ..., -15.945 ,\n",
" -15.945 , -15.945 ]],\n",
"\n",
" [[-47.59732 , -47.59732 , -47.59732 , ..., -47.59732 ,\n",
" -47.59732 , -47.59732 ],\n",
" [-44.721367, -44.763428, -44.803505, ..., -44.592392,\n",
" -44.634445, -44.678226],\n",
" [-43.85032 , -43.969563, -44.08714 , ..., -43.4709 ,\n",
" -43.596764, -43.72408 ],\n",
" ...,\n",
" [-14.52023 , -14.474079, -14.432307, ..., -14.675514,\n",
" -14.620932, -14.567368],\n",
" [-14.911236, -14.892309, -14.869016, ..., -14.982012,\n",
" -14.962668, -14.938723],\n",
" [-15.618406, -15.618406, -15.618406, ..., -15.618406,\n",
" -15.618406, -15.618406]]], dtype=float32)\n",
"Coordinates:\n",
" * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0\n",
" * lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1\n",
" height float64 2.0\n",
" * time (time) object 1850-01-01 00:00:00 ... 2014-01-01 00:00:00\n",
"Attributes:\n",
" operation: temporal_avg\n",
" mode: group_average\n",
" freq: year\n",
    weighted: True\n",
"<xarray.DataArray 'tas' (time: 661, lat: 145, lon: 192)>\n",
"array([[[-32.705883 , -32.705883 , -32.705883 , ..., -32.705883 ,\n",
" -32.705883 , -32.705883 ],\n",
" [-30.993767 , -31.037586 , -31.089327 , ..., -30.845623 ,\n",
" -30.894127 , -30.94401 ],\n",
" [-30.02515 , -30.145437 , -30.26419 , ..., -29.660372 ,\n",
" -29.78108 , -29.902878 ],\n",
" ...,\n",
" [-37.72314 , -37.685493 , -37.654167 , ..., -37.8262 ,\n",
" -37.790344 , -37.75683 ],\n",
" [-38.274647 , -38.263725 , -38.250145 , ..., -38.292183 ,\n",
" -38.290638 , -38.28456 ],\n",
" [-38.743587 , -38.743587 , -38.743587 , ..., -38.743587 ,\n",
" -38.743587 , -38.743587 ]],\n",
"\n",
" [[-54.290863 , -54.290863 , -54.290863 , ..., -54.290863 ,\n",
" -54.290863 , -54.290863 ],\n",
" [-51.117714 , -51.175236 , -51.230553 , ..., -50.935165 ,\n",
" -50.99657 , -51.056145 ],\n",
" [-50.318047 , -50.486664 , -50.649567 , ..., -49.79003 ,\n",
" -49.970078 , -50.14521 ],\n",
"...\n",
" [-12.342774 , -12.2246685 , -12.106632 , ..., -12.744922 ,\n",
" -12.609088 , -12.478392 ],\n",
" [-13.126404 , -13.066109 , -13.003876 , ..., -13.306077 ,\n",
" -13.258715 , -13.19972 ],\n",
" [-14.288469 , -14.288469 , -14.288469 , ..., -14.288469 ,\n",
" -14.288469 , -14.288469 ]],\n",
"\n",
" [[-28.990494 , -28.990494 , -28.990494 , ..., -28.990494 ,\n",
" -28.990494 , -28.990494 ],\n",
" [-28.192917 , -28.224579 , -28.261307 , ..., -28.095932 ,\n",
" -28.125992 , -28.15802 ],\n",
" [-27.607407 , -27.705643 , -27.805115 , ..., -27.311615 ,\n",
" -27.410828 , -27.508362 ],\n",
" ...,\n",
" [-24.256271 , -24.140594 , -24.037537 , ..., -24.61853 ,\n",
" -24.488495 , -24.36644 ],\n",
" [-24.629013 , -24.613388 , -24.549866 , ..., -24.752045 ,\n",
" -24.721603 , -24.666412 ],\n",
" [-25.28923 , -25.28923 , -25.28923 , ..., -25.28923 ,\n",
" -25.28923 , -25.28923 ]]], dtype=float32)\n",
"Coordinates:\n",
" * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0\n",
" * lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1\n",
" height float64 2.0\n",
" * time (time) object 1850-01-01 00:00:00 ... 2015-01-01 00:00:00\n",
"Attributes:\n",
" operation: temporal_avg\n",
" mode: group_average\n",
" freq: season\n",
" weighted: True\n",
" dec_mode: DJF\n",
    drop_incomplete_djf: False\n",
"<xarray.DataArray 'time' (time: 661)>\n",
"array([cftime.DatetimeProlepticGregorian(1850, 1, 1, 0, 0, 0, 0, has_year_zero=True),\n",
" cftime.DatetimeProlepticGregorian(1850, 4, 1, 0, 0, 0, 0, has_year_zero=True),\n",
" cftime.DatetimeProlepticGregorian(1850, 7, 1, 0, 0, 0, 0, has_year_zero=True),\n",
" ...,\n",
" cftime.DatetimeProlepticGregorian(2014, 7, 1, 0, 0, 0, 0, has_year_zero=True),\n",
" cftime.DatetimeProlepticGregorian(2014, 10, 1, 0, 0, 0, 0, has_year_zero=True),\n",
" cftime.DatetimeProlepticGregorian(2015, 1, 1, 0, 0, 0, 0, has_year_zero=True)],\n",
" dtype=object)\n",
"Coordinates:\n",
" height float64 2.0\n",
" * time (time) object 1850-01-01 00:00:00 ... 2015-01-01 00:00:00\n",
"Attributes:\n",
" bounds: time_bnds\n",
" axis: T\n",
" long_name: time\n",
" standard_name: time\n",
    _ChunkSizes: 1\n",
"<xarray.Dataset>\n",
"Dimensions: (time: 18262, bnds: 2, lat: 145, lon: 192)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 1850-01-01T12:00:00 ... 1899-12-31T12:00:00\n",
" * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0\n",
" * lon (lon) float64 0.0 1.875 3.75 5.625 ... 352.5 354.4 356.2 358.1\n",
" height float64 ...\n",
"Dimensions without coordinates: bnds\n",
"Data variables:\n",
" time_bnds (time, bnds) datetime64[ns] dask.array<chunksize=(18262, 2), meta=np.ndarray>\n",
" lat_bnds (lat, bnds) float64 dask.array<chunksize=(145, 2), meta=np.ndarray>\n",
" lon_bnds (lon, bnds) float64 dask.array<chunksize=(192, 2), meta=np.ndarray>\n",
" tas (time, lat, lon) float32 dask.array<chunksize=(794, 145, 192), meta=np.ndarray>\n",
"Attributes: (12/48)\n",
" Conventions: CF-1.7 CMIP-6.2\n",
" activity_id: CMIP\n",
" branch_method: standard\n",
" branch_time_in_child: 0.0\n",
" branch_time_in_parent: 21915.0\n",
" creation_date: 2019-11-15T17:30:04Z\n",
" ... ...\n",
" variant_label: r1i1p1f1\n",
" version: v20191115\n",
" cmor_version: 3.4.0\n",
" tracking_id: hdl:21.14100/a9d8ba3a-bcbf-4d54-9970-cfc...\n",
" license: CMIP6 model data produced by CSIRO is li...\n",
    DODS_EXTRA.Unlimited_Dimension: time\n",
"<xarray.DataArray 'tas' (time: 600, lat: 145, lon: 192)>\n",
"dask.array<stack, shape=(600, 145, 192), dtype=float64, chunksize=(1, 145, 192), chunktype=numpy.ndarray>\n",
"Coordinates:\n",
" * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0\n",
" * lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1\n",
" height float64 ...\n",
" * time (time) object 1850-01-01 00:00:00 ... 1899-12-01 00:00:00\n",
"Attributes:\n",
" operation: temporal_avg\n",
" mode: group_average\n",
" freq: month\n",
    weighted: True\n",
"<xarray.DataArray 'tas' (time: 14608, lat: 145, lon: 192)>\n",
"dask.array<sub, shape=(14608, 145, 192), dtype=float32, chunksize=(913, 145, 192), chunktype=numpy.ndarray>\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 2010-01-01T03:00:00 ... 2015-01-01\n",
" * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0\n",
" * lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1\n",
 height float64 ...\n",
"<xarray.DataArray 'tas' (time: 1827, lat: 145, lon: 192)>\n",
"dask.array<stack, shape=(1827, 145, 192), dtype=float64, chunksize=(1, 145, 192), chunktype=numpy.ndarray>\n",
"Coordinates:\n",
" * lat (lat) float64 -90.0 -88.75 -87.5 -86.25 ... 86.25 87.5 88.75 90.0\n",
" * lon (lon) float64 0.0 1.875 3.75 5.625 7.5 ... 352.5 354.4 356.2 358.1\n",
" height float64 ...\n",
" * time (time) object 2010-01-01 00:00:00 ... 2015-01-01 00:00:00\n",
"Attributes:\n",
" operation: temporal_avg\n",
" mode: group_average\n",
" freq: day\n",
" weighted: True