xarray.Dataset.groupby

Dataset.groupby(group=None, *, squeeze=False, restore_coord_dims=False, eagerly_compute_group=None, **groupers)

Returns a DatasetGroupBy object for performing grouped operations.

Parameters:
  • group (str or DataArray or IndexVariable or sequence of hashable or mapping of hashable to Grouper) – Array whose unique values should be used to group this dataset. If a hashable, it must be the name of a coordinate contained in this dataset. If a mapping, it must map an existing variable name to a Grouper instance.

  • squeeze (False) – This argument is deprecated.

  • restore_coord_dims (bool, default: False) – If True, also restore the dimension order of multi-dimensional coordinates.

  • eagerly_compute_group (False, optional) – This argument is deprecated.

  • **groupers (Mapping of str to Grouper or Resampler) – Mapping from the name of each variable to group by to a Grouper or Resampler object. One of group or groupers must be provided. More than one grouper may be passed, as the BinGrouper example under Examples shows.

Returns:

grouped (DatasetGroupBy) – A DatasetGroupBy object patterned after pandas.GroupBy that can be iterated over as (unique_value, grouped_dataset) pairs.

Examples

>>> ds = xr.Dataset(
...     {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
...     coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
... )

Grouping by a single variable

>>> ds.groupby("letters")
<DatasetGroupBy, grouped over 1 grouper(s), 2 groups in total:
    'letters': UniqueGrouper('letters'), 2/2 groups with labels 'a', 'b'>
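
As the Returns section notes, the groupby object can be iterated over as (label, group) pairs. The following is a minimal sketch of that pattern; it rebuilds the example dataset so the block is self-contained, and the names `label`, `sub_ds`, and `rows_per_group` are illustrative only.

```python
import numpy as np
import xarray as xr

# Same example dataset as above.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)

# Each iteration yields a group label and the sub-dataset for that label.
rows_per_group = {}
for label, sub_ds in ds.groupby("letters"):
    # Record how many positions along "x" belong to this group.
    rows_per_group[str(label)] = sub_ds.sizes["x"]

print(rows_per_group)
```

Each label in "letters" occurs twice along x, so every group holds two rows of foo.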

Execute a reduction

>>> ds.groupby("letters").sum()
<xarray.Dataset> Size: 64B
Dimensions:  (letters: 2, y: 3)
Coordinates:
  * letters  (letters) object 16B 'a' 'b'
Dimensions without coordinates: y
Data variables:
    foo      (letters, y) int64 48B 9 11 13 9 11 13
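
The repr shown earlier suggests that a variable name is handled through a UniqueGrouper. As a hedged sketch, assuming an xarray version that provides the xarray.groupers module, the same reduction can be written with an explicit grouper passed by keyword; `by_name`, `by_grouper`, and `same` are illustrative names.

```python
import numpy as np
import xarray as xr
from xarray.groupers import UniqueGrouper

# Same example dataset as above.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)

# Group by name, then by an explicit grouper given as a keyword argument.
by_name = ds.groupby("letters").sum()
by_grouper = ds.groupby(letters=UniqueGrouper()).sum()

# Compare values, dimensions, and coordinates of the two results.
same = by_name.equals(by_grouper)
print(same)
```

The keyword form becomes useful once a grouper needs options, such as the bin edges accepted by BinGrouper.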

Grouping by multiple variables

>>> ds.groupby(["letters", "x"])
<DatasetGroupBy, grouped over 2 grouper(s), 8 groups in total:
    'letters': UniqueGrouper('letters'), 2/2 groups with labels 'a', 'b'
    'x': UniqueGrouper('x'), 4/4 groups with labels 10, 20, 30, 40>

Use Grouper objects to express more complicated GroupBy operations

>>> from xarray.groupers import BinGrouper, UniqueGrouper
>>>
>>> ds.groupby(x=BinGrouper(bins=[5, 15, 25]), letters=UniqueGrouper()).sum()
<xarray.Dataset> Size: 144B
Dimensions:  (y: 3, x_bins: 2, letters: 2)
Coordinates:
  * x_bins   (x_bins) interval[int64, right] 32B (5, 15] (15, 25]
  * letters  (letters) object 16B 'a' 'b'
Dimensions without coordinates: y
Data variables:
    foo      (y, x_bins, letters) float64 96B 0.0 nan nan 3.0 ... nan nan 5.0
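
For binning on a single variable, Dataset.groupby_bins (listed under See also) is an alternative to passing a BinGrouper. The sketch below assumes the default right-closed bins, matching the interval coordinate shown above; `binned` and `values` are illustrative names.

```python
import numpy as np
import xarray as xr

# Same example dataset as above.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)

# Bin x into (5, 15] and (15, 25]; x=30 and x=40 fall outside both bins.
binned = ds.groupby_bins("x", bins=[5, 15, 25]).sum()

# Fix the dimension order before reading the values out.
values = binned["foo"].transpose("x_bins", "y").values.tolist()
print(values)
```

Each bin here contains exactly one x value, so the sums equal the first two rows of foo.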

See also

GroupBy: Group and Bin Data

User guide explanation of how to group and bin data.

Computational Patterns

Tutorial on groupby() for windowed computation.

Grouped Computations

Tutorial on groupby() demonstrating reductions, transformations, and a comparison with resample().

pandas.DataFrame.groupby
Dataset.groupby_bins
DataArray.groupby
core.groupby.DatasetGroupBy
Dataset.coarsen
Dataset.resample
DataArray.resample