15

I need to apply a very simple 'match statement' to the values in an xarray array:

  1. Where the value > 0, make 2
  2. Where the value == 0, make 0
  3. Where the value is NaN, make NaN

Here's my current solution. I'm using NaNs, .fillna, & type coercion in lieu of 2d indexing.

valid = date_by_items.notnull()
positive = date_by_items > 0
positive = positive * 2
result = positive.fillna(0.).where(valid)
result

This changes this:

In [20]: date_by_items = xr.DataArray(np.asarray((list(range(3)) * 10)).reshape(6,5), dims=('date','item'))
    ...: date_by_items
    ...: 
Out[20]: 
<xarray.DataArray (date: 6, item: 5)>
array([[0, 1, 2, 0, 1],
       [2, 0, 1, 2, 0],
       [1, 2, 0, 1, 2],
       [0, 1, 2, 0, 1],
       [2, 0, 1, 2, 0],
       [1, 2, 0, 1, 2]])
Coordinates:
  * date     (date) int64 0 1 2 3 4 5
  * item     (item) int64 0 1 2 3 4

... to this:

Out[22]: 
<xarray.DataArray (date: 6, item: 5)>
array([[ 0.,  2.,  2.,  0.,  2.],
       [ 2.,  0.,  2.,  2.,  0.],
       [ 2.,  2.,  0.,  2.,  2.],
       [ 0.,  2.,  2.,  0.,  2.],
       [ 2.,  0.,  2.,  2.,  0.],
       [ 2.,  2.,  0.,  2.,  2.]])
Coordinates:
  * date     (date) int64 0 1 2 3 4 5
  * item     (item) int64 0 1 2 3 4

In pandas, df[df > 0] = 2 would be enough. Surely I'm doing something pedestrian and there's a terser way?

Maximilian

4 Answers

17

xarray now supports .where(condition, other), so this is now valid:

result = date_by_items.where(date_by_items > 0, 2)
Maximilian
5

If you are happy to load your data in-memory as a NumPy array, you can modify the DataArray values in place with NumPy:

date_by_items.values[date_by_items.values > 0] = 2

The cleanest way to handle this would be if xarray supported the other argument to where, but we haven't implemented that yet (hopefully soon -- the groundwork has been laid!). When that works, you'll be able to write date_by_items.where(date_by_items > 0, 2).

Either way, you'll need to do this twice to apply both your criteria.
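
As a rough illustration of the in-place NumPy approach, here is a minimal sketch. It assumes the data is in memory and NumPy-backed; the small float array with a NaN is made up for the example and is not the asker's data.

```python
import numpy as np
import xarray as xr

# Hypothetical sample data: floats, so that a missing value (NaN) can be stored.
date_by_items = xr.DataArray(
    np.array([[0.0, 1.0, np.nan], [3.0, 0.0, 2.0]]),
    dims=("date", "item"),
)

# Boolean-mask assignment on the underlying NumPy array.
# NaN > 0 evaluates to False, so NaN entries are not selected by the mask.
date_by_items.values[date_by_items.values > 0] = 2

print(date_by_items.values)
```

Note that this mutates date_by_items itself; call .copy() first if the original values are still needed.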

shoyer
0

You can indeed use the where(condition, other) method, but be aware that the other argument is used where the condition is False. The snippets in the other answers are therefore incorrect: they put a 2 wherever date_by_items > 0 does not hold.

>>> import numpy as np
>>> import xarray as xr
>>> date = list(range(0,6))
>>> item = list(range(0,5))
>>> date_by_items = xr.DataArray(np.asarray((list(range(3)) * 10)).reshape(6,5), coords=[date, item], dims=('date','item'))
>>> date_by_items
<xarray.DataArray (date: 6, item: 5)>
array([[0, 1, 2, 0, 1],
       [2, 0, 1, 2, 0],
       [1, 2, 0, 1, 2],
       [0, 1, 2, 0, 1],
       [2, 0, 1, 2, 0],
       [1, 2, 0, 1, 2]])
Coordinates:
  * date     (date) int64 0 1 2 3 4 5
  * item     (item) int64 0 1 2 3 4


>>> date_by_items.where(date_by_items > 0, 2)  # wrong behavior
<xarray.DataArray (date: 6, item: 5)>
array([[2, 1, 2, 2, 1],
       [2, 2, 1, 2, 2],
       [1, 2, 2, 1, 2],
       [2, 1, 2, 2, 1],
       [2, 2, 1, 2, 2],
       [1, 2, 2, 1, 2]])
Coordinates:
  * date     (date) int64 0 1 2 3 4 5
  * item     (item) int64 0 1 2 3 4

Instead, to get the requested behavior, either invert the condition or use the xarray.where(condition, x, y) function.

>>> date_by_items.where(date_by_items <= 0, 2)  # inverted condition
<xarray.DataArray (date: 6, item: 5)>
array([[0, 2, 2, 0, 2],
       [2, 0, 2, 2, 0],
       [2, 2, 0, 2, 2],
       [0, 2, 2, 0, 2],
       [2, 0, 2, 2, 0],
       [2, 2, 0, 2, 2]])
Coordinates:
  * date     (date) int64 0 1 2 3 4 5
  * item     (item) int64 0 1 2 3 4

>>> xr.where(date_by_items > 0, 2, date_by_items)
<xarray.DataArray (date: 6, item: 5)>
array([[0, 2, 2, 0, 2],
       [2, 0, 2, 2, 0],
       [2, 2, 0, 2, 2],
       [0, 2, 2, 0, 2],
       [2, 0, 2, 2, 0],
       [2, 2, 0, 2, 2]])
Coordinates:
  * date     (date) int64 0 1 2 3 4 5
  * item     (item) int64 0 1 2 3 4
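
The examples above use integer data only. As a minimal sketch of how the same xarray.where call treats the NaN case from the question, assuming a made-up float array with one missing value:

```python
import numpy as np
import xarray as xr

# Hypothetical sample data with one missing value (not the asker's data).
date_by_items = xr.DataArray(
    np.array([[0.0, 1.0, np.nan], [3.0, 0.0, 2.0]]),
    dims=("date", "item"),
)

# Take 2 where the condition holds, otherwise keep the original value.
# NaN > 0 evaluates to False, so NaN falls through to the original value (NaN),
# and zeros likewise stay zero.
result = xr.where(date_by_items > 0, 2, date_by_items)

print(result.values)
```

Under these assumptions a single call covers all three rules: positive values become 2, zeros stay 0, and NaN stays NaN.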
Stijn
0

Another concise way would be to do date_by_items.values[date_by_items.values > 0] = 2

Dmitry Deryabin