
I am using the `.where()` function to select matching times and apply certain criteria in an xarray dataset.

import numpy as np
import xarray as xr

ds1 = xr.open_dataset('COD.nc')
ds2 = xr.open_dataset('CDNC.nc')
ds3 = xr.open_dataset('LWP.nc')
ds4 = xr.open_dataset('CTT.nc')   
ds5 = xr.open_dataset('CTP.nc')
ds6 = xr.open_dataset('CER.nc')  

ds11 = ds1.where((ds1.time == ds2.time))
ds22 = ds2.where((ds2.time == ds11.time))
ds33 = ds3.where((ds3.time == ds2.time))
ds44 = ds4.where((ds4.time == ds2.time))
ds55 = ds5.where((ds5.time == ds2.time))
ds66 = ds6.where((ds6.time == ds2.time))

COD = ds11.Cloud_Optical_Thickness
CDNC = ds22.Cloud_Droplet_Concentration
LWP = ds33.Cloud_Water_Path
CTT = ds44.Cloud_Top_Temperature
CTP = ds55.Cloud_Top_Pressure
CER = ds66.Cloud_Effective_Radius

cod  = COD.where((CTT >= 273.0) & (CTP > 680.0) & (CER > 4) & (COD > 4)) 
lwp  = LWP.where((CTT >= 273.0) & (CTP > 680.0) & (CER > 4) & (COD > 4)) 
cdnc = CDNC.where((CTT >= 273.0) & (CTP > 680.0) & (CER > 4) & (COD > 4))  

But it is too slow, even for a small dataset. Each of my datasets has dimensions (time: 7555, lat: 35, lon: 71), and the code has been running for more than two hours. Is there any way to speed this up? Thanks!

  • Does this help? https://stackoverflow.com/questions/47180126/xarray-too-slow-for-performance-critical-code – IanQ Dec 29 '22 at 06:55
  • @IanQ I don't have that much expertise. Would you help me write an `apply_ufunc` version of my example code above? – HARSHBARDHAN KUMAR Dec 29 '22 at 06:59
  • I'm not familiar with it either. Before you jump the gun, have you tried running all of this in numpy first? If it's too slow in `xarray` maybe trying `numpy` first is the way to go. – IanQ Dec 29 '22 at 07:01
  • I don't want to mess up the dataset's dimensionality in numpy. In `xarray`, the dimensions of the dataset remain the same as before. – HARSHBARDHAN KUMAR Dec 29 '22 at 07:05
  • ¯\_(ツ)_/¯ IDK what to tell you then. I'd recommend trying that since that might be an easy way out, especially if your dataset size isn't that big, but it's ultimately your call. – IanQ Dec 29 '22 at 07:20
  • @HARSHBARDHANKUMAR, I'm wondering if you can update your question to use a toy dataset that we could play with? Questions like this are hard to answer because we can't see your data. Check out this page for some tips on writing a reproducible example: https://stackoverflow.com/help/minimal-reproducible-example – jhamman Jan 04 '23 at 17:49
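
Since no sample data was posted, here is a minimal sketch on synthetic arrays of one way the selection might be restructured. It assumes the intent of the chained `.where(time == time)` calls is to keep only the time steps common to all files, and it swaps them for a single `xr.align(..., join="inner")`. It also builds the boolean mask once instead of three times. The helper `make_da`, the array sizes, and the value ranges are hypothetical stand-ins for the real files, which would be opened with `xr.open_dataset` as in the question.

```python
import numpy as np
import pandas as pd
import xarray as xr

rng = np.random.default_rng(0)

def make_da(name, times, low, high):
    """Hypothetical helper: a small random (time, lat, lon) array standing in for one file."""
    data = rng.uniform(low, high, size=(len(times), 3, 4))
    return xr.DataArray(
        data,
        dims=("time", "lat", "lon"),
        coords={"time": times, "lat": np.arange(3), "lon": np.arange(4)},
        name=name,
    )

# Synthetic inputs; CDNC and LWP deliberately cover fewer time steps.
t_all = pd.date_range("2020-01-01", periods=10, freq="D")
COD = make_da("Cloud_Optical_Thickness", t_all, 0.0, 20.0)
CDNC = make_da("Cloud_Droplet_Concentration", t_all[2:], 0.0, 300.0)
LWP = make_da("Cloud_Water_Path", t_all[:8], 0.0, 500.0)
CTT = make_da("Cloud_Top_Temperature", t_all, 250.0, 300.0)
CTP = make_da("Cloud_Top_Pressure", t_all, 500.0, 1000.0)
CER = make_da("Cloud_Effective_Radius", t_all, 0.0, 30.0)

# Keep only the coordinates (here: time steps) shared by every array.
COD, CDNC, LWP, CTT, CTP, CER = xr.align(
    COD, CDNC, LWP, CTT, CTP, CER, join="inner"
)

# Evaluate the selection criteria once and reuse the result.
mask = (CTT >= 273.0) & (CTP > 680.0) & (CER > 4) & (COD > 4)

cod = COD.where(mask)
lwp = LWP.where(mask)
cdnc = CDNC.where(mask)
```

The results keep their `time`, `lat`, and `lon` dimensions, so nothing has to be rebuilt by hand as it would be in plain numpy. Whether this removes the slowdown depends on where the time is actually spent, which the question does not show.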

0 Answers