I have three xarray datasets, LATITUDE, LONGITUDE, and WIND SPEED, all with the same x, y dimensions. I want to assign the LATITUDE and LONGITUDE datasets as coordinates of WIND SPEED at each point in the x, y frame, so that the variable WIND SPEED has dimensions like this: WIND SPEED(LATITUDE, LONGITUDE).

How should I proceed? The input data is the output of a gridded weather model in NetCDF format. I have done some calculations on the input and I want to assign the coordinates to the output of those calculations (WIND SPEED). Later I want to do spatial interpolation with the nearest-neighbor method, so that I can get a value at any lat, lon within the dataset.

Latitude xarray sample after importing:

array([[21.821693, 21.821693, 21.821693, ..., 21.821693, 21.821693,
        21.821693],
        ......................................................
       [30.20221 , 30.20221 , 30.20221 , ..., 30.20221 , 30.20221 ,
        30.20221 ]], dtype=float32)

Wind Speed Xarray:

array([[8.725852, 8.758366, 8.728758, ...,      nan,      nan,      nan],
       [8.502903, 8.563703, 8.574378, ...,      nan,      nan,      nan],
       ........]] dtype=float32)
  • To be able to help you, you must post a [good](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) question. You must have sample input data that can be copied and pasted, and an example of what you want the output table to look like. – Ukrainian-serge Feb 26 '20 at 07:44
  • Please give some more information about what you want to do with that data later. Do you want to plot it? Do additional analysis? So far it seems to me that you just want the data in an array or dataframe. If you have an evenly spaced grid, you could make a numpy array of rank 3. Otherwise you could just make a dataframe: `df=pd.DataFrame([Latitude,Longitude,WindSpeed])` – Lepakk Feb 26 '20 at 07:44
  • Updating the question. – user8277017 Feb 26 '20 at 08:16
  • How about assigning wind column as an index for your data frame of coordinates? – pari Feb 26 '20 at 08:38

2 Answers

One solution would be to construct a list of dicts:

speed_lat_long_list_dict = [
    {'id': 1, 'speed': '1', 'lat': '20.8', 'long': '-18.5'},
    {'id': 2, 'speed': '3', 'lat': '24.8', 'long': '-14.5'},
    # ... one dict per measurement, up to id n
]

This avoids the confusion of tying coordinates to a speed value that may not be unique, e.g. what should happen when the same speed is measured at different coordinates?

The list can be passed to a DataFrame if you want, or you can process it with for loops or list comprehensions.
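As a rough sketch of both options, assuming pandas is installed: the `records` list below is hypothetical sample data (with numeric values rather than strings), not values from the question.

```python
import pandas as pd

# Hypothetical sample records, one dict per measurement
records = [
    {'id': 1, 'speed': 1.0, 'lat': 20.8, 'long': -18.5},
    {'id': 2, 'speed': 3.0, 'lat': 24.8, 'long': -14.5},
]

# Option 1: build a DataFrame, indexed by the measurement id
df = pd.DataFrame(records).set_index('id')

# Option 2: filter the raw list with a list comprehension
fast = [r for r in records if r['speed'] > 2.0]
```

Because each record carries its own `id`, two measurements with the same speed remain distinct rows.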

Tooblippe

You can merge your datasets and then assign the desired variables as coordinates:

import numpy as np
import xarray as xr

# Random stand-in data on a 50 x 50 (x, y) grid
data = np.random.rand(50, 50)

windspeed = xr.Dataset({'windspeed': (['x', 'y'], data)})
latitude = xr.Dataset({'latitude': (['x', 'y'], np.cos(data))})
longitude = xr.Dataset({'longitude': (['x', 'y'], np.sin(data))})

# Combine the three datasets into one
ds = xr.merge([windspeed, latitude, longitude])

# set_coords returns a new Dataset, so keep the result
ds = ds.set_coords(['latitude', 'longitude'])
print(ds)  # values vary from run to run because the data are random
<xarray.Dataset>
Dimensions:    (x: 50, y: 50)
Coordinates:
    latitude   (x, y) float64 0.7035 0.9987 0.917 0.9958 ... 0.593 0.93 0.7624
    longitude  (x, y) float64 0.7107 0.05069 0.3988 ... 0.8052 0.3675 0.6471
Dimensions without coordinates: x, y
Data variables:
    windspeed  (x, y) float64 0.7905 0.05071 0.4102 ... 0.936 0.3763 0.7037
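
For the nearest-neighbor lookup mentioned in the question, one possible approach is a brute-force search over the 2-D coordinates. This is only a minimal sketch: the grid values and the helper name `nearest_value` are made up for illustration, and the distance is measured in plain degrees, which is a simplification that ignores the Earth's curvature.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in grid: 2-D latitude/longitude arrays on (x, y)
lat_1d = np.linspace(21.8, 30.2, 50)
lon_1d = np.linspace(60.0, 70.0, 50)
lat2d, lon2d = np.meshgrid(lat_1d, lon_1d, indexing='ij')

ds = xr.Dataset(
    {'windspeed': (['x', 'y'], np.random.rand(50, 50))},
    coords={'latitude': (['x', 'y'], lat2d),
            'longitude': (['x', 'y'], lon2d)},
)

def nearest_value(ds, lat, lon):
    """Return windspeed at the grid point closest to (lat, lon)."""
    # Squared distance in degrees; crude, but enough to rank grid points
    dist2 = (ds['latitude'] - lat) ** 2 + (ds['longitude'] - lon) ** 2
    # Flat index of the minimum, converted back to (x, y) positions
    ix, iy = np.unravel_index(np.argmin(dist2.values), dist2.shape)
    return ds['windspeed'].isel(x=ix, y=iy)

point = nearest_value(ds, 25.0, 65.0)
```

The returned `point` is a scalar DataArray that still carries the latitude and longitude of the grid cell it was taken from, so you can check how far the match is from the requested location.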
bwc