
I have a csv of weather stations and their corresponding latitudes and longitudes. I also have a gridded dataset of temperature trends. I want to find and create an array of the temperature trends of the grid points that most closely match the latitude and longitude of the weather stations. Here's what I have so far:

from netCDF4 import Dataset as netcdf_dataset
import numpy as np
import xarray as xr
import pandas as pd

#open NASA GISS gridded temperature netcdf file
df = xr.open_dataset('BerkeleyTmaxTrends.nc')

#open csv of weather stations
CMStations=pd.read_csv('Slope95.csv')

#pull out latitude and longitude from station csv
Lat=CMStations.lat
Lon=CMStations.lon

#find trend value of nearest grid point
gridtrend=[]
for i in Lat:
    for j in Lon:
        pt=df.sel(lat=[i],lon=[j],method="nearest")
        gridtrend.append(pt.trend)

I don't think I'm looping through properly. Lat and Lon from the weather station csv each have a length of 225, so I want a final gridtrend array that also has a length of 225. When I use the code above, the nested loops pair every latitude with every longitude and I get a list that is 50625 (225 x 225) long. How can I fix this loop?

Here is a screenshot of what my gridded temperature df looks like: [screenshot not included]

Megan Martin
  • Is `for i, j in zip(Lat, Lon)` what you want? It will iterate over pairs `(lat, lon)` and assign `i` and `j` respectively. – STerliakov Mar 10 '22 at 21:45
  • Looks like that gives me the proper length. Thank you! My question now is that this gives me a list of 225 xarray data arrays, but what I want is an array with just the trend values – Megan Martin Mar 10 '22 at 21:53
  • also related: [efficient way to extract data from netcdf files](/questions/69330668/efficient-way-to-extract-data-from-netcdf-files/69337183#69337183), [Select values along the ocean floor in xarray](/questions/71029386/select-values-along-the-ocean-floor-in-xarray/71037455#71037455) and [Fast/efficient way to extract data from multiple large NetCDF files](/questions/70879766/fast-efficient-way-to-extract-data-from-multiple-large-netcdf-files/70883692#70883692) – Michael Delgado Mar 10 '22 at 22:16
  • the key trick is `df.sel(lat=Lat.to_xarray(), lon=Lon.to_xarray(), method='nearest')`. This will grab the data only over the stations and re-index it efficiently so it has a station ID dimension rather than lat and lon. – Michael Delgado Mar 10 '22 at 22:21
  • Is there a way to find the second nearest point? – Megan Martin Sep 17 '22 at 14:29
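
Putting the two suggestions from the comments together, here is a minimal sketch of both approaches. It does not use the real files: `ds` and `stations` are small synthetic stand-ins (hypothetical values) for `BerkeleyTmaxTrends.nc` and `Slope95.csv`, and the `station` dimension name is my own choice. It assumes the grid has 1-D `lat` and `lon` coordinates and a `trend` variable, as in the question.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic 1-degree grid standing in for the trend file (made-up values).
grid_lat = np.arange(-89.5, 90.0, 1.0)
grid_lon = np.arange(-179.5, 180.0, 1.0)
trend = xr.DataArray(
    np.add.outer(grid_lat, grid_lon / 1000.0),  # value encodes lat and lon
    coords={"lat": grid_lat, "lon": grid_lon},
    dims=("lat", "lon"),
    name="trend",
)
ds = trend.to_dataset()

# Synthetic station table standing in for the CSV (made-up locations).
stations = pd.DataFrame(
    {"lat": [40.2, -33.9, 51.4], "lon": [-105.3, 151.2, -0.1]}
)

# Option 1: loop over (lat, lon) pairs with zip, and pull out the scalar
# with .item() so the result is plain numbers rather than DataArrays.
loop_trend = np.array([
    ds.trend.sel(lat=la, lon=lo, method="nearest").item()
    for la, lo in zip(stations.lat, stations.lon)
])

# Option 2: pointwise (vectorized) selection. Because both indexers share
# the same "station" dimension, xarray picks one grid cell per station
# instead of the full lat x lon cross product.
lat_idx = xr.DataArray(stations.lat.values, dims="station")
lon_idx = xr.DataArray(stations.lon.values, dims="station")
nearest = ds.trend.sel(lat=lat_idx, lon=lon_idx, method="nearest")
gridtrend = nearest.values  # 1-D NumPy array, one value per station
```

With the real data, `gridtrend` should have one entry per row of the station table (225 here). Option 2 avoids the Python loop entirely, which tends to matter more as the number of stations grows.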
