
I have a NetCDF file containing two variables: wspd_wrf_m and wspd_sodar_o. I want to read the file and calculate the RMSE between wspd_wrf_m and wspd_sodar_o.

Both variables have the dimensions (days, times), with shape (1094, 24). I want to calculate the RMSE over the last 365 days of the file. Can you help me with this?

I know I need to use:

from netCDF4 import Dataset
import numpy as np

g = Dataset('station_test_new.nc', 'r')  # the format argument only matters when writing
wspd_wrf = g.variables["wspd_wrf_m"][:,:]
wspd_sodar = g.variables["wspd_sodar_o"][:,:]

But how do I select the last 365 days of hourly data that I need and calculate RMSE from this?

HM14

1 Answer

Selecting the last 365 days is a matter of slicing the arrays to the correct size. For example:

import numpy as np
var = np.zeros((1094, 24))
print(var.shape, var[729:,:].shape, var[-365:,:].shape)

which prints:

(1094, 24) (365, 24) (365, 24)

So both var[729:,:] and var[-365:,:] slice the last 365 days (with all hourly values) out of your 1094 day sized array.

The NumPy manual has more information and examples on indexing and slicing.

There are plenty of examples of how to calculate the RMSE in Python (e.g. this one). Please give that a try, and if you can't get it to work, update your question with your attempts.
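As a starting point, here is a minimal sketch that combines the slicing above with the usual RMSE definition (the square root of the mean squared difference). The function name rmse_last_days and the synthetic arrays are my own, made up for illustration; in your case you would pass in the arrays read from the file. It assumes missing values should simply be skipped, which may or may not be what you want.

```python
import numpy as np

def rmse_last_days(model, obs, n_days=365):
    """RMSE between two (days, hours) arrays over the last n_days days.

    Hypothetical helper, not part of NumPy or netCDF4.
    """
    # Slice the last n_days rows and mask NaN/inf so they are ignored.
    # Values that are already masked (netCDF4 returns masked arrays) stay masked.
    model = np.ma.masked_invalid(model[-n_days:, :])
    obs = np.ma.masked_invalid(obs[-n_days:, :])
    diff = model - obs
    return float(np.sqrt(np.ma.mean(diff ** 2)))

# Synthetic stand-ins for the two variables (assumption: same shape as in the question).
rng = np.random.default_rng(0)
wspd_sodar = rng.uniform(0.0, 20.0, size=(1094, 24))
wspd_wrf = wspd_sodar + 2.0  # constant offset, so the RMSE should be close to 2.0

print(rmse_last_days(wspd_wrf, wspd_sodar))
```

With the arrays from the question, the call would be rmse_last_days(wspd_wrf, wspd_sodar).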

Bart