
I am a complete beginner with NCO, and I want to split my .nc file (covering 1996010110 to 2019123110) into daily files, from 10AM to 10PM. Each split file should contain YYYY-MM-DD:10:00 to YYYY-MM-(DD+1):10:00. Note that the last hour of each day's file is repeated at the beginning of the next day's file. That is, the data for YYYY-MM-DD:10:00 occurs twice: as the first step of file_YYYY_MM_DD.nc and as the last step of file_YYYY_MM_(DD-1).nc. Thanks!

ClimateUnboxed
Xu Shan
  • The question as written is not self-consistent. The first part says you want to cut from 10am to 10pm (i.e. 12 or 13 steps assuming the data is hourly, which you don't say), but then you say you want to run from 10am to 10am, repeating the 10am step in each file. Please can you clarify what the time resolution is and how many steps you want in the output? – ClimateUnboxed Sep 21 '21 at 20:48
  • Hi, I think I am consistent enough, except for whether it's a closed interval or an open interval. Please see the comment below your answer. By "repeating the 10 am step in each file", I meant the file covers [10am, 11am, ..., 23:00, ..., 10am the next day]. So there are 25 hours per file, because 10 am on a given day is included in two files (the file for that day and the file for the previous day). Am I clear enough? – Xu Shan Sep 23 '21 at 08:30
  • but you say 10pm in your title! – ClimateUnboxed Sep 24 '21 at 13:14
  • I edited the title to make it consistent with the question as stated, hope it is okay as written – ClimateUnboxed Sep 26 '21 at 21:20
  • Hi @AdrianTompkins I just saw your comments..so sorry for this late reply...yes the original question is from 10am to 10am next day (10am, 11am,...,9am next day, 10am next day) which consists of 25 hours... – Xu Shan Mar 21 '22 at 19:04

2 Answers


There is a CDO command that allows you to select a subrange of hours:

cdo selhour,10/22 in.nc out.nc 

which would answer the first part of the question, but from my comment you will see that the question needs further clarification.
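
For the 25-step window clarified in the comments (10:00 on one day through 10:00 on the next, both included), a per-day selection with CDO might look like the sketch below. It assumes hourly data with a calendar-aware time axis, GNU `date` for the day arithmetic, and the placeholder names `in.nc` and `file_YYYY_MM_DD.nc`. The helper `cdo_window_cmd` is hypothetical and only prints the command, so it can be inspected before anything is run.

```shell
# Hypothetical helper: print the cdo call that extracts one 10:00-to-10:00 window.
# Assumes GNU date (for "+1 day") and a decodable time axis in in.nc.
cdo_window_cmd() {
  day=$1                                         # e.g. 1996-01-01
  next=$(date -u -d "$day +1 day" +%Y-%m-%d)     # next calendar day
  tag=$(printf '%s' "$day" | tr '-' '_')         # 1996-01-01 -> 1996_01_01
  echo "cdo seldate,${day}T10:00:00,${next}T10:00:00 in.nc file_${tag}.nc"
}

# Dry run for the first three days; pipe the output to sh to execute it.
for d in 1996-01-01 1996-01-02 1996-01-03; do
  cdo_window_cmd "$d"
done
```

Because `seldate` keeps every timestep between the two bounds, the 10:00 step should appear at the end of one file and at the start of the next, which is the overlap asked for.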

ClimateUnboxed
  • Hi Adrian, thanks for your answer! The second part means that one hour is duplicated in two adjacent files. For example, the first file runs from 1996-01-01-10:00 to 1996-01-02-10:00, which means the time index covers 25 hours ([0,1,...,24] starting from 1996-01-01-10:00). The second file then covers the next day, from 1996-01-02-10:00 to 1996-01-03-10:00, which is also 25 hours ([0,1,...,24] starting from 1996-01-02-10:00). Hopefully my explanation helps! Thanks again! – Xu Shan Sep 22 '21 at 17:46

The way to do this in NCO is to use the sub-cycling form of the hyperslabber to eliminate the duplicate timestamps, then loop over the days to create each file, similar to the example in the NCO documentation. For input where the first desired record is index 10, the last desired index is unbounded, the number of records in a repeating series (i.e., the stride between groups) is 25, and the number of consecutive desired records (the desired subset of a group) is 24, the first command would look like this:

ncrcat -d time,10,,25,24 in.nc out.nc

Then out.nc will contain thousands of days of data with no repeated timesteps, and you can split that file into daily files however you like, including with ncrcat wrapped in a loop like the one below.

EDIT 20210924: Based on clarification below you can ignore the above part of this message and proceed directly to this loop, which has been modified to extract 25 timesteps per day.

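for yr in {1996..2019}; do
  for mth in {1..12}; do
    yyyy=$(printf "%04d" "$yr")
    mm=$(printf "%02d" "$mth")
    # Days in this month via GNU date (assumed available); brace expansion cannot take a variable bound
    ndays=$(date -u -d "${yyyy}-${mm}-01 +1 month -1 day" +%d)
    for day in $(seq 1 "$ndays"); do
      dd=$(printf "%02d" "$day")
      # Next calendar day, so that month and year ends roll over correctly
      nxt=$(date -u -d "${yyyy}-${mm}-${dd} +1 day" +%Y-%m-%d)
      # Both bounds are inclusive: 10:00 today through 10:00 tomorrow, i.e. 25 hourly steps
      ncrcat -d time,${yyyy}-${mm}-${dd}T10:00:00,${nxt}T10:00:00 out.nc file_${yyyy}_${mm}_${dd}.nc
    done
  done
done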
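
Where GNU `date` is not available, the number of days in each month can instead be computed in plain shell. The sketch below is one way to do that; the function name `days_in_month` is hypothetical, and it assumes the file uses the standard Gregorian calendar (not a `noleap` or `360_day` model calendar).

```shell
# Hypothetical helper: days in a month for the Gregorian calendar.
# Usage: days_in_month YEAR MONTH   (MONTH may be 1..12 or 01..12)
days_in_month() {
  yr=$1
  mth=${2#0}            # strip a leading zero so "08" is not read as octal
  case $mth in
    4|6|9|11) echo 30 ;;
    2)
      # Leap year: divisible by 4 and not by 100, or divisible by 400
      if [ $((yr % 4)) -eq 0 ] && [ $((yr % 100)) -ne 0 ] || [ $((yr % 400)) -eq 0 ]; then
        echo 29
      else
        echo 28
      fi ;;
    *) echo 31 ;;
  esac
}

# Example: print the month lengths for the first quarter of 1996.
for m in 1 2 3; do
  echo "1996-$m: $(days_in_month 1996 "$m") days"
done
```

An inner loop over the days of a month could then be written as `for day in $(seq 1 "$(days_in_month "$yr" "$mth")")`.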
Charlie Zender
  • Hi Charlie, thanks for your answer! But I need repeated data...which starts from 10:00 on current day (D) and ends on 10:00 on next day (D+1). That means, the 10:00 on next day should also be included in file D (the end of file D), and also in file D+1 (the beginning of file D+1)...does your explanation work for this case? Thanks! – Xu Shan Sep 24 '21 at 07:00
  • So you have 24 timesteps per day in the input file, and you want 25 timesteps per day in the output file? That was not clear in your original question. In that case, ignore the first use of ncrcat and modify the time requested by the ncrcat in the loop to whatever you want. I will change it now so that you get 25 timesteps in each output file. – Charlie Zender Sep 24 '21 at 15:08
  • Hi Charlie, I just saw your results...so sorry for my late reply and many thanks for your answers! I will try your answers. But one thing is that, how can I change the saving path for the "file_${yyyy}_${mm}_${dd}.nc"? just like this "xxx/xxx1/file_${yyyy}_${mm}_${dd}.nc"? – Xu Shan Mar 21 '22 at 19:05
  • I do not understand the question in the comment above. It seems to answer itself. These names are assembled from shell variables, so just express the desired output name in terms of the desired shell variables. – Charlie Zender Mar 21 '22 at 23:45
  • Hi @Charlie Zender, thanks for your reply and your answer! I tried with the command, but just got an error message like: ncrcat: ERROR no variables fit criteria for processing ncrcat: HINT Extraction list must contain a record variable to concatenate. A record variable is a variable defined with a record dimension. Often the record dimension, aka unlimited dimension, refers to time. To change an existing dimension from a fixed to a record dimensions see http://nco.sf.net/nco.html#mk_rec_dmn or to add a new record dimension to all variables see http://nco.sf.net/nco.html#ncecat_rnm – Xu Shan Apr 06 '22 at 10:24
  • I tried with ```ncrcat -d time,${yyyy}-${mm}-${dd}T10:00:00,${yyyy}-${mm}-${dd}T22:00:00 FORCING_199601.nc file_${yyyy}_${mm}_${dd}.nc``` where ```FORCING_199601.nc``` is the nc file which I want to extract a time series from it. – Xu Shan Apr 06 '22 at 10:24
  • I checked my nc file, the time contains array like ```time = 0, 1, 2, 3, 4, ... ,744``` which is the hour index in the jan of 1996. So maybe using ```${yyyy}-${mm}-${dd}T10:00:00``` will lead to problems? in this case, how can I use the ```ncrcat```? Thanks! – Xu Shan Apr 06 '22 at 10:26
  • The ERROR message you copied three messages above this diagnoses the problem and tells you what to do. Your files have time, but time is not a record dimension. Convert time to a record dimension in each file as instructed before using ncrcat. – Charlie Zender Apr 07 '22 at 15:49
  • Hi @Charlie Zender, thanks for your answer! But how can we do the reverse? I mean append files like ```file_${yyyy}_${mm}_${dd}.nc``` along the ```time``` dimension and get back the original file ```out.nc```? Thanks! – Xu Shan Apr 11 '22 at 09:06
  • I think it's perfectly possible to do that with some combination of ncrcat, sub-cycling (documentation is linked to above) and shell globbing (instead of shell loops) for the input filenames. After pointing you in the right direction and to the documentation, my role here is done. – Charlie Zender Apr 11 '22 at 21:24
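
When the time coordinate holds a plain hour index (0, 1, 2, ...) rather than decodable dates, as described for `FORCING_199601.nc` in the comments above, the same windows can be requested by index instead of by timestamp. NCO treats hyperslab limits written as integers (no decimal point) as zero-based indices, so the sketch below computes the first and last index of each 25-step window. It assumes hourly records with no gaps, record 0 at 00:00 on the first day of the file, and `time` already converted to a record dimension (for example with `ncks --mk_rec_dmn time`). The helper name `window_idx` and the output file names are hypothetical.

```shell
# Hypothetical helper: first and last record index of the 10:00-to-10:00 window
# for day-of-file N (1-based), assuming hourly records starting at 00:00 on day 1.
window_idx() {
  first=$(( ($1 - 1) * 24 + 10 ))
  last=$(( first + 24 ))          # inclusive upper bound, so 25 records
  echo "$first $last"
}

# Dry run: print the ncrcat calls for the first three days of January 1996.
for day in 1 2 3; do
  set -- $(window_idx "$day")
  printf 'ncrcat -d time,%d,%d FORCING_199601.nc file_1996_01_%02d.nc\n' "$1" "$2" "$day"
done
```

The window for the last day of a month runs past the end of a single monthly file, so under these assumptions the monthly files would need to be concatenated first, or the last day handled separately.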