I have two datasets at 10-minute resolution covering 34 years. In one of them, observations are made only every 3 hours, and I would like to keep only the rows containing those observations. The observations start at midnight (included) and continue at 3am, 6am, 9am, etc.

Looks like this:

  stn CODES               time1 pcp_type
1 SIO     - 1981-01-01 02:00:00     <NA>
2 SIO     - 1981-01-01 02:10:00     <NA>
3 SIO     - 1981-01-01 02:20:00     <NA>
4 SIO     - 1981-01-01 02:30:00     <NA>
5 SIO     - 1981-01-01 02:40:00     <NA>
6 SIO     - 1981-01-01 02:50:00     <NA>

The idea is to keep only the rows that fall on a 3-hour mark and delete the rest.

I saw some solutions that filter by value (e.g. greater than a threshold), but I didn't find one that lets me filter by hour (e.g. %H == 3).

Thank you in advance.

I've already converted my time column to POSIXct as follows:

SYNOP_SION$time1<-as.POSIXct(strptime(as.character(SYNOP_SION$time),format = "%Y%m%d%H%M"), tz="UTC")
Sanda
  • Can you show the expected output? – akrun Jul 08 '19 at 13:14
  • I don't have an example, but I would like to keep only the rows where there's a value in column "CODES", which means every 3 hours starting at midnight. – Sanda Jul 08 '19 at 13:18
  • You can try something along these lines: https://stackoverflow.com/questions/31616873/what-is-the-r-equivalent-of-pandas-resample-method – José Luiz Ferreira Jul 08 '19 at 13:21
  • Hi everyone, here's the solution I found (since my data are at 10-minute intervals, 3 hours = 18 x 10 min): TEMP_SION3$TEST2 <- ((1:nrow(TEMP_SION3) - 7) %% 18); TEMP_SION3 <- subset(TEMP_SION3, TEST2 == 0) – Sanda Jul 09 '19 at 12:55

1 Answer

Here is an example with a vector:

# Creating sample time data
time1 <- seq(from = Sys.time(), length.out = 96, by = "hours")

# To get a T/F vector you can use to filter
as.integer(format(time1, "%H")) %in% seq.int(0, 21, 3)

# To see the filtered POSIXct vector:
time1[as.integer(format(time1, "%H")) %in% seq.int(0, 21, 3)]
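
With 10-minute rows like those in the question, checking the hour alone would keep all six rows inside each matching hour, so the minute probably needs to be checked as well. A minimal sketch, assuming a made-up data frame df with a POSIXct column time1 in UTC (the sample data and the names df, hr, mins and df_3h are illustrative, not taken from the question):

```r
# Hypothetical sample data: one day of 10-minute timestamps in UTC
df <- data.frame(
  time1 = seq(from = as.POSIXct("1981-01-01 00:00:00", tz = "UTC"),
              by = "10 min", length.out = 144)
)

# Hour and minute as integers (format() uses the time zone stored in time1)
hr   <- as.integer(format(df$time1, "%H"))
mins <- as.integer(format(df$time1, "%M"))

# Keep only rows falling exactly on 00:00, 03:00, ..., 21:00
df_3h <- df[hr %% 3 == 0 & mins == 0, , drop = FALSE]
```

Unlike filtering by row position, this relies only on the timestamps, so it should still pick the right rows if some 10-minute records are missing.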
Andrew