3

I'm new to R. My data has 600k objects defined by three attributes: Id, Date and TimeOfCall.

TimeOfCall is in 00:00:00 (HH:MM:SS) format and ranges from 00:00:00 to 23:59:59.

I want to bin the TimeOfCall attribute into 24 bins, each one representing an hourly slot (first bin 00:00:00 to 00:59:59 and so on).

Can someone talk me through how to do this? I tried using cut() but apparently my format is not numeric. Thanks in advance!

bnjmn
Palcente

2 Answers

3

While you could convert to a formal time representation, in this case it might be easier to just use substr:

test <- c("00:00:01","02:07:01","22:30:15")
as.numeric(substr(test,1,2))
#[1]  0  2 22

Using a POSIXct time to deal with it would also work, and might be handy if you plan on further calculations (differences in time etc):

testtime <- as.POSIXct(test,format="%H:%M:%S")
testtime
#[1] "2013-12-09 00:00:01 EST" "2013-12-09 02:07:01 EST" "2013-12-09 22:30:15 EST"
as.numeric(format(testtime,"%H"))
#[1]  0  2 22
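
To get from the extracted hour to the 24 bins asked for in the question, one option is to turn the hour into a factor with a level for every hour of the day and then tabulate it. This is only a minimal sketch: `time_of_call` is a made-up sample standing in for the real TimeOfCall column, and `hour_bin` and `counts` are illustrative names.

```r
# Hypothetical sample standing in for the TimeOfCall column
time_of_call <- c("00:00:01", "02:07:01", "22:30:15", "22:59:59")

# Hour of day as a number, 0 to 23
hour <- as.numeric(substr(time_of_call, 1, 2))

# One factor level per hourly slot, so hours with no calls still show up
hour_bin <- factor(hour, levels = 0:23)

# Number of calls in each of the 24 bins
counts <- table(hour_bin)
length(counts)
#[1] 24
counts[["22"]]
#[1] 2
```

Fixing the levels to 0:23 is what keeps empty hours in the table with a count of zero; without it, `table` would only list the hours that actually occur.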
thelatemail
  • this worked like a charm, thank you !! Second method is excellent! I'm sure I will be using it more often! Quick one, if I were to extract days of the week from date, could I use POSIXct as well? – Palcente Dec 09 '13 at 00:48
  • @Palcente - if you already have a Date variable (or a POSIXct / POSIXlt datetime), you can use `format` like `format(datevar,"%w")` where the result is 0-6, Sunday being 0. – thelatemail Dec 09 '13 at 01:02
  • Could you tell me what would be my POSIXct format if my date is as follows: 01-Jan-09... would it be format="%d-%b-%y" ? – Palcente Dec 09 '13 at 01:23
  • @Palcente - that's the right format, they are all listed in `?strptime`. These apply to all `Date` and `POSIXct/lt` formats. If you are dealing with dates without times, there's probably no real need to use `POSIXct`, `as.Date` will work fine. – thelatemail Dec 09 '13 at 01:27
  • Thank you for the answer, what if I want bins being 3 hour instead of 1? – Francis May 07 '15 at 03:56
  • @Fredom - you can use `cut` to bin the hours up - search `[r] bins cut` on this site for examples. – thelatemail May 07 '15 at 04:02
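
The two follow-up questions in the comments (3-hour bins, and day of the week from a date like 01-Jan-09) could be sketched roughly as below. The sample values and variable names are made up for illustration, and parsing `%b` assumes an English locale, since month abbreviations are locale-dependent.

```r
# Hypothetical hours of day, as produced by as.numeric(substr(x, 1, 2))
hour <- c(0, 2, 3, 11, 22, 23)

# Breaks every 3 hours; right = FALSE gives bins [0,3), [3,6), ..., [21,24)
bin3 <- cut(hour, breaks = seq(0, 24, by = 3), right = FALSE)
as.character(bin3)
#[1] "[0,3)"   "[0,3)"   "[3,6)"   "[9,12)"  "[21,24)" "[21,24)"

# Day of week from a date string (assumes an English locale for %b)
d <- as.Date("01-Jan-09", format = "%d-%b-%y")
format(d, "%w")   # 0-6, Sunday being 0
#[1] "4"
```

With `right = FALSE` the hour 3 lands in the second bin rather than the first, which matches the "00:00:00 to 02:59:59" style of slot.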
0

You can use the cut.POSIXt method, but you first need to coerce your data to a valid time object. Here I am using the handy hms function from lubridate, and strftime to get the time format back.

library(lubridate)
x <- c("09:10:01", "08:10:02",  "08:20:02","06:10:03 ", "Collided at 9:20:04 pm")
x.h <- strftime(cut(as.POSIXct(hms(x),origin=Sys.Date()),'hours'),
         format='%H:%M:%S')

data.frame(x,x.h)

                       x      x.h
1               09:10:01 10:00:00
2               08:10:02 09:00:00
3               08:20:02 09:00:00
4              06:10:03  07:00:00
5 Collided at 9:20:04 pm 22:00:00
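
For comparison, here is a base R sketch of the same `cut` idea that pins everything to UTC and passes explicit hourly breaks and labels, so each time is labelled with the start of its own hour regardless of the local timezone. It is an illustration under stated assumptions rather than a drop-in replacement: the date 1970-01-01 is an arbitrary placeholder, the input is assumed to be clean HH:MM:SS strings, and `tm`, `breaks` and `bin` are made-up names.

```r
x <- c("09:10:01", "08:10:02", "08:20:02", "06:10:03")

# Attach an arbitrary fixed date and parse in UTC to avoid local-offset shifts
tm <- as.POSIXct(paste("1970-01-01", x),
                 format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# 25 hourly break points delimit the 24 bins of one day
breaks <- seq(as.POSIXct("1970-01-01 00:00:00", tz = "UTC"),
              by = "hour", length.out = 25)

# cut() dispatches to cut.POSIXt; right = FALSE makes bins [hh:00:00, hh+1:00:00)
bin <- cut(tm, breaks = breaks,
           labels = sprintf("%02d:00:00", 0:23), right = FALSE)

data.frame(x, bin)
#         x      bin
#1 09:10:01 09:00:00
#2 08:10:02 08:00:00
#3 08:20:02 08:00:00
#4 06:10:03 06:00:00
```

Supplying the labels explicitly also means the result always has all 24 levels, which is convenient if the bins are tabulated afterwards.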
agstudy