Erase minutes and seconds from character dates in R

Question

I have this timestamp vector:

c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
"01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
"01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
"01/09/2019 10:52:20")

I'd like to strip off the minutes and seconds from the character vector so that I just have 01/09/2019 9 and 01/09/2019 10

What is the most efficient method to do so?

Interesting accepted answer, what is your definition of efficient? — Hector Haffenden, Mar 22 '19 at 12:53
ah I see, I don’t blame you :) next time you should probably put it in the question... — Hector Haffenden, Mar 24 '19 at 19:07

hmhensen · Answer 1 · 2019-03-21T01:32:11.673

Here's one.

datevec <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
      "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
      "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
      "01/09/2019 10:52:20")

format(as.POSIXct(datevec, format = "%d/%m/%Y %H:%M:%OS"), "%d/%m/%Y %H")

# Result
 [1] "01/09/2019 09" "01/09/2019 09" "01/09/2019 09" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10"
 [7] "01/09/2019 10" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10"

OzanStats · Answer 2 · 2019-03-21T01:30:34.290

2

What is your desired output class? How about this one:

v <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
  "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
  "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
  "01/09/2019 10:52:20")


strptime(v, "%m/%d/%Y %H")

edited Mar 21 '19 at 01:30

answered Mar 21 '19 at 01:13

OzanStats

2,756
1
13
26

Hector Haffenden · Answer 3 · 2019-03-22T12:59:58.933

This seems nice,

unlist(strsplit(mystring, split = ":", fixed=TRUE))[c(TRUE, FALSE,FALSE)]

(Made with help from here)

Alternative could be,

sapply(strsplit(mystring, split=':', fixed=TRUE), `[`, 1)

Using some benchmarks and recent comments by Ronak, that fixed=TRUE makes the methods a lot faster, we see that method four (the above method) is fastest,

mystring <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
              "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
              "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
              "01/09/2019 10:52:20")

microbenchmark(one = sapply(strsplit(mystring, split=':', fixed=TRUE), `[`, 1),
           two = unlist(lapply(mystring,function(x) strsplit(x,":", fixed=TRUE)[[1]][1])),
           three = strptime(mystring, "%m/%d/%Y %H"),
           four = unlist(strsplit(mystring, split = ":", fixed=TRUE))[c(TRUE, FALSE,FALSE)],
           five = format(as.POSIXct(mystring, format = "%d/%m/%Y %H:%M:%OS"), "%d/%m/%Y %H"), 
           six = gsub("(.*?):.*", "\\1", mystring),
           seven = str_extract(mystring, ".+(?=:.+:)"),
           times = 100000)



    Unit: microseconds
  expr     min      lq      mean  median       uq        max neval
   one  42.792  49.471  85.63742  52.572  57.1310  669280.96 1e+05
   two  64.637  70.618 114.16364  73.252  77.6840  582466.94 1e+05
 three 129.456 134.771 156.82308 136.188 139.2030  339715.94 1e+05
  four  12.860  15.641  22.75699  17.254  18.5440  305703.52 1e+05
  five 482.888 505.647 633.15388 512.880 552.1155  551274.28 1e+05
   six  37.889  43.121  52.79030  45.567  49.1880   32954.59 1e+05
 seven  53.432  59.051  88.05015  62.326  69.9320 1180361.17 1e+05

Very interesting, actually 4, with fixed=TRUE is fastest, changed the benchmark to show this. — Hector Haffenden, Mar 21 '19 at 01:43

score 0 · Answer 4 · answered Mar 21 '19 at 01:16

Another one:

dates <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
                  "01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
                  "01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
                  "01/09/2019 10:52:20")
unlist(lapply(dates,function(x) strsplit(x,":")[[1]][1]))

gives

 [1] "01/09/2019 9"  "01/09/2019 9"  "01/09/2019 9"  "01/09/2019 10" "01/09/2019 10"
 [6] "01/09/2019 10" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10"

score 0 · Answer 5 · answered Mar 21 '19 at 01:56

0

Here is another one using gsub

Capture the pattern by () and \\1 to refer to the captured group, need ?to make regex lazy as there are multiple :.

gsub("(.*?):.*", "\\1", dates)

answered Mar 21 '19 at 01:56

MKa

2,248
16
22

score 0 · Accepted Answer · answered Mar 21 '19 at 17:06

You could also use str_extract from stringr:

date_strings <- c("01/09/2019 9:51:03", "01/09/2019 9:51:39", "01/09/2019 9:57:04", 
"01/09/2019 10:01:41", "01/09/2019 10:06:06", "01/09/2019 10:09:36", 
"01/09/2019 10:11:55", "01/09/2019 10:21:15", "01/09/2019 10:21:39", 
"01/09/2019 10:52:20")

str_extract(date_strings, ".+(?=:.+:)")

 [1] "01/09/2019 9"  "01/09/2019 9"  "01/09/2019 9"  "01/09/2019 10"
 [5] "01/09/2019 10" "01/09/2019 10" "01/09/2019 10" "01/09/2019 10"
 [9] "01/09/2019 10" "01/09/2019 10"

Erase minutes and seconds from character dates in R

6 Answers6