In your particular example, let's break it down first to see why you are getting NA
values, and then generate a solution that creates your desired results.
> date <- c("1999.04.08", "1999.04.08")
> time <- c("11:00", "12:00")
> df <- data.frame(date, time, stringsAsFactors = F)
> df
date time
1 1999.04.08 11:00
2 1999.04.08 12:00
> str(df)
'data.frame': 2 obs. of 2 variables:
$ date: chr "1999.04.08" "1999.04.08"
$ time: chr "11:00" "12:00"
Don't forget to use str
to understand the data type(s) you are dealing with. That can and will greatly influence the answer to your question. Looking at the help description of function strptime
, we see the following definition:
strptime
converts character vectors to class "POSIXlt": its input x is first converted by as.character. Each input string is processed as far as necessary for the format specified: any trailing characters are ignored.
So, let's break down your code:
df1 <- within(data,
{ timestamp = strptime(paste(date, time),
"%Y/%m/%d%H:%M:%S")
})
First, the paste
function:
> paste(date[1], time[1])
[1] "1999.04.08 11:00"
This generates a character vector with the format above.
Next, the strptime
command.
> strptime(paste(date[1], time[1]), "%Y/%m/%d%H:%M:%S")
[1] NA
Okay, we see an NA. First, be sure to explicitly write format =
, if it reads as tedious, then you should not be having any problems writing flawless code that you will remember forever. Looking at the help code we see:
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
z <- strptime(x, "%d%b%Y")
> z
[1] "1960-01-01 PST" "1960-01-02 PST" "1960-03-31 PST" "1960-07-30 PDT"
Notice the help section also defines upper/lower case Y
, and the same with the month and date variables. In your case, you are trying to extract something of the following form: YYYY/mm/ddHH:MM:SS
, such as 2017/20/1111:28:30
. Do you see the issue now?
Using your string extraction attempt, we modify it slightly to get the format you are looking for:
> strptime(paste(date, time), format = "%Y.%m.%d %H:%M")
[1] "1999-04-08 11:00:00 PDT" "1999-04-08 12:00:00 PDT"
Putting it all together you get:
> df1 <- within(df, {timestamp = strptime(paste(date, time), format = "%Y.%m.%d %H:%M")})
> str(df1)
'data.frame': 2 obs. of 3 variables:
$ date : chr "1999.04.08" "1999.04.08"
$ time : chr "11:00" "12:00"
$ timestamp: POSIXlt, format: "1999-04-08 11:00:00" "1999-04-08 12:00:00"
> df1
date time timestamp
1 1999.04.08 11:00 1999-04-08 11:00:00
2 1999.04.08 12:00 1999-04-08 12:00:00
Oh yeah, and try out the dplyr
package.
library(dplyr)
> df %>%
mutate(ts = as.POSIXct(paste(date,time),
format = "%Y.%m.%d %H:%M"))
date time ts
1 1999.04.08 11:00 1999-04-08 11:00:00
2 1999.04.08 12:00 1999-04-08 12:00:00