My dataframe has this structure:

 str(marc)
 $ Data   : Date, format: "2015-10-31" "2015-10-31" "2015-10-31" ...
 $ Hora   :Class 'times'  atomic [1:351] 0.792 0.792 0.792 0.792 0.5 ...
 .. ..- attr(*, "format")= chr "h:m:s"

I am trying to create a new column joining Data and Hora:

marc$Timestamp=as.POSIXct(paste(marc$Data, marc$Hora), format = "%Y-%m-%d %H:%M:%S")

But as.POSIXct is returning NAs.

$ Timestamp: POSIXct, format: NA NA NA ...

I used the same process to create a Timestamp in another dataframe and it worked. What am I doing wrong this time? Thank you very much!

> dput(marc$Hora)
structure(c(0.791666666666667, 0.791666666666667, 0.791666666666667, 
0.791666666666667, 0.5, 0.833333333333333, 0.833333333333333, 
0.833333333333333, 0.708333333333333, 0.833333333333333, 0.708333333333333, 
0.708333333333333, 0.604166666666667, 0.604166666666667, 0.604166666666667, 
0.708333333333333, 0.8125, 0.75, 0.541666666666667, 0.75, 0.541666666666667, 
0.541666666666667, 0.541666666666667, 0.8125, 0.8125, 0.520833333333333, 
0.8125, 0.8875, 0.9375, 0.9375, 0.9375, 0.8875, 0.895833333333333, 
...
 format = "h:m:s", class = "times")

Before using as.POSIXct, I ran:

marc$Hora=times(marc$Hora)

Hora should be displayed as H:M:S, but it did not change.

  • `Hora` has a different `class`. It is not in the `HMS` format. Can you post a small example with `dput` by editing the post? Also check https://stackoverflow.com/questions/14483629/how-convert-decimal-to-posix-time – akrun Nov 03 '18 at 20:47
  • Thanks @akrun! I edited my post. I think it is some problem with the time format! – Fernanda Silva Nov 03 '18 at 21:09
  • Your dput is incomplete, but I copied the values you posted. Did you use a particular library to create this? I am not getting any NA: `as.POSIXct(paste(Sys.Date(), v1))[1:5] [1] "2018-11-03 19:00:00 EDT" "2018-11-03 19:00:00 EDT" "2018-11-03 19:00:00 EDT" "2018-11-03 19:00:00 EDT" [5] "2018-11-03 12:00:00 EDT"`. In my case, I don't have the decimals. It is in the format `v1[1:5]# [1] 19:00:00 19:00:00 19:00:00 19:00:00 12:00:00` – akrun Nov 03 '18 at 21:14
  • I would use `dput(head(marc$Hora))` for a small example – akrun Nov 03 '18 at 21:15
  • Yes, @akrun. I imported an Excel file with about 8000000 rows. That is why my dput is incomplete. – Fernanda Silva Nov 03 '18 at 21:21
  • As I said, `dput(head(marc$Hora))` subsets the data and gets the structure for that subset. `dput` gives the exact structure you have, so others can test it. An incomplete dput limits that, and others will lose some attributes in the structure. BTW, what package did you use other than `base R`? – akrun Nov 03 '18 at 21:23
  • `dput(head(marc$Hora))` gives `structure(c(0.791666666666667, 0.791666666666667, 0.791666666666667, 0.791666666666667, 0.5, 0.833333333333333), format = "h:m:s", class = "times")`. Sorry if this is a stupid question; I am just starting to use R. I imported my data using read.xlsx and used the function "times" from Chron to try to get the time (Hora) in H:M:S, but it remained a decimal – Fernanda Silva Nov 03 '18 at 21:38
  • If I assign your dput to `v1`, I get `v1# [1] 19:00:00 19:00:00 19:00:00 19:00:00 12:00:00 20:00:00`. Are you getting this format? Converting it to numeric gives `as.numeric(v1)# [1] 0.7916667 0.7916667 0.7916667 0.7916667 0.5000000 0.8333333` – akrun Nov 03 '18 at 21:41
  • I can't reproduce the issue: `as.POSIXct(paste(Sys.Date(), v1))# [1] "2018-11-03 19:00:00 EDT" "2018-11-03 19:00:00 EDT" "2018-11-03 19:00:00 EDT" "2018-11-03 19:00:00 EDT" [5] "2018-11-03 12:00:00 EDT" "2018-11-03 20:00:00 EDT"` – akrun Nov 03 '18 at 21:43
  • It's very strange. I do not have the HMS format. I also tried assigning my dput to v1 and got [1] 0.7916667 0.7916667 0.7916667 0.7916667 0.5000000 0.8333333. If I use times(v1), it returns [1] 19:00:00 19:00:00 19:00:00 19:00:00 12:00:00 20:00:00. But if I use times on marc$Hora in my dataframe, it is still decimal and doesn't change to hms. – Fernanda Silva Nov 03 '18 at 22:14
  • @akrun, thank you very much for spending your time helping! I found that there was a problem with Hora in the 247th row. It is fixed now! – Fernanda Silva Nov 03 '18 at 23:30

1 Answer

I thought I recognized that class and format as coming from package chron (not "Chron"):

library(chron)
 Hora

# [1] 19:00:00 19:00:00 19:00:00 19:00:00 12:00:00 20:00:00 20:00:00 20:00:00 17:00:00
#[10] 20:00:00 17:00:00 17:00:00 14:30:00 14:30:00 14:30:00 17:00:00 19:30:00 18:00:00
#[19] 13:00:00 18:00:00 13:00:00 13:00:00 13:00:00 19:30:00 19:30:00 12:30:00 19:30:00
#[28] 21:18:00 22:30:00 22:30:00 22:30:00 21:18:00 21:30:00

So these are times that fall on whole minutes, mostly on the hour or half hour. If you build an example using the str information with that knowledge:

 # as.Date() takes one vector; extra positional arguments would be read as the format
 marc <- data.frame( Data = as.Date(c("2015-10-31", "2015-10-31", "2015-10-31")),
 Hora = structure(c(0.792, 0.792, 0.792), format = "h:m:s", class = "times"))
 marc
#----------------
        Data     Hora
1 2015-10-31 19:00:29
2 2015-10-31 19:00:29
3 2015-10-31 19:00:29

The seconds are off a bit by virtue of rounding. At any rate you should coerce the Hora values to character before paste-ing:

as.POSIXct( paste(marc$Data,  as.character(marc$Hora) ) )
#[1] "2015-10-31 19:00:29 PDT" "2015-10-31 19:00:29 PDT" "2015-10-31 19:00:29 PDT"
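
The rounding noted above can be checked by hand. A `times` value is stored as a fraction of a day, so multiplying by 86400 gives seconds since midnight. The following is a minimal base R sketch of that arithmetic, using the rounded value shown by `str` (0.792) and the stored value shown by `dput` (0.791666666666667). The helper `to_hms` is hypothetical and written only for this illustration; it is not part of chron.

```r
# Base R only; the two values come from the question's str() and dput() output.
frac_rounded <- 0.792              # as displayed by str()
frac_full    <- 0.791666666666667  # as stored, per dput()

# A fraction of a day times 86400 gives seconds since midnight.
secs_rounded <- frac_rounded * 86400   # 68428.8 seconds
secs_full    <- frac_full * 86400      # 68400 seconds, up to floating-point error

# Hypothetical helper: format seconds since midnight as HH:MM:SS.
to_hms <- function(secs) {
  secs <- round(secs)
  sprintf("%02d:%02d:%02d", secs %/% 3600, (secs %% 3600) %/% 60, secs %% 60)
}

to_hms(secs_rounded)  # "19:00:29"
to_hms(secs_full)     # "19:00:00"
```

So the extra 29 seconds come from building the example out of the three-digit `str` display rather than from the data itself.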
IRTFM
  • Thank you very much, @42-, but there was a problem with my raw data, on the 247th row. Because of this, all the functions worked with the vector I provided, but not in my dataframe. It is fixed now. – Fernanda Silva Nov 04 '18 at 15:54
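
Since the cause turned out to be a single bad row in the raw data, one way to locate such rows is to parse the pasted strings and ask which results are `NA`. This is a minimal sketch on made-up data; the thread does not say what was actually wrong with row 247, so the malformed value below is purely hypothetical, and `tz = "UTC"` is an assumption made to keep the result independent of the local time zone.

```r
# Hypothetical data: three date-time strings, the second one malformed.
stamps <- c("2015-10-31 19:00:00", "2015-10-31 not-a-time", "2015-10-31 12:00:00")

# With an explicit format, elements that do not match are returned as NA.
parsed <- as.POSIXct(stamps, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Row numbers that failed to parse, and the offending raw values.
bad_rows <- which(is.na(parsed))
bad_rows          # 2
stamps[bad_rows]  # "2015-10-31 not-a-time"
```

On a real dataframe the same idea would apply to `paste(marc$Data, marc$Hora)`, followed by inspecting `marc[bad_rows, ]`.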