I have a CSV file with several timestamps. When I import the file into RStudio, part of one column comes in as strings like "2022/12/1 11:07", although the original data includes seconds, i.e. "2022/12/1 11:07**:00**".

Additionally, other data in the same dataset are imported without this truncation.

This inconsistency complicates my task because I plan to use the lubridate::ymd_hms function on this data. I've also tried importing the data with read.csv, which made no difference.

Does anyone know how to avoid this phenomenon?

Thanks.

Here is the dput output of the original CSV (copied from the clipboard). This is from the column that was imported with seconds.

structure(list(V1 = c("2022/9/8", "2022/9/8", "2022/9/8"), V2 = c("12:57", 
"13:00", "13:30")), class = "data.frame", row.names = c(NA, -3L
))

The data below are from the column that drops seconds when imported with read_csv.

structure(list(V1 = c("2022/9/8", "2022/9/8", "2022/9/8"), V2 = c("12:57", 
"12:57", "12:57"), V3 = c("2022/9/9", "2022/9/9", "2022/9/9"), 
    V4 = c("10:35", "10:35", "10:35")), class = "data.frame", row.names = c(NA, 
-3L))

(The date and time seem to have been split into separate cells because of the space.) These two kinds of data show different formats after importing into RStudio. Although the values appear to lack seconds, the formula bar in Excel shows that they do contain seconds (all are :00).
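
One way to check whether the seconds are really stored in the file, rather than only shown by Excel, is to look at the raw text before any parser touches it. Below is a minimal sketch in base R. The temporary file and the column name `stamp` are hypothetical stand-ins for the real CSV, so adjust both to your data.

```r
# Hypothetical stand-in for the real file: one row with seconds, one without.
path <- tempfile(fileext = ".csv")
writeLines(c("stamp", "2022/12/1 11:07:00", "2022/12/1 11:07"), path)

# Raw text exactly as stored on disk, with no type guessing involved.
raw_lines <- readLines(path)
print(raw_lines)

# Read every column as character so the importer cannot reinterpret values.
df <- read.csv(path, colClasses = "character")

# Flag rows whose value ends in an H:M:S time (i.e. seconds are present).
has_seconds <- grepl("[0-9]{1,2}:[0-9]{2}:[0-9]{2}$", df$stamp)
print(has_seconds)
```

If `has_seconds` is FALSE for some rows, the seconds are missing from the file itself, and the import function is not what drops them.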

Isaiah
KintensT
    How are you defining the column type? Are you sure those data are missing and it's not the way the data are being printed? – Santiago Dec 01 '22 at 02:31
  • 2
    We cannot really help given what we have here: no code, an iota of data, but nothing really reproducible. Please make this question *reproducible*. This includes sample code you've attempted (including listing non-base R packages, and any errors/warnings received), sample *unambiguous* data (e.g., `data.frame(x=...,y=...)` or the output from `dput(head(x))`), and intended output given that input. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. – r2evans Dec 01 '22 at 02:45
  • It would be helpful if you could provide some examples; are they all formatted exactly like `2022/12/1 11:07**:00**`? If so, you might precede the `ymd_hms` with a step to remove the `**`'s. – Jon Spring Dec 01 '22 at 03:14
  • Sorry for asking without example data. However, I can confirm that the trailing ":00" is present in the original CSV file when opened in Excel. How should I describe the format of the problem column? – KintensT Dec 01 '22 at 03:37
  • 1
    If your data only has minutes, use `lubridate::ymd_hm` and that will create a datetime with seconds (all :00 of course). – Jon Spring Dec 01 '22 at 03:47
  • 1
    You are allowed and encouraged to edit your question to be clearer, e.g. by including sample data. – Jon Spring Dec 01 '22 at 03:49
  • 1
    I edited the question to add dput output, copied from the clipboard, for part of the original dataset. – KintensT Dec 01 '22 at 05:33
  • @JonSpring I was able to move forward by using `ymd_hm` for the values without seconds. Thank you. – KintensT Dec 01 '22 at 05:44

1 Answer

If your data as loaded has only hours and minutes, you can use lubridate::ymd_hm to make a datetime column (the seconds will be zero in every row).

df1$timestamp1 = lubridate::ymd_hm(paste(df1$V1, df1$V2))


df1
#        V1    V2       V3    V4          timestamp1
#1 2022/9/8 12:57 2022/9/9 10:35 2022-09-08 12:57:00
#2 2022/9/8 12:57 2022/9/9 10:35 2022-09-08 12:57:00
#3 2022/9/8 12:57 2022/9/9 10:35 2022-09-08 12:57:00
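
If a single column mixes values with and without seconds, as the question describes, one possible approach (a sketch, not taken from the question's data) is `lubridate::parse_date_time` with more than one candidate order. The vector `stamps` below is made-up illustration data, and the UTC time zone is an assumption.

```r
library(lubridate)

# Made-up example: the same timestamp, once with seconds and once without.
stamps <- c("2022/12/1 11:07:00", "2022/12/1 11:07")

# Try the H:M:S order first and fall back to H:M for values lacking seconds.
parsed <- parse_date_time(stamps, orders = c("ymd HMS", "ymd HM"), tz = "UTC")
print(parsed)
```

Values that match neither order come back as NA with a warning, so it is worth checking `anyNA(parsed)` afterwards.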
Jon Spring