
I have 451 dates in the format YYYY-MM-DD (for example "2002-06-18") in the spreadsheet program LibreOffice Calc. I would like to transfer these dates into R as a column with the name "Date_Sale".

First I copied this column of dates to a text file. Then I read this text file into R with the command

Date_Sale <- read.csv("Date_Sale.txt", header=FALSE,stringsAsFactors=FALSE)

> str(Date_Sale)
'data.frame':   451 obs. of  1 variable:
 $ V1: chr  "2002-06-18" "2002-05-22" "2002-05-23" "2002-10-23" ...

The output of str above shows that the data was read into R as a data frame with one column of type chr (character). Next I tried the command

Date_Sale <- strptime(Date_Sale, "%Y-%m-%d")

This produces the error message

Error in strptime(Date_Sale, "%Y-%m-%d") : 
  input string is too long

If I pass a single date string to the same command, it works.

firstday <- strptime("2002-06-18", "%Y-%m-%d")
[1] "2002-06-18 CEST"
r2evans
fjeu3
  • `Date_Sale` is a `data.frame`; your variable is its column `V1`. Try `Date_Sale$V1 <- strptime(Date_Sale$V1, "%Y-%m-%d")` – utubun Aug 05 '19 at 19:54
  • I tried your code and it seems to work: `> str(Date_Sale)` gives `'data.frame': 451 obs. of 1 variable: $ V1: POSIXlt, format: "2002-06-18" "2002-05-22" "2002-05-23" "2002-10-23" ...` – fjeu3 Aug 06 '19 at 06:20
  • Why does the command read.csv read one vector of 451 elements as a data frame? Is there a simpler way? – fjeu3 Aug 06 '19 at 06:40
  • Because `read.csv`, or more generally `read.table`, "... is the principal means of reading tabular data into R...", and your data is in tabular format, which is what the '.csv' format is designed for. If you simply want to read vectors, see `?scan`; it explains how to write and read non-tabular data. – utubun Aug 06 '19 at 06:57
  • `> is.ts(Date_Sale$V1)` returns `[1] FALSE`. Why is this variable not a time series? – fjeu3 Aug 06 '19 at 07:09
  • When I print this vector, the values look correct, including the first element. But `range(Date_PL$V1)` returns `[1] "1930-06-18 CET" "2013-07-08 CEST"`: the first value is incorrect, and the time zones switch between CET and CEST. – fjeu3 Aug 06 '19 at 07:20
  • If I try scan, it reports `Error in scan() : scan() expected 'a real', got '2002-06-18'`. What does "a real" mean? – fjeu3 Aug 06 '19 at 07:44
  • It is hard to answer your questions without your data. That is why [this post](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) got so many upvotes. Regarding the `scan()` function: specify the *class* of your data with the `what` argument, e.g. `scan('yourfile.txt', what = 'character')`. You should probably also specify the `sep` argument if your dates are separated by e.g. `,` or `\n`. – utubun Aug 06 '19 at 08:04
  • Yes, and it throws an error because by default it expects the file to contain *real numbers*, which R stores in *double-precision format*: the default value of the `what` argument of `scan()` is `double()`. Changing this to `character` should resolve the problem, e.g. `scan(text = "2019-05-08 2018-08-09", what = 'character')` and `scan(text = "2019-05-08,2018-08-09", what = 'character', sep = ',')` work fine for me (remember to use `file = 'yourfile.txt'` with the path to your file instead of `text = ` when working with your data). – utubun Aug 06 '19 at 08:20
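
The points raised in the comments above can be sketched in base R. This is a minimal, hedged example rather than the asker's actual workflow: it writes the four dates visible in the question to a temporary file (a stand-in for `Date_Sale.txt`, whose full contents are not shown), then converts them once via the data frame column and once via a plain character vector read with `scan()`. Using `as.Date()` instead of `strptime()` is a choice made here because the values are calendar dates without a time of day.

```r
# Stand-in for Date_Sale.txt: a temporary file with one date per line
# (only the four dates shown in the question; the real file has 451).
tmp <- tempfile(fileext = ".txt")
writeLines(c("2002-06-18", "2002-05-22", "2002-05-23", "2002-10-23"), tmp)

# Option 1: read.csv() always returns a data frame, so convert the
# column V1 rather than the data frame itself.
sales <- read.csv(tmp, header = FALSE, stringsAsFactors = FALSE)
sales$V1 <- as.Date(sales$V1, format = "%Y-%m-%d")

# Option 2: scan() returns a plain vector. `what` must be character,
# because the default is double() (the "real" in the error message).
date_chr <- scan(tmp, what = "character", quiet = TRUE)
Date_Sale <- as.Date(date_chr, format = "%Y-%m-%d")

str(Date_Sale)    # a Date vector, not a data frame
range(Date_Sale)  # earliest and latest date

unlink(tmp)       # remove the temporary file
```

With the real data, `tmp` would be replaced by the path to `Date_Sale.txt`.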

1 Answer


Here is one approach:

library(tidyverse)
df <- tribble(~my_date, 
              "2002-06-18",
              "2002-05-22",
              "2002-05-23",
              "2002-10-23")


df %>% 
  mutate(my_date = lubridate::ymd(my_date))

or

df %>% 
   mutate(my_date = as.Date(my_date, format = '%Y-%m-%d'))

Be careful with time zones when converting data. strptime returns date-times and uses your current time zone by default, so the result may be in summer time (daylight saving time). See ?strptime
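
To illustrate the time zone remark, here is a small base R sketch (no packages needed). The two example dates are assumptions chosen so that one falls in summer and one in winter; they are not taken from the asker's file. It contrasts `strptime()` with its default time zone, `strptime()` with an explicit `tz = "UTC"`, and `as.Date()`, which carries no time zone at all.

```r
# Two illustrative dates (assumed, not from the original data):
# one in summer, one in winter.
x <- c("2002-06-18", "2002-12-18")

# strptime() returns POSIXlt date-times. With the default tz = "" they are
# interpreted in the current time zone, so the printed zone abbreviation
# depends on the machine and may differ between summer and winter dates.
lt_local <- strptime(x, "%Y-%m-%d")

# Passing the time zone explicitly makes the result independent of the
# machine's locale settings.
lt_utc <- strptime(x, "%Y-%m-%d", tz = "UTC")

# as.Date() returns calendar dates with no time of day and no time zone.
d <- as.Date(x, format = "%Y-%m-%d")

class(lt_local)                 # POSIXlt date-time class
format(lt_utc, "%Y-%m-%d %Z")   # zone abbreviation is the same for both
class(d)                        # Date class
diff(d)                         # difference in whole days
```

For plain calendar dates such as sale dates, the `Date` class is usually the simpler choice, since date-time classes only add value when a time of day matters.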

Tony Ladson
  • Thank you for your help. Unfortunately I cannot install the tidyverse package: `ERROR: dependencies ‘xml2’, ‘httr’ are not available for package ‘rvest’ * removing ‘/home/paul/R/x86_64-pc-linux-gnu-library/3.4/rvest’ Warning in install.packages : installation of package ‘rvest’ had non-zero exit status ERROR: dependencies ‘httr’, ‘rvest’, ‘xml2’ are not available for package ‘tidyverse’` – fjeu3 Aug 06 '19 at 07:38
  • Use `install.packages('tidyverse', dependencies = TRUE)`; it will install the packages `tidyverse` depends on as well as `tidyverse` itself. – utubun Aug 06 '19 at 08:08
  • Try installing the packages individually, e.g. `install.packages('rvest')` – Tony Ladson Aug 07 '19 at 21:57
  • What does `~my_date` mean? – fjeu3 Aug 08 '19 at 07:00
  • It is the name of the column. I just called it `my_date`. In your data it was `V1`. – Tony Ladson Aug 08 '19 at 23:30