0
  • I am importing an excel csv file which has a large number of columns. Each column is for a different date. e.g. March 1990, April 1990.
  • When I import the column headers are being changed to numbers, for example, 34355, 34356.
  • How do I preserve the dates?

I tried using the r studio import function

sales <- read_csv("W:/Sales_data/sales.csv")

Expected

First_Name Sir_name Region Jan_1980 Feb_1980 Mar_1980

George      Dell    LA      52        23      121
Lisa        Stevens NY                234     122
Peter       Hunt    TX      3242      12      123

Actual

First_Name Sir_name Region  34524    34525    34526

George      Dell    LA      52        23      121
Lisa        Stevens NY                234     122
Peter       Hunt    TX      3242      12      123

Any help is greatly appreciated.

M--
  • 25,431
  • 8
  • 61
  • 93
CISCO
  • 539
  • 1
  • 4
  • 14
  • 2
    you might want to look here: https://stackoverflow.com/questions/43230470/how-to-convert-excel-date-format-to-proper-date-with-lubridate The "problem" is on the excel side. – Justin In Oz Jun 17 '19 at 01:32
  • This kind of problems are almost always on the excel side :( – PavoDive Jun 17 '19 at 01:47
  • 1
    @PavoDive Well, excel has its own issues but it's very much expected to do some data cleaning when importing a dataset. – M-- Jun 17 '19 at 01:50
  • How is this happening in the first place? A csv file is just plain text. Typing "Jan_1980" into Excel also doesn't convert to a date from my testing. Even if you enter "Jan-1980" in Excel and make it auto-convert to a date, saving to a csv file will still just keep this as plain text "Jan-80" in the output text file. `read_csv` will then deal with this properly too and import the header correctly. – thelatemail Jun 17 '19 at 05:38

1 Answers1

1

You need to import the first as data and not headers. Then, change the format to match your desired. Finally, assign the first row as column names and remove it next.

library(readr)
sales <- read_csv("W:/Sales_data/sales.csv", 
    col_names = FALSE)

sales[1,4:6] <- format(as.Date(sales[1,4:6], origin = "1899-12-30"), "%b_%Y")

colnames(sales) <- sales[1,]

sales <- sales[-1,]
M--
  • 25,431
  • 8
  • 61
  • 93