2

I am importing an Excel file into R, where the date format in Excel is "27-02-2012". However, once I import the dataset into R with the code below:

#Loading packages
library(tidyverse)
library(readxl)
library(writexl)
library(stringr)
library(textclean)
library(lubridate)
library(zoo)

import data

data_corpus <- read_excel("data.xlsx",
                                   sheet= "xyz")

The date format in some rows stays as "27-02-2012", while other rows look as follows "40911".

Is it possible to convert all values under the "date" column to have the following format: "27-02-2012"?

Here is a data exmaple:

sapply(data_corpus, class)

output:

   post                      date                  
  "character"               "character" 

I have tried the following code, but it turns all values in "date" into NAs:

data_corpus$date <- as_date(data_corpus$date)

Sample:

data_corpus$post[2]
[1] this is really unfortunateا"
> data_corpus$date[2]
[1] "27-02-2012"
nesta13
  • 95
  • 6
  • 1
    Potential duplicate here https://stackoverflow.com/questions/43230470/how-to-convert-excel-date-format-to-proper-date-in-r – Harrison Jones Apr 08 '22 at 20:33
  • 1
    It looks like the "date" column in Excel is a mixture of dates and character strings representing dates. It might be easier to edit the Excel file to ensure the whole column is in the same format. In R you will have to separate the numbers from the strings and follow the link in the previous comment to covert the number and read the documentation for `as.Date()` & `strptime()` functions. – Dave2e Apr 09 '22 at 04:26

1 Answers1

3

Try using the col_types parameter

data_corpus <- read_excel("data.xlsx", sheet= "xyz", col_types = c("text", "date"))
Marcus
  • 3,478
  • 1
  • 7
  • 16