I am importing an excel file from the Robert Koch Institute into R using readxl::read_excel
. However, I am experiencing problems with a row that contains only dates. The date format is DD.MM.YYYY, but in the resulting data frame, they appear as seemingly arbitrary numeric values. Here is the code that I am using:
library(readxl)
df <- read_excel("path/to/Fallzahlen_Kum_Tab_Archiv.xlsx",
sheet = "BL_7-Tage-Fallzahlen (fixiert)")
head(df)
which returns the following output:
# A tibble: 6 x 494
`Fallzahlen in d~ ...2 ...3 ...4 ...5 ...6 ...7 ...8 ...9 ...10 ...11 ...12 ...13 ...14 ...15
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
3 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
4 NA 43957 43958 43959 43960 43961 43962 43963 43964 43965 43966 43967 43968 43969 43970
5 Baden-Württemberg 908 779 665 706 772 760 774 744 765 725 622 551 549 483
6 Bayern 1318 1246 1182 1223 1148 1151 1198 1194 1115 1022 1022 997 982 921
# ... with 479 more variables: ...16 <dbl>, ...
In row number 4, you can see the dates that are clearly not imported correctly. I already tried col_types="text"
and col_types="list"
, as this suggested in other posts, but this doesn't seem to change anything.
The excel file can be downloaded in the "Archiv" section of RKI's webpage and this should be the direct download link.