I'm looking for an R function that does type conversion for a data frame, taking the list of desired column types as an argument, i.e. something like

df = data.frame(x="1", y="2")
df2 = convert(df, col.types=c("integer", "character"))

where the columns of df2 would have the specified types. I played with the type_convert function from the tidyverse (readr package), which is close to what I want, but not quite it. For example, it seems to require column names:

df2 = type_convert(df, col_types=cols(x=col_integer(), y=col_character()))

which I can't provide in advance (without column names it doesn't throw an error, but it doesn't do what I want either). Also, I would like to specify column types as a character vector (like the read.csv functions do) rather than in the cumbersome way type_convert does it, with col_ functions.

My end goal is to extend read.csv functionality by parsing files that aren't rectangular, but consist of rectangular blocks, e.g. something like

2019-10-20 13:09:10 x: 1 S 16 y: 10 25 35 600 final

in a text file should be read into a data frame with columns

t="2019-10-20 13:09:10", x1=1, x2="S", x3=16, y1=10, y2=25, y3=35, y4=600, y5="final"

Column names and types will be given at run time. I just don't want them hardcoded, because then I'd end up with dozens of distinct R functions, one for each distinct format.

If there is code that already does that, please let me know.

Thanks!

Best regards, Nikolai

Nikolai

1 Answer

There is a simple solution in the hablar package:

library(hablar)
library(dplyr)
df <- data.frame(x="1", y="2", z = "4")

df %>% 
  convert(int(x, z),
          chr(y))

You can list several column names inside one call to convert them all at once, e.g. x and z to integer in the example above.
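
Since the question asks for types given as a character vector matched to columns by position rather than by name, here is a minimal base R sketch of that idea. The helper name convert_by_position is hypothetical (it is not part of hablar, readr, or base R), and it assumes every type name has a matching base coercion function such as as.integer or as.character.

```r
# Hypothetical helper (not from any package): convert columns by position,
# with the target types given as a character vector.
convert_by_position <- function(df, col.types) {
  # Assumption: exactly one type per column, in column order
  stopifnot(length(col.types) == ncol(df))
  for (i in seq_along(col.types)) {
    # Look up the base coercion function, e.g. "integer" -> as.integer
    coerce <- match.fun(paste0("as.", col.types[i]))
    # Go through character first so factor columns convert by label,
    # not by their internal level codes
    df[[i]] <- coerce(as.character(df[[i]]))
  }
  df
}

df <- data.frame(x = "1", y = "2", z = "4")
df2 <- convert_by_position(df, c("integer", "character", "numeric"))
str(df2)
```

Values that cannot be coerced become NA with a warning, as with the underlying as.* functions, so a stricter version might want to check for newly introduced NAs.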

davsjob