
I have a tibble with lots of columns, and I don't want to convert them one by one. Let's say the tibble looks like this:

df <- tibble(
  x = c(1,0,1,1,'a'), 
  y = c('A', 'B', 1, 'D', 'A'), 
  z = c(1/3, 4, 5/7, 100, 3)
)

I want to convert the column types based on the values in another tibble:

df_map <- tibble(
  col = c('x','y','z'), 
  col_type = c('integer', 'string', 'float')
)

What's the most appropriate solution?

mihagazvoda

2 Answers


Try the following:

library(purrr)
map2_dfc(df, df_map$col_type, type.convert, as.is = TRUE)

This code assumes that df_map$col is in the same order as names(df) (thanks to @Moody_Mudskipper for pointing that out).

As @NelsonGon points out, the appropriate data types in R would be "integer", "character" and "double".
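
To avoid relying on column order, one option is to look up a converter function for each declared type and assign by column name. This is only a sketch under stated assumptions: the `converters` list and the reordered `df_map` are hypothetical and not part of the original code. It may also be worth checking how `type.convert()` handles its second positional argument; as far as I can tell it is matched to `na.strings`, so the declared type names would not be enforced there, whereas the sketch below applies them explicitly.

```r
library(tibble)

df <- tibble(
  x = c(1, 0, 1, 1, 'a'),
  y = c('A', 'B', 1, 'D', 'A'),
  z = c(1/3, 4, 5/7, 100, 3)
)

# Assumption: same mapping as in the question, deliberately reordered
# to show that the order of rows in df_map does not matter here
df_map <- tibble(
  col = c('z', 'x', 'y'),
  col_type = c('float', 'integer', 'string')
)

# Hypothetical lookup from the question's type names to base R converters
converters <- list(
  integer = as.integer,
  string  = as.character,
  float   = as.double
)

# Convert each column listed in df_map, matching by name
converted <- df
for (i in seq_len(nrow(df_map))) {
  nm <- df_map$col[i]
  convert <- converters[[df_map$col_type[i]]]
  # suppressWarnings() hides "NAs introduced by coercion" for values like 'a'
  converted[[nm]] <- suppressWarnings(convert(df[[nm]]))
}

converted
```

Values that cannot be converted, such as 'a' in `x`, come out as `NA`.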

Edit: to first recode the boolean columns stored as 1/2, as requested in the comments:

library(tidyverse)
df %>% 
  mutate_if(~identical(sort(unique(.)), c(1,2)), ~{. - 1}) %>% 
  map2_dfc(df_map$col_type, type.convert, as.is = TRUE)
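
In dplyr 1.0.0 and later, the scoped verbs such as `mutate_if()` are superseded by `across()`, so the recoding step could also be sketched as below. The example data `df_flags` and the helper `is_one_two()` are hypothetical names introduced for illustration. Note that the helper uses `setequal()` rather than `identical()`, which is an assumption about what is wanted: it also matches integer columns holding `1L` and `2L`, which `identical(..., c(1, 2))` would not.

```r
library(dplyr)
library(tibble)

# Hypothetical example data: `flag` stores a boolean as 1/2
df_flags <- tibble(
  flag  = c(1, 2, 2, 1),
  score = c(1, 2, 3, 4)
)

# Hypothetical helper: TRUE for numeric columns whose distinct values
# are exactly 1 and 2 (a column containing NA will not match)
is_one_two <- function(x) is.numeric(x) && setequal(unique(x), c(1, 2))

# Subtract 1 from every column that looks like a 1/2-coded boolean
recoded <- df_flags %>%
  mutate(across(where(is_one_two), ~ .x - 1))

recoded
```

Only `flag` is shifted to 0/1 here; `score` is left alone because it contains values other than 1 and 2.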
shs
  • Just want to point out that since `map2_dfc` is from the `purrr` package, you can just load that instead of everything that is loaded as the tidyverse – camille Aug 18 '19 at 16:12
  • @shs nice, thanks! Is it possible to also include user-defined functions here? I have some columns that store booleans as 1 and 2, and they should have 1 subtracted before the conversion. – mihagazvoda Aug 18 '19 at 17:48
  • You could modify those variables beforehand: `dplyr::mutate_if(df, ~identical(sort(unique(.)), c(1,2)), ~{. - 1})` – shs Aug 18 '19 at 18:10
  • `df_map$col` is not used, so this assumes `df`'s columns are in the same order as `df_map$col` – moodymudskipper Aug 19 '19 at 13:54
  • That is true, my code only works properly if `df_map$col` is in the same order as `names(df)`. I will make an edit to reflect that – shs Aug 19 '19 at 14:08

I would use the readr package for this task; it's part of the tidyverse.

suppressPackageStartupMessages(library(tidyverse))

# rework your col types to be compatible with ?readr::cols
df_map$col_type <- recode(df_map$col_type, integer = "i", float = "d" , string = "c")

# make a vector out of df_map
vec_map <- deframe(df_map)
vec_map
#>   x   y   z 
#> "i" "c" "d"

# convert according to your specs
type_convert(df, exec(cols, !!!vec_map))
#> Warning in type_convert_col(char_cols[[i]], specs$cols[[i]],
#> which(is_character)[i], : [4, 1]: expected an integer, but got 'a'
#> # A tibble: 5 x 3
#>       x y           z
#>   <int> <chr>   <dbl>
#> 1     1 A       0.333
#> 2     0 B       4    
#> 3     1 1       0.714
#> 4     1 D     100    
#> 5    NA A       3
moodymudskipper