I need to union a result set into this data.frame so that all the columns are in place even if the result set does not contain them. The data will be written to a MySQL DB later on.

library(dplyr) # provides %>% and glimpse()
dbNames <- c('a','b','c','d')
emptyTableOut <- data.frame(
  cbind(
    matrix(character(), ncol = 1, nrow = 0), # needs to be char
    matrix(integer(), ncol = 3, nrow = 0) # needs to be int
  ), stringsAsFactors = FALSE) %>% 
  setNames(nm = c(dbNames))
> glimpse(emptyTableOut)
Observations: 0
Variables: 4
$ a <chr> 
$ b <chr> 
$ c <chr> 
$ d <chr> 

How can I do this in a way that doesn't coerce the ints to chars?

This question differs from the ones that already have answers because I have a huge number of columns, not the few implied by this minimal reproducible example.

d8aninja
  • `data.frame(a = character(0),b = integer(0),c = integer(0),d = integer(0),stringsAsFactors = F)`. The persistence of `cbind(matrix())` stuff in creating data frames continues to amaze me. It just won't die! ;) – joran Oct 16 '18 at 22:09
  • You are way overthinking this. Why are you making those `matrix` calls? Just do `data.frame(a = character(),b = integer(),c = logical(),stringsAsFactors = FALSE)` – OganM Oct 16 '18 at 22:10
  • Or `setNames(data.frame(character(0),integer(0),stringsAsFactors = F),c('a','b'))` if you need to set the column names programmatically. – joran Oct 16 '18 at 22:11
  • @joran this question is different, and I don't want to do what you're saying, because I have a huge number of columns (see edit). Is it really most efficient to say `character(), integer(),integer(),integer(),integer(),integer(),integer(),integer(),integer(),integer(),integer(),integer(),integer(),integer()`? – d8aninja Oct 16 '18 at 22:24
  • There are other good options at the duplicate: subsetting an existing df, a clever use of read.table. – joran Oct 16 '18 at 22:28
  • @joran this did exactly what I want: `emptyTableOut <- data.frame(character(), matrix(integer(), ncol = 3, nrow = 0), stringsAsFactors = FALSE) %>% setNames(nm = c(dbNames))`. If Rich Scriven reopens, I will answer it myself – d8aninja Oct 16 '18 at 22:30
  • I would add it as an answer at the linked duplicate. – joran Oct 16 '18 at 22:33
  • For people finding this in the future, the reason this problem occurs is that a `matrix` can only have a single data type. When you `cbind` 2 matrices, the result is still a `matrix` and so the variables are all coerced into a single type before converting to a `data.frame`. – divibisan Oct 17 '18 at 14:15
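
To make the coercion described in the last comment concrete, here is a small base-R sketch. The names `chr_part`, `int_part`, and `combined` are made up for illustration; the `data.frame()` call mirrors the one d8aninja reported as working, with `names<-` used in place of the pipe so that no packages are needed.

```r
# Two zero-row matrices of different types (illustrative names).
chr_part <- matrix(character(), ncol = 1, nrow = 0)
int_part <- matrix(integer(), ncol = 3, nrow = 0)

# cbind() on two matrices returns one matrix, and a matrix holds a single
# type, so the integer columns are promoted to character here.
combined <- cbind(chr_part, int_part)
typeof(combined) # "character"

# Passing the pieces to data.frame() separately converts each argument on
# its own, so the integer columns keep their type.
emptyTableOut <- data.frame(character(), int_part, stringsAsFactors = FALSE)
names(emptyTableOut) <- c('a', 'b', 'c', 'd')
sapply(emptyTableOut, class)
```

The type is already lost at the `cbind()` step, before `data.frame()` is ever called, which is why `stringsAsFactors = FALSE` makes no difference to the original attempt.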

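For the "huge number of columns" case, one possible approach (a sketch, not something proposed in the thread) is to describe the column types as a character vector and build the zero-length columns with `vector()`. The `colTypes` vector below is a hypothetical spec that assumes the first column is character and the rest are integer; adjust it to match the real table.

```r
dbNames <- c('a', 'b', 'c', 'd')

# Hypothetical type spec: one entry per column, in the same order as dbNames.
colTypes <- c("character", rep("integer", length(dbNames) - 1))

# vector(mode, length = 0) creates a zero-length vector of the given type.
cols <- lapply(colTypes, vector, length = 0)

# A named list of zero-length vectors converts to a zero-row data frame
# without passing through a matrix, so each column keeps its own type.
emptyTableOut <- as.data.frame(setNames(cols, dbNames), stringsAsFactors = FALSE)
str(emptyTableOut)
```

This scales to any number of columns because only `dbNames` and `colTypes` need to change.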