I am new to importing .json files for use in R. I'm trying to create a 'wide' format data frame, where each row is one participant and each column is one variable. Most of my dataset is compatible after calling fromJSON, but one nested JSON structure results in a ragged list, with each participant having either NULL or 1, 2, or 3 entries (in theory there could be more).

Sample:

testdf <- fromJSON("[[\"MMM\",\"AAA\"],null,[\"GGG\",\"CCC\",\"NNN \"],null,null,[\"AAA\",\"NNN \"],null,[\"MMM\",\"AAA\"],null,null,null,null,[\"MMM\",\"AAA\"],[\"CCC\",\"AAA\"],\"NNN \",[\"MMM\",\"NNN \",\"EEE\"],null,null,[\"CCC\",\"MMM\",\"AAA\"],[\"HHH\",\"AAA\"],\"AAA\",[\"MMM\",\"AAA\",\"NNN \"],[\"CCC\",\"AAA\"],[\"MMM\",\"AAA\",\"NNN \"],[\"AAA\",\"NNN \"],[\"MMM\",\"AAA\"],null,null,null,null,null,null]", flatten=TRUE)

How can I transform this list into a 32 x n dataframe which preserves the null values?

Variations on unlist drop the null values, and rbind.fill moves entries to the next row, of course. Could something like cbind.fill work (see "cbind a df with an empty df (cbind.fill?)")? Or is there something hidden in plyr?

Thanks for any suggestions.

bbe415
  • I think `as.data.frame(t(mapply(function(x, y) c(x, rep(NA, max(sapply(testdf, length)) - y)), testdf, sapply(testdf, length))))` will do it. Not pretty, but should work. – jbaums Jul 16 '14 at 00:54

1 Answer

Fairly straightforward:

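t(sapply(testdf, function(x) {
  # A NULL entry becomes a single NA so that its row is kept
  if (is.null(x)) x <- NA_character_
  # Setting the length pads shorter vectors with NA up to 3 columns
  length(x) <- 3
  x
}))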

If you want to choose the number of columns automatically, then you need to calculate that first:

nc <- max(sapply(testdf, length))
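t(sapply(testdf, function(x) {
  # A NULL entry becomes a single NA so that its row is kept
  if (is.null(x)) x <- NA_character_
  # Pad every entry with NA up to the widest entry found above
  length(x) <- nc
  x
}))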
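
Note that `t(sapply(...))` returns a character matrix, while the question asks for a data frame. Below is a minimal sketch of one way to take that last step in base R. It assumes R 3.2.0 or later for `lengths()`. The `ragged` list is a hypothetical, shortened stand-in for `testdf`, and the `choice1`, `choice2`, ... column names are made up for illustration.

```r
# Hypothetical, shortened stand-in for the ragged list returned by fromJSON()
ragged <- list(c("MMM", "AAA"), NULL, c("GGG", "CCC", "NNN "), "AAA")

# The widest entry sets the number of columns (NULL has length 0)
nc <- max(lengths(ragged))

# Pad each entry with NA up to nc; NULL becomes a row of NA
padded <- lapply(ragged, function(x) {
  if (is.null(x)) x <- NA_character_
  length(x) <- nc
  x
})

# Bind the padded vectors as rows and convert the matrix to a data frame
wide <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
names(wide) <- paste0("choice", seq_len(nc))  # illustrative column names
```

Using `lapply` with `do.call(rbind, ...)` avoids relying on `sapply` simplifying its result to a matrix, and every input entry, including the NULL ones, ends up as exactly one row.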
Gabor Csardi