
I have this code:

library(jsonlite)

df <- fromJSON('blarg.json')

to read this JSON (in a file called blarg.json):

[{  "id": 211,
    "sub_question_skus": {  "0": 329, "behavior": 216 } },
 {  "id": 333,
    "sub_question_skus": [  340, 341 ] },
 {  "id": 345,
    "sub_question_skus": [  346, 352 ] },
 {  "id": 444,
    "sub_question_skus": null }]

That produces a data frame like so:

> df
   id sub_question_skus
1 211          329, 216
2 333          340, 341
3 345          346, 352
4 444              NULL

Ah, but look, its structure is quite complicated in the RStudio viewer:

[screenshot of df in the RStudio viewer]

I want something like:

df_expanded <- data.frame(id=c(211, 211, 333, 333, 345, 345),
                          sub_question_sku=c(329,216,340,341,346,352))
> df_expanded
   id sub_question_sku
1 211              329
2 211              216
3 333              340
4 333              341
5 345              346
6 345              352

How do I get that?

For context, I'm trying to update rsurveygizmo to handle sub-questions from Survey Gizmo. It's uphill going for me.

dfrankow
  • This is reproducible. Simply save the JSON in a file. I don't know how to get a vector into a data frame column, so I couldn't construct an example without the file and jsonlite. – dfrankow Apr 01 '20 at 02:55

1 Answer

Hacky, but a start:

df$sub_question_skus <- replace(
  df$sub_question_skus,
  sapply(df$sub_question_skus, is.null), NA)

as.data.frame(
  do.call(
    rbind,
    Map(f=cbind, id=df$id, sub=df$sub_question_skus)),
  row.names = FALSE)
#    id sub
# 1 211 329
# 2 211 216
# 3 333 340
# 4 333 341
# 5 345 346
# 6 345 352
# 7 444  NA
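
If the row for id 444 should be dropped, as in the desired `df_expanded` from the question, here is a minimal base-R sketch of an alternative. It assumes `sub_question_skus` is a list column whose elements are numeric vectors or `NULL`; the `df` below is a hand-built stand-in for the `fromJSON()` result, not the exact object jsonlite returns, so the file is not needed to try it.

```r
# Hand-built stand-in for fromJSON('blarg.json'): a data frame with a list column.
# (Assumption: the real column holds one vector, or NULL, per row.)
df <- data.frame(id = c(211, 333, 345, 444))
df$sub_question_skus <- list(c(329, 216), c(340, 341), c(346, 352), NULL)

# Number of SKUs per row; a NULL element has length 0.
n <- lengths(df$sub_question_skus)

# Repeat each id once per SKU, and flatten the SKUs in the same order.
# unlist() skips NULL elements, so id 444 contributes no rows.
df_expanded <- data.frame(
  id = rep(df$id, times = n),
  sub_question_sku = unlist(df$sub_question_skus, use.names = FALSE)
)

df_expanded
# 6 rows: ids 211, 211, 333, 333, 345, 345; id 444 is dropped
```

Because `rep()` and `unlist()` both walk the rows in order, the ids and SKUs stay aligned. To keep id 444 with an `NA` instead, replace the `NULL` elements with `NA` first, as in the code above.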
r2evans
  • Thanks, perfect. On a related note, how do people actually deal with output from jsonlite? It seems so awkward. I found https://github.com/sailthru/tidyjson, maybe that's something. – dfrankow Apr 01 '20 at 13:58
  • It depends on the data. For instance, "well-formatted simple data" almost always comes out as a `data.frame`, `list`, or `vector`, in which case there is no pain associated with it. Some well-formatted not-simple structures often come fairly clean as well, though obviously with a little assembly/reshaping required. And to be clear, the issue is not `jsonlite`, it's with the per-project "need" to shape the data in such a way that it is non-rectangular (frame, matrix) and non-linear (list, vector). – r2evans Apr 01 '20 at 14:37