How to denormalize a nested list in R?

Question

I'd like to find a clean and readable way to convert a structure like src (parsed from JSON, that's why there are a ton of nested lists) to one like dst.

src <- list(
    sessions=list(
        list(
            statistics=list(
                list(
                    list(
                        round=0,
                        diff=3,
                        saldo=3
                    ),
                    list(
                        round=1,
                        diff=-1,
                        saldo=2
                    ),
                    list(
                        round=2,
                        diff=-1,
                        saldo=1
                    )
                ),
                list(
                    list(
                        round=0,
                        diff=-1,
                        saldo=-1
                    )
                )
            ),
            sessionProp="sv1"
        ),
        list(
            statistics=list(
                list(
                    list(
                        round=0,
                        diff=4,
                        saldo=4
                    )
                ),
                list(
                    list(
                        round=0,
                        diff=2,
                        saldo=2
                    )
                )
            ),
            sessionProp="sv2"
        )
    ),
    packageProps=list(
        rules=list(
            list(
                name="Ruleset 1",
                ruleProp="rv1"
            ),
            list(
                name="Ruleset 2",
                ruleProp="rv2"
            )
        ),
        packageProp="package prop value"
    )
)

dst <- data.frame(
    round=c(0,1,2,0,0,0),
    diff=c(3,-1,-1,-1,4,2),
    saldo=c(3,2,1,-1,4,2),
    sessionProp=c("sv1","sv1","sv1","sv1","sv2","sv2"),
    ruleProp=c("rv1","rv1","rv1","rv2","rv1","rv2")
)

Each session has two elements in its statistics list, and these correspond by position to the two elements of the rules list under packageProps. That correspondence is the only thing that sets this apart from a straightforward denormalization.

score 3 · Accepted Answer · answered Sep 20 '15 at 20:32

Give this a shot on the larger list (I'm assuming there are either more or larger versions of this):

do.call(rbind.data.frame, lapply(1:length(src$sessions), function(i) {
  dat <- do.call(rbind.data.frame, 
                 lapply(unlist(src$sessions[[i]]$statistics, recursive=FALSE), 
                        rbind.data.frame))
  dat$sessionProp <- src$sessions[[i]]$sessionProp
  dat$ruleProp <- src$packageProps$rules[[i]]$ruleProp
  dat
}))

##     round diff saldo sessionProp ruleProp
## 2       0    3     3         sv1      rv1
## 21      1   -1     2         sv1      rv1
## 22      2   -1     1         sv1      rv1
## 23      0   -1    -1         sv1      rv1
## 24      0    4     4         sv2      rv2
## 211     0    2     2         sv2      rv2

You can nuke the row names if desired. Using dplyr's bind_rows instead of do.call(rbind… saves only a teensy bit of typing, but it also drops the row names automatically. I'm hoping others can find a better solution, though.
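Dropping the row names in base R takes a single assignment. A minimal sketch, using a made-up three-row data frame (an assumption for illustration, not the real result) that carries the same kind of leftover row names:

```r
# Illustrative data only: a small frame with non-sequential row names,
# similar to what do.call(rbind.data.frame, ...) can leave behind.
df <- data.frame(round = c(0, 1, 2),
                 diff  = c(3, -1, -1),
                 row.names = c("2", "21", "22"))

# Assigning NULL resets the row names to the default 1..n sequence.
rownames(df) <- NULL
print(df)
```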

Now I'm feeling dumb for missing `unlist` and `rbind.data.frame`! Now I know how to pull this off. The result is slightly off, though. You are correlating `rules` elements to `sessions` (coincidentally there are two of them too), not `statistics`. — Tero Tilus, Sep 21 '15 at 03:59
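Following up on the comment above: one way to pair rules with statistics elements, rather than with sessions, is to look up the rule by the position inside each session's statistics list. This is only a sketch. It assumes the j-th element of statistics always belongs to the j-th entry of packageProps$rules, as the question describes, and denormalize is a hypothetical helper name that is not part of the original answer.

```r
# Hypothetical helper: flattens src into one row per statistics entry.
# Assumption: statistics[[j]] in every session corresponds to rules[[j]].
denormalize <- function(src) {
  rules <- src$packageProps$rules
  rows <- list()
  for (session in src$sessions) {
    for (j in seq_along(session$statistics)) {
      for (stat in session$statistics[[j]]) {
        # Each stat is a flat named list (round, diff, saldo),
        # so it converts to a one-row data frame.
        row <- as.data.frame(stat, stringsAsFactors = FALSE)
        row$sessionProp <- session$sessionProp
        # The rule is chosen by position j within statistics,
        # not by the session index.
        row$ruleProp <- rules[[j]]$ruleProp
        rows[[length(rows) + 1]] <- row
      }
    }
  }
  out <- do.call(rbind, rows)
  rownames(out) <- NULL  # drop the row names left over from rbind
  out
}
```

Called as denormalize(src) on the list from the question, this should give the ruleProp column rv1, rv1, rv1, rv2, rv1, rv2, matching dst.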
