I have a data frame which consists of a single column of some very messy JSON data. I would like to convert the JSON entries in that column to additional columns in the same data frame, I have a messy solution, but it will be tedious and long to apply it to my actual dataset.
Here is my sample data frame:
sample.df <- data.frame(id = c(101, 102, 103, 104),
json_col = c('[{"foo_a":"bar"}]',
'[{"foo_a":"bar","foo_b":"bar"}]',
'[{"foo_a":"bar","foo_c":2}]',
'[{"foo_a":"bar","foo_b":"bar","foo_c":2,"nested_col":{"foo_d":"bar","foo_e":3}}]'),
startdate = as.Date(c('2010-11-1','2008-3-25','2007-3-14','2006-2-21')))
in reality my data frame has over 100000 entries and consists of multiple JSON columns where I need to apply the solution to this question, there are also several orders of nested lists (i.e. nested lists within nested lists).
Here is my solution:
j.col <- sample.df[2]
library(jsonlite)
j.l <- apply(j.col, 1, jsonlite::fromJSON, flatten = T)
library(dplyr)
l.as.df <- bind_rows(lapply(j.l,data.frame))
new.df <- cbind(sample.df$id, l.as.df, sample.df$startdate)
My solution is a roundabout method were I extract the column from the data frame with the JSON stuff, and then convert the JSON into a second dataframe, and then I combine the two dataframes into a third datadframe. This will be long and tedious to do with my actual data, not to mention that it is inelegant. How can I do this without having to create the additional data frames?
Thanks in advance for any help!