0

I am new to R facing issue in converting json file to dataframe. I have the json file which looks like below:

json_file = '[{"id": "abc", "model": "honda", "date": "20190604", "cols": {"action": 15, "values": 18, "not": 29}},
  {"id": "abc", "model": "honda", "date": "20190604", "cols": {"hello": 14, "hi": 85, "wow": 14}},
  {"id": "mno", "model": "ford", "date": "20190604", "cols": {"yesterday": 21, "today": 21, "tomorrow": 29}},
  {"id": "mno", "model": "ford", "date": "20190604", "cols": {"docs": 25, "ok": 87, "none": 42}}]'

I want to convert the above json file to dataframe in below format:

EXPECTED RESULT

df = 
id  model      date  cols  values_cols
abc honda  20190604 action   15   
abc honda  20190604 values   18 
abc honda  20190604 not      29 
abc honda  20190604 hello    14 
abc honda  20190604 hi       85 
abc honda  20190604 wow      14 
mno ford  20190604 yesterday 21   
mno ford  20190604 today     21 
mno ford  20190604 tomorrow  29 
mno ford  20190604 docs      25 
mno ford  20190604 ok        87 

My Result

    id model     date cols id.1 model.1   date.1 cols.1 id.2 model.2   date.2 cols.2 id.3 model.3   date.3 cols.3
action abc honda 20190604   15  abc   honda 20190604     14  mno    ford 20190604     21  mno    ford 20190604     25
values abc honda 20190604   18  abc   honda 20190604     85  mno    ford 20190604     21  mno    ford 20190604     87
not    abc honda 20190604   29  abc   honda 20190604     14  mno    ford 20190604     29  mno    ford 20190604     42
It's not correct, as it is taking as index.

My Solution:

require(RJSONIO)
df = fromJSON(json_file)
user15051990
  • 1,835
  • 2
  • 28
  • 42
  • What have you tried? Which of the JSON-converting packages of CRAN have you installed? – Dirk Eddelbuettel Jun 29 '19 at 18:15
  • You must start and end the string with single quotes: `'[{"id" etc 42}}]'`. Then just `df1 <- jsonlite::fromJSON(json_file)` will do. And there is no need to load package `rjson`, you are not using it. – Rui Barradas Jun 29 '19 at 18:53
  • @RuiBarradas, But I am not getting the expected result. I have updated solution. Please have a look. – user15051990 Jun 29 '19 at 21:02

1 Answers1

0

The problem when reading in the data with jsonlite::fromJSON is that the last column is a dataframe, not an atomic vector.

tmp <- jsonlite::fromJSON(json_file)
str(tmp)
#'data.frame':   4 obs. of  4 variables:
# $ id   : chr  "abc" "abc" "mno" "mno"
# $ model: chr  "honda" "honda" "ford" "ford"
# $ date : chr  "20190604" "20190604" "20190604" "20190604"
# $ cols :'data.frame':  4 obs. of  12 variables:
#  ..$ action   : int  15 NA NA NA
#  ..$ values   : int  18 NA NA NA
#  ..$ not      : int  29 NA NA NA
#  ..$ hello    : int  NA 14 NA NA
#  ..$ hi       : int  NA 85 NA NA
#  ..$ wow      : int  NA 14 NA NA
#  ..$ yesterday: int  NA NA 21 NA
#  ..$ today    : int  NA NA 21 NA
#  ..$ tomorrow : int  NA NA 29 NA
#  ..$ docs     : int  NA NA NA 25
#  ..$ ok       : int  NA NA NA 87
#  ..$ none     : int  NA NA NA 42

So the last column must be cbinded with the other 3 columns before reshaping the data from wide format to long format.

tmp <- cbind(tmp[-4], tmp[[4]])
df1 <- reshape2::melt(tmp, id.vars = c("id", "model", "date"))
names(df1)[4:5] <- c("cols", "values_cols")
df1 <- df1[complete.cases(df1), ]
row.names(df1) <- NULL

df1
#    id model     date      cols values_cols
#1  abc honda 20190604    action          15
#2  abc honda 20190604    values          18
#3  abc honda 20190604       not          29
#4  abc honda 20190604     hello          14
#5  abc honda 20190604        hi          85
#6  abc honda 20190604       wow          14
#7  mno  ford 20190604 yesterday          21
#8  mno  ford 20190604     today          21
#9  mno  ford 20190604  tomorrow          29
#10 mno  ford 20190604      docs          25
#11 mno  ford 20190604        ok          87
#12 mno  ford 20190604      none          42

Now clean up the .GlobalEnv.

rm(tmp)    # no longer needed.
Rui Barradas
  • 70,273
  • 8
  • 34
  • 66