0

I am fetching the data from Twitter API. Converting a Data from JSON object to Data Frame and load into Data Warehouse. Find below input and code snippet.

I am very new to R Programming.

 stats_campaign.data <- content(stats_campaign.request)
 print(stats_campaign.data)

O/P:

`{
 "data_type": [ "stats" ],
 "time_series_length": [ 1 ],
 "data": [
 {
  "id": [ "XXXXX" ],
  "id_data": [
    {
      "segment": {},
      "metrics": {
        "impressions": {},
        "tweets_send": {},
        "qualified_impressions": {},
        "follows": {},
        "app_clicks": {},
        "retweets": {},
        "likes": {},
        "engagements": {},
        "clicks": {},
        "card_engagements": {},
        "replies": {},
        "url_clicks": {},
        "carousel_swipes": {}
      }
    }
   ]
   },

   {      
   "id": [ "XXXX1" ],
   "id_data": [
    {
      "segment": {},
      "metrics": {
        "impressions": {},
        "tweets_send": {},
        "qualified_impressions": {},
        "follows": {},
        "app_clicks": {},
        "retweets": {},
        "likes": {},
        "engagements": {},
        "clicks": {},
        "card_engagements": {},
        "replies": {},
        "url_clicks": {},
        "carousel_swipes": {}
      }
    }
    ]
    },`

When I am reading this JSON value ,

    stats_json_file <- sprintf("P:/R Repos/R    
               Applications/TwitterAPIData/stats_test_data-%s.json", TODAY)
    jsonlite::fromJSON(stats_json_file)

   **Result :**
       id                                      id_data
    1  5wcaz                                         NULL
    2  5ub2u                                         NULL
    3  5wb8x                                         NULL
    4  5wb1j                                         NULL
    5  5yqwj                                         NULL
    6  5pq5i                                         NULL
    7  5u197                                         NULL
    8  5z2js                                         NULL
    9  6fqh0   333250, 4, 9, 19, 111, 3189, 3156, 5, 1091
    10 5tvr1                                         NULL
    11 5yqw4                                         NULL
    12 5qqps                                         NULL
    13 5yqvw                                         NULL
    14 5ygom                                         NULL
    15 5nc88                                         NULL
    16 5yg94                                         NULL
    17 65t9e                                         NULL
    18 5peck                                         NULL
    19 63pg1 247283, 17, 22, 35, 297, 5514, 5450, 6, 2971
    20 6cdvy        156705, 1, 2, 6, 112, 10933, 605, 170

   From my JSON file I want Id and whole "metrics": {
        "impressions": {},
        "tweets_send": {},
        "qualified_impressions": {},
        "follows": {},
        "app_clicks": {},
        "retweets": {},
        "likes": {},
        "engagements": {},
        "clicks": {},
        "card_engagements": {},
        "replies": {},
        "url_clicks": {},
        "carousel_swipes": {}
      }
      and convert to Data Frame to load into Data Base. Plzz Help..!

How can I parsed this JSON Object. I want to retrieve Id & whole Metrics object. Then want to convert into Data Frame to load into SQL Table.

To read the multiple Id's & Metrics value I used below code,

`test <- list()
 for(i in 1:len)
 { test <- unlist(stats_campaign.data$data[[i]])
 print(test)}`

 **Output:**
      id 
  "5wcaz" 
      id 
   "5ub2u" 
      id 
  "5wb8x" 
      id 
 "5wb1j" 
      id 
 "5yqwj" 
      id 
  "5pq5i" 
      id 
  "5u197" 
      id 
  "5z2js" 
      id 
  "5tvr1" 
      id 
  "5yqw4" 
      id 
  "5qqps" 
      id 
  "5yqvw" 
      id 
  "5ygom" 
      id 
  "5nc88" 
      id 
  "5yg94" 
      id 
  "65t9e" 
      id 
  "5peck" 
                     id id_data.metrics.impressions 
                   "63pg1"                    "133227" 
                      id_data.metrics.tweets_send     id_data.metrics.follows 
                   "10"                         "9" 
                      id_data.metrics.retweets       id_data.metrics.likes 
                   "17"                        "96" 
                    id_data.metrics.engagements      id_data.metrics.clicks 
                 "2165"                      "2134" 
                    id_data.metrics.replies  id_data.metrics.url_clicks 
                    "5"                      "1204" 
                     id id_data.metrics.impressions 
                "6cdvy"                    "176164" 
     id_data.metrics.tweets_send    id_data.metrics.retweets 
                    "2"                        "10" 
    id_data.metrics.likes id_data.metrics.engagements 
                  "121"                      "9708" 
    id_data.metrics.clicks  id_data.metrics.url_clicks 
                  "620"                       "160"

Within a for I have to used list or something else to append the value each time, how can I do that ..?? Am I using a right Approach.?? Is there any alternative way I can parsed nested JSON object and directly put into Data Frame..?

Please Help..! Thanks In Advance..!

mike_p
  • 21
  • 2
  • 2
    If your JSON is syntactically valid, then in R you could do `jsonlite::fromJSON(your_text)`. However you seem to have some issues with your brackets. – Gregor Thomas Nov 12 '16 at 04:33
  • This is my JOSN FOrmat, – mike_p Nov 14 '16 at 20:11
  • Okay, your JSON is now valid. You can run `jsonlite::fromJSON(your_text)` on it and get a useful result. What do you want? Rather than showing the output you *don't* want, can you show the output you *do* want? – Gregor Thomas Nov 14 '16 at 20:21
  • @Gregor Plz see the update in Questions, I put the output whn I am reading a JSON data. To move forward I want to store these values into Data Frame and load into Data Base. ? Is there any way I can pased my JSON value ..?? Plz help – mike_p Nov 15 '16 at 04:41

1 Answers1

0

As mentioned in the comments, a bit more information about what output you are looking for would be helpful. In any case, I am hopeful that the following will provide a helpful direction. The tidyjson README provides a bit of helpful overview.

Unfortunately, the lack of data in your JSON object makes it difficult to illustrate what might be present in your data (what to expect in the null objects), and I am having difficulty determining what part of the Twitter API you are looking at. tidyjson gives you the ability to produce a consistent data.frame output, even when you have no data, though! The key verbs are gather and spread, much like tidyr, but with JSON flavor.

str <- "{\"data_type\":[\"stats\"],\"time_series_length\":[1],\"data\":[{\"id\":[\"XXXXX\"],\"id_data\":[{\"segment\":{},\"metrics\":{\"impressions\":{},\"tweets_send\":{},\"qualified_impressions\":{},\"follows\":{},\"app_clicks\":{},\"retweets\":{},\"likes\":{},\"engagements\":{},\"clicks\":{},\"card_engagements\":{},\"replies\":{},\"url_clicks\":{},\"carousel_swipes\":{}}}]},{\"id\":[\"XXXX1\"],\"id_data\":[{\"segment\":{},\"metrics\":{\"impressions\":{},\"tweets_send\":{},\"qualified_impressions\":{},\"follows\":{},\"app_clicks\":{},\"retweets\":{},\"likes\":{},\"engagements\":{},\"clicks\":{},\"card_engagements\":{},\"replies\":{},\"url_clicks\":{},\"carousel_swipes\":{}}}]}]} "

library(dplyr)
library(tidyjson)

prep <- as.tbl_json(str) %>% enter_object("data") %>% gather_array("objid")

p1 <- prep %>% enter_object("id") %>% 
  gather_array("idnum") %>% append_values_string("id")

p2 <- prep %>% enter_object("id_data") %>% gather_array("datanum") %>%
enter_object("metrics") %>% 
spread_values(
 impressions = jstring("impressions", "value")
 , tweets_send = jnumber("tweets_send", "somekey")
)

p1 %>% tbl_df() %>% left_join(p2 %>% tbl_df(), by = c("document.id", "objid"))
#> # A tibble: 2 x 7
#>   document.id objid idnum    id datanum impressions tweets_send
#>         <int> <int> <int> <chr>   <int>       <chr>       <dbl>
#> 1           1     1     1 XXXXX       1        <NA>          NA
#> 2           1     2     1 XXXX1       1        <NA>          NA
cole
  • 1,737
  • 2
  • 15
  • 21