I need to convert a data frame to JSON. My data frame looks like this:

dput(head(yyy,30))
structure(list(name = c("serverA", "serverA", "serverA", "serverA", 
"serverA", "serverA", "serverA", "serverA", "serverA", "serverA", 
"serverA", "serverA", "serverA", "serverA", "serverA", "serverA", 
"serverA", "serverA", "serverA", "serverA", "serverA", "serverA", 
"serverA", "serverA", "serverA", "serverA", "serverA", "serverA", 
"serverA", "serverA"), date = structure(c(1374120000, 1374120060, 
1374120120, 1374120180, 1374120360, 1374120420, 1374120540, 1374120600, 
1374120840, 1374120960, 1374121020, 1374121080, 1374121200, 1374121440, 
1374121500, 1374121620, 1374121680, 1374122040, 1374122160, 1374122280, 
1374122400, 1374122580, 1374122640, 1374122700, 1374122940, 1374123000, 
1374123120, 1374123180, 1374123240, 1374123360), class = c("POSIXct", 
"POSIXt"), tzone = "America/New_York"), resp = c(3644, 1067.5, 
2738, 5224, 561, 723, 522, 408.5, 446, 683.75, 521, 385, 2666.5, 
1268, 701, 143, 645, 474, 670.5, 549, 383, 1381, 483, 516, 467.5, 
10726, 931.5, 773, 778, 323), vol = c(1L, 2L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 4L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 
1L, 1L, 2L, 1L, 2L, 2L, 1L, 1L)), .Names = c("name", "date", 
"resp", "vol"), row.names = c(1L, 3L, 7L, 8L, 13L, 17L, 21L, 
25L, 33L, 39L, 42L, 44L, 48L, 59L, 63L, 68L, 71L, 83L, 88L, 91L, 
98L, 105L, 109L, 113L, 122L, 123L, 129L, 132L, 135L, 140L), class = "data.frame")

My JSON needs to look like this:

The first part should include date and resp, and the second part should include date and vol:

[{"name":"serverA","yAxis":1, "data":[[<date>,<resp>],[<date>,<resp>]]},{"name":"serverA","yAxis":2, "data":[[<date>,<vol>],[<date>,<vol>]]}]

The JSON output needs to look like this:

[{"name":"serverA","yAxis":1, "data":[[1374105840000,27.395],[1374107640000,26.646]]},{"name":"serverA","yAxis":2, "data":[[1374105840000,25.983],[1374107640000,22.724]]}]

If the data frame only included resp, I could convert it with something like this:

servers <- split(yyy, yyy$name)
dumFun <- function(x){
  sData <- servers[x][[1]]
  if(nrow(sData) >0){
    # create appropriate list
    dumList <- unname(apply(sData[,2:3], 1, function(y) unname(as.list(y))))
    return(toJSON(list(name = x, data = dumList))) 
  }
}


jsData <- lapply(names(servers), dumFun)
jsInd <- sapply(jsData, is.null)
p<-paste0('[', paste(jsData[!jsInd], collapse = ','), ']')

Is there an easy way to do this in R?

user1471980
  • Does this http://stackoverflow.com/a/15138399/429846 answer your question? – Gavin Simpson Jul 18 '13 at 15:23
  • there's no yaxis variable in the sample data you posted – David Marx Jul 18 '13 at 15:27
  • Also, I've seen you post questions before and you always use very, very strange methods to build up your sample data. Why don't you just use the simpler `data.frame` method instead of using `structure` with `.Names` and `rownames` attributes and `class="data.frame"`? Your code is generally very un-R and I want to make sure you're aware that there are much easier ways to do a lot of the things you seem to jump through hoops to accomplish (like creating a simple dataframe). – David Marx Jul 18 '13 at 15:32
  • @DavidMarx No, this is **exactly** what we want people to post here if they can post exact data. Notice how the line *before* `structure` is `dput(head(yyy,30))`; that is the R code generating a reproducible and portable representation of the data (in this case the first 30 rows of data frame `yyy`). All one needs to do to grab this data is `yyy <- structure(....)` (i.e. copy and paste the `dput` output, assigning it to an object). – Gavin Simpson Jul 18 '13 at 15:50
  • @GavinSimpson I figured it was something like that. I still disagree that this is the best way to do it. Having huge blocks of code like this makes it more difficult to understand the datastructure at a glance. Maybe we should move this discussion into meta or a chatroom though since we're getting off topic. Thanks for pointing out `dput` though, that at least explains it a bit. – David Marx Jul 18 '13 at 15:55
  • @DavidMarx Then you are in the minority here. There are too many non-controllable things that might mean you and I running the same chunk of R code end up with subtly different objects. The `stringsAsFactors` global option for example. It is **far** better to use `dput` as here to provide a succinct example of the data actually in use. It is very easy to run that in R and look at the **exact** data structure, rather than infer things just by looking. – Gavin Simpson Jul 18 '13 at 15:58
  • @DavidMarx In support of my viewpoint, the [tag:r] community has curated this FAQ: http://stackoverflow.com/q/5963269/429846 which suggests the use of `dput` in non-simple cases. – Gavin Simpson Jul 18 '13 at 16:02
  • @GavinSimpson Fair enough – David Marx Jul 18 '13 at 16:04
  • @GavinSimpson Actually, scratch that. The link you posted says that a minimal dataset should be provided and *first* suggests building it up using `data.frame`. It then goes on to say: "If you have some data that would be too difficult to construct using these tips, then you can always make a subset of your original data, using ... `dput`". This does not in any way suggest that using `dput` should be the default representation for data examples. It actually seems to suggest the opposite. – David Marx Jul 18 '13 at 16:09
  • @GavinSimpson A good justification for this is that we want to address the OP's specific problem, but we also want to provide solutions that are generalizable. If you have an analogous problem and you're just googling around, it's much easier to see how your problem may be similar to one posted here if a minimal representation is used for the dataset as opposed to complex output like dput. dput has its place, but this user seems to use it for *every single question they post*. We should be able to help in a manner that makes our solutions identifiable to those with similar problems. – David Marx Jul 18 '13 at 16:12
  • @DavidMarx You didn't read what I wrote. I explicitly mentioned "... in non-simple cases". Producing the data from scratch would have required, *inter alia* `paste()`, `as.POSIXct()`, plus a random number draw. I suppose you could just have done `data.frame(....)`, but berating them for doing something our FAQ suggests *as an option* is somewhat un-called for. – Gavin Simpson Jul 18 '13 at 16:20
  • @DavidMarx Also, the identifying feature here is the nature of the **output**, not the data itself. The question title covers that. – Gavin Simpson Jul 18 '13 at 16:22
  • @GavinSimpson The fact that the "identifying feature here is the output, not the data" is precisely why I feel OP could have provided a much simpler example. That a column is posix dates is irrelevant to transforming a dataframe into JSON. – David Marx Jul 18 '13 at 16:23
  • @DavidMarx Perhaps the OP didn't know that, or wasn't sure. Seriously, the OP produced a *reproducible* example FFS! and you're giving them grief? There are plenty of useless questions in [tag:r] that you can go vent your spleen over. – Gavin Simpson Jul 18 '13 at 16:26
  • @DavidMarx, here I was asked to provide the data in dput format, which is very straightforward. I don't get your concern. – user1471980 Jul 18 '13 at 16:28
  • I'm not sure what you mean by "I was asked to provide the data in dput format." I don't see that request anywhere here. I hope I'm not being rude or sound like a nutjob, but my "concern" is that you literally post on avg a question a day, and you've posted 41 `r` questions in the last 3 months (per SO data explorer). Considering the regularity with which you post questions, it would be nice if you could post friendlier sample data. I personally don't like looking at dput output, and I've stated a few reasons why I don't think it's good for the community for this to be the default. – David Marx Jul 18 '13 at 17:08
  • Citations for your posting frequency-- r tagged questions (last 90 days): http://data.stackexchange.com/stackoverflow/query/124567, all questions (last 90 days): http://data.stackexchange.com/stackoverflow/query/124573. I wouldn't be so insistent if it weren't for the fact that I see you on here *all the time*. It's strange that I can recognize someone posting questions. Posting answers maybe, but you post a lot of questions. – David Marx Jul 18 '13 at 17:11

2 Answers

Your sample data doesn't include some of the variables you requested in the output, but I think this should work for you.

library(rjson)

toJSON(data.frame(t(yyy)))
David Marx

I was able to do this in two steps as follows:

# Assumes `servers <- split(yyy, yyy$name)` from the question.

# First series: [date, resp] pairs (columns 2 and 3), plotted on yAxis 1
dumFun <- function(x){
  sData <- servers[x][[1]]
  if(nrow(sData) > 0){
    # create appropriate list
    dumList <- unname(apply(sData[,2:3], 1, function(y) unname(as.list(y))))
    return(toJSON(list(name = x, yAxis = 1, data = dumList)))
  }
}

# Second series: [date, vol] pairs (columns 2 and 4), plotted on yAxis 2
dumFun1 <- function(x){
  sData <- servers[x][[1]]
  if(nrow(sData) > 0){
    # create appropriate list
    dumList <- unname(apply(sData[,c(2,4)], 1, function(y) unname(as.list(y))))
    return(toJSON(list(name = x, yAxis = 2, data = dumList)))
  }
}

jsData <- lapply(names(servers), dumFun)
jsData1 <- lapply(names(servers), dumFun1)

# Both functions return NULL for the same (empty) servers, so one index is enough
jsInd <- sapply(jsData, is.null)

# Pair the two series of each server; `both` avoids masking base::t
both <- paste(jsData, jsData1, sep = ',')

p <- paste0('[', paste(both[!jsInd], collapse = ','), ']')
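
The two functions above differ only in the column they select and the yAxis value, so they can be folded into one. Below is a minimal sketch of that idea, assuming the rjson package (the one used in the other answer). The names make_series, to_series_json and yyy_small are hypothetical and exist only for this illustration. The sketch also multiplies the dates by 1000, on the assumption that the 13-digit timestamps in the desired output are epoch milliseconds. It avoids apply() because apply() goes through as.matrix(), which may coerce a mix of date and numeric columns to character.

```r
library(rjson)

# Hypothetical helper: one series holding [epoch-milliseconds, value] pairs.
make_series <- function(df, value_col, y_axis) {
  ms <- as.numeric(df$date) * 1000  # POSIXct stores seconds since the epoch
  pairs <- unname(Map(function(d, v) c(d, v), ms, df[[value_col]]))
  list(name = as.character(df$name[1]), yAxis = y_axis, data = pairs)
}

# Hypothetical helper: both series for every server, as one JSON array.
to_series_json <- function(df) {
  servers <- split(df, df$name, drop = TRUE)  # drop = TRUE skips empty groups
  series <- lapply(servers, function(s) {
    list(make_series(s, "resp", 1), make_series(s, "vol", 2))
  })
  # Flatten one level so each series is a top-level element of the array
  toJSON(unlist(unname(series), recursive = FALSE))
}

# Two rows taken from the sample data in the question
yyy_small <- data.frame(
  name = c("serverA", "serverA"),
  date = as.POSIXct(c(1374120000, 1374120060),
                    origin = "1970-01-01", tz = "America/New_York"),
  resp = c(3644, 1067.5),
  vol  = c(1L, 2L)
)

json <- to_series_json(yyy_small)
cat(json, "\n")
```

Building the whole structure as nested lists and calling toJSON once means the brackets and commas do not have to be pasted together by hand.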
user1471980