I would like to serialize an ExpressionSet to JSON. I tried the following:

# create expression set based on the link above
library("Biobase")

ExpressionSet()

ExpressionSet(assayData=matrix(runif(1000), nrow=100, ncol=10))

# update an existing ExpressionSet
data(sample.ExpressionSet)
updateObject(sample.ExpressionSet)

# information about assay and sample data
featureNames(sample.ExpressionSet)[1:10]
sampleNames(sample.ExpressionSet)[1:5]
experimentData(sample.ExpressionSet)

# subset: first 10 genes, samples 2, 4, and 10
expressionSet <- sample.ExpressionSet[1:10,c(2,4,10)]

When I then do (using the same approach as for dataframes):

library(jsonlite)

toJSON(expressionSet)

I get

Error: No method for S4 class:ExpressionSet

Is there a way to get this done or would I have to write a custom serializer?

Cleb
  • you can use `toJSON(expressionSet, force = TRUE)` to force unknown objects through. It usually works on any class of object – SymbolixAU Jul 08 '19 at 00:16
  • @SymbolixAU: Feel free to add it as an actual answer along with its output. More than happy to upvote. – Cleb Jul 08 '19 at 14:53

2 Answers

I think this does what you're after. I don't really know the field, so if the JSON object isn't as expected, please comment and I can try to update it.

My approach to solving this problem involves converting an object of ExpressionSet class to a dataframe so that we can use toJSON() on it. I found the idea here: https://support.bioconductor.org/p/77432/

# create expression set based on the link above
library(Biobase)

ExpressionSet()

ExpressionSet(assayData=matrix(runif(1000), nrow=100, ncol=10))

# update an existing ExpressionSet
data(sample.ExpressionSet)
updateObject(sample.ExpressionSet)

# information about assay and sample data
featureNames(sample.ExpressionSet)[1:10]
sampleNames(sample.ExpressionSet)[1:5]
experimentData(sample.ExpressionSet)

# subset: first 10 genes, samples 2, 4, and 10
expressionSet <- sample.ExpressionSet[1:10,c(2,4,10)]

# this code is inspired from here: https://support.bioconductor.org/p/77432/
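m <- exprs(expressionSet) # matrix of intensities (features x samples)
pdata <- pData(expressionSet) # data.frame of phenotypic information (one row per sample)
d <- cbind(pdata, t(m)) # transpose so that each row is one sample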

library(jsonlite)
toJSON(d)

[{"sex":"Male","type":"Case","score":0.4,"AFFX-MurIL2_at":85.7533,"AFFX-MurIL10_at":126.196,"AFFX-MurIL4_at":8.8314,"AFFX-MurFAS_at":3.6009,"AFFX-BioB-5_at":30.438,"AFFX-BioB-M_at":25.8461,"AFFX-BioB-3_at":181.08,"AFFX-BioC-5_at":57.2889,"AFFX-BioC-3_at":16.8006,"AFFX-BioDn-5_at":16.1789,"_row":"B"},{"sex":"Male","type":"Case","score":0.42,"AFFX-MurIL2_at":135.575,"AFFX-MurIL10_at":93.3713,"AFFX-MurIL4_at":28.7072,"AFFX-MurFAS_at":12.3397,"AFFX-BioB-5_at":70.9319,"AFFX-BioB-M_at":69.9766,"AFFX-BioB-3_at":161.469,"AFFX-BioC-5_at":77.2207,"AFFX-BioC-3_at":46.5272,"AFFX-BioDn-5_at":9.7364,"_row":"D"},{"sex":"Male","type":"Control","score":0.63,"AFFX-MurIL2_at":135.608,"AFFX-MurIL10_at":90.4838,"AFFX-MurIL4_at":34.4874,"AFFX-MurFAS_at":4.5498,"AFFX-BioB-5_at":46.352,"AFFX-BioB-M_at":91.5307,"AFFX-BioB-3_at":229.671,"AFFX-BioC-5_at":66.7302,"AFFX-BioC-3_at":39.7419,"AFFX-BioDn-5_at":0.3988,"_row":"J"}]
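One caveat worth checking with either approach: toJSON() rounds numbers to 4 decimal digits by default (the `digits` argument). The values in this sample happen to fit, but other expression data may not. A minimal sketch with a made-up one-row data frame (`toy` and `probe_1` are hypothetical, not taken from the ExpressionSet):

```r
library(jsonlite)

# Hypothetical toy data, only to illustrate the `digits` argument
toy <- data.frame(value = 1 / 3, row.names = "probe_1")

rounded <- toJSON(toy)               # default: digits = 4
precise <- toJSON(toy, digits = NA)  # NA requests maximum precision

print(rounded)
print(precise)
```

If the JSON is meant to be read back for further analysis, passing `digits = NA` is probably the safer choice.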
meenaparam
  • Thanks, I ended up doing something very similar using a named list (see my answer), which gives me what I want. In your solution, I think, the information on the datatype gets lost as it is now an unnamed structure. So, going back to the original ExpressionSet from your JSON might be tricky (but upvoted). – Cleb Jul 05 '19 at 17:33
  • @Cleb Glad you found a solution to your problem! The point you make on the information on the datatype is the sort of thing I wouldn't have realised was relevant when I wrote my answer code, so thanks for pointing that out. I realise our resulting JSON structures are a bit different too. – meenaparam Jul 08 '19 at 14:21
I ended up using a named list like this:

expressionset_to_json <- function(eset) {

  expression_data <- Biobase::exprs(eset)
  sample_info <- Biobase::pData(eset)
  feature_data <- Biobase::fData(eset)

  templist <- list(
    expression_data=as.data.frame(expression_data),
    sample_info=sample_info,
    feature_data=feature_data
  )

  return(jsonlite::toJSON(templist))
}

Then

expressionset_to_json(expressionSet)

yields

{"expression_data":[{"B":85.7533,"D":135.575,"J":135.608,"_row":"AFFX-MurIL2_at"},{"B":126.196,"D":93.3713,"J":90.4838,"_row":"AFFX-MurIL10_at"},{"B":8.8314,"D":28.7072,"J":34.4874,"_row":"AFFX-MurIL4_at"},{"B":3.6009,"D":12.3397,"J":4.5498,"_row":"AFFX-MurFAS_at"},{"B":30.438,"D":70.9319,"J":46.352,"_row":"AFFX-BioB-5_at"},{"B":25.8461,"D":69.9766,"J":91.5307,"_row":"AFFX-BioB-M_at"},{"B":181.08,"D":161.469,"J":229.671,"_row":"AFFX-BioB-3_at"},{"B":57.2889,"D":77.2207,"J":66.7302,"_row":"AFFX-BioC-5_at"},{"B":16.8006,"D":46.5272,"J":39.7419,"_row":"AFFX-BioC-3_at"},{"B":16.1789,"D":9.7364,"J":0.3988,"_row":"AFFX-BioDn-5_at"}],"sample_info":[{"sex":"Male","type":"Case","score":0.4,"_row":"B"},{"sex":"Male","type":"Case","score":0.42,"_row":"D"},{"sex":"Male","type":"Control","score":0.63,"_row":"J"}],"feature_data":[{"_row":"AFFX-MurIL2_at"},{"_row":"AFFX-MurIL10_at"},{"_row":"AFFX-MurIL4_at"},{"_row":"AFFX-MurFAS_at"},{"_row":"AFFX-BioB-5_at"},{"_row":"AFFX-BioB-M_at"},{"_row":"AFFX-BioB-3_at"},{"_row":"AFFX-BioC-5_at"},{"_row":"AFFX-BioC-3_at"},{"_row":"AFFX-BioDn-5_at"}]}
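Because the list is named, the parts can be picked out again by name when reading the JSON back. The following is only a sketch of such a round trip, not a complete deserializer: `restore_rownames` is a hypothetical helper, `digits = NA` is added to avoid rounding, and anything not written to the JSON (experiment data, annotation, factor levels) is not recovered.

```r
library(Biobase)
library(jsonlite)

data(sample.ExpressionSet)
eset <- sample.ExpressionSet[1:10, c(2, 4, 10)]

# Serialize the same named parts as above (feature data omitted for brevity)
json <- toJSON(
  list(
    expression_data = as.data.frame(exprs(eset)),
    sample_info = pData(eset)
  ),
  digits = NA
)

# Hypothetical helper: if the parsed data frame still carries a "_row"
# column, move it back into the row names; otherwise leave it unchanged
restore_rownames <- function(df) {
  if ("_row" %in% names(df)) {
    rownames(df) <- df[["_row"]]
    df[["_row"]] <- NULL
  }
  df
}

parsed <- fromJSON(json)
m <- as.matrix(restore_rownames(parsed$expression_data))
pd <- restore_rownames(parsed$sample_info)

# Rebuild an ExpressionSet from the expression matrix and the sample info
rebuilt <- ExpressionSet(
  assayData = m,
  phenoData = AnnotatedDataFrame(pd)
)
rebuilt
```

Note that columns which were factors in the original sample info (such as `sex` and `type`) come back as character vectors, so they would need to be converted again if the factor levels matter.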
Cleb