
I'm working on a JSON file with nested objects and would like to extract the child objects without converting them to their Scala case class equivalents. Is there any pre-built functionality to filter out chunks of JSON text this way?

For example, if I've got a JSON file with content similar to this:

{
  "parentObject": "bob",
  "parentDetail1": "foo",
  "subObjects": [
    {
      "childObjectName": "childname1",
      "detail1": "randominfo1",
      "detail2": "randominfo1"
    },
    {
      "childObjectName": "childname2",
      "detail1": "randominfo2",
      "detail2": "randominfo2"
    },
    {
      "childObjectName": "childname3",
      "detail1": "randominfo3",
      "detail2": "randominfo3"
    }
  ]
}

I would like to extract the subObjects nodes, ideally as individual chunks of JSON text (perhaps as a String array with each subObject as an element). I know I could parse the entire JSON file into objects I've pre-defined as Scala classes, but I would rather not take that route, since it will probably be too expensive for larger files. I'm looking for a simple and elegant way to do this. Any ideas?

GroomedGorilla
  • In XML you would use a DOM for that and selectNodes(xpath). Not sure if JSON has something like that. Or you would use SAX.. hm.. Basically I think no matter how you do it, some parsing of the json document is unavoidable. – BitTickler May 19 '15 at 15:59
  • Following the SAX idea, you could look for something which works like SAX. Basically it is a parser which reports what it encounters while parsing via some report interface. Your code could implement that interface and filter out what you need. It looks a bit like this: ``IJSONSax { EnterObject(); LeaveObject(); EnterArray(); LeaveArray(); ... }`` even though I am not sure if this can work without extra metadata for JSON, as the type information/name of your array elements is not contained in the document. – BitTickler May 19 '15 at 16:09
  • http://stackoverflow.com/questions/444380/is-there-a-streaming-api-for-json (closed in a questionable way) basically follows the idea I gave in my comments above. – BitTickler May 19 '15 at 16:27
  • @BitTickler The [Pull Parser API](https://github.com/json4s/json4s#low-level-pull-parser-api) provided by JSON4s may be an implementation of what you describe, though I am not sure, as I don't know what SAX really is. – Kulu Limpa May 19 '15 at 16:34
  • @KuluLimpa It is a push-style access api. – BitTickler May 19 '15 at 16:36
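
To make the scanning idea from the comments concrete, here is a minimal sketch in plain Scala with no JSON library. It is not a real SAX or pull parser: it walks the text once, tracks string state and bracket depth, and cuts out the raw text of each object directly inside the named array. The names `SubObjectScanner` and `extractArrayObjects` are hypothetical, and the sketch assumes well-formed JSON in which the first occurrence of the quoted key is the array of interest.

```scala
object SubObjectScanner {
  /** Raw text of each object found directly inside the array stored under `key`.
    * Hypothetical helper. Assumes well-formed JSON and that the first occurrence
    * of the quoted key is followed by the array we want.
    */
  def extractArrayObjects(json: String, key: String): Vector[String] = {
    val marker = "\"" + key + "\""
    val keyPos = json.indexOf(marker)
    val arrayStart = if (keyPos < 0) -1 else json.indexOf('[', keyPos + marker.length)
    if (arrayStart < 0) Vector.empty
    else {
      val out = Vector.newBuilder[String]
      var depth = 0        // nesting depth of {...} and [...] inside the target array
      var inString = false // true while inside a JSON string literal
      var escaped = false  // true right after a backslash inside a string
      var objStart = -1    // index where the current top-level object began
      var done = false
      var i = arrayStart + 1
      while (i < json.length && !done) {
        val c = json.charAt(i)
        if (inString) {
          if (escaped) escaped = false
          else if (c == '\\') escaped = true
          else if (c == '"') inString = false
        } else c match {
          case '"' => inString = true
          case '{' | '[' =>
            if (c == '{' && depth == 0) objStart = i
            depth += 1
          case '}' | ']' =>
            if (depth == 0) done = true // the ']' that closes the target array
            else {
              depth -= 1
              if (depth == 0 && c == '}') out += json.substring(objStart, i + 1)
            }
          case _ => // whitespace, commas, numbers, literals: nothing to track
        }
        i += 1
      }
      out.result()
    }
  }
}
```

Calling `SubObjectScanner.extractArrayObjects(text, "subObjects")` on the question's sample would yield one string per child object. The sketch still holds the whole document in memory as a `String`; for really large files the same state machine could be fed from a `java.io.Reader` instead.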

2 Answers


A solution using json-lenses and spray-json:

import spray.json.DefaultJsonProtocol._
import spray.json._
import spray.json.lenses.JsonLenses._

object Main extends App {

 val jsonData =
   """
     |{
     |  "parentObject": "bob",
     |  "parentDetail1": "foo",
     |  "subObjects": [
     |    {
     |      "childObjectName": "childname1",
     |      "detail1": "randominfo1",
     |      "detail2": "randominfo1"
     |    },
     |    {
     |      "childObjectName": "childname2",
     |      "detail1": "randominfo2",
     |      "detail2": "randominfo2"
     |    },
     |    {
     |      "childObjectName": "childname3",
     |      "detail1": "randominfo3",
     |      "detail2": "randominfo3"
     |    }
     |  ]
     |}
   """.stripMargin.parseJson


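  // Lens selecting every element of the "subObjects" array
  val subObjectsLens = 'subObjects / *

  // One JsValue per child object; no case classes involved
  val subObjects = jsonData.extract[JsValue](subObjectsLens)

  // Render each child object back to compact JSON text
  println(subObjects map {_.compactPrint} mkString ", ")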
}
0

Most JSON libraries provide some way to extract nested JSON. You haven't said exactly what output you want (a String array with each subObject as an element? The fields of each subObject merged into a single string?), so I will limit this answer to extracting the nested JSON.

JSON4s

import org.json4s._
import org.json4s.native.JsonMethods._ // or org.json4s.jackson.JsonMethods._

val json = parse(""" {
                    "parentObject": "bob",.... }""")
val subObjects = json \ "subObjects"
// Returns a JArray (the internal representation of a JSON array in Json4s). Json4s has a flexible DSL
// which you can use to extract the fields as you like.

Play-Json

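import play.api.libs.json._

val json = Json.parse("""{ "parentObject": "bob",.... }""")
val subObjects = json \ "subObjects"
//> subObjects  : play.api.libs.json.JsValue =  [{"childObjectName":"childname1", "detail1":"randominfo1", ....
// Note: from Play JSON 2.4 onward, `\` returns a JsLookupResult; use (json \ "subObjects").get or .as[JsArray].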

Other libraries should have similar features.
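
If the goal is the String array mentioned in the question, one way to get there from the JSON4s lookup above is to render each array element back to compact text. This is only a sketch: `SubObjectsAsStrings` and `subObjectStrings` are hypothetical names, the native backend is an assumption (the Jackson backend offers the same `parse`, `render` and `compact` methods), and the document is still parsed into an AST, just not into case classes.

```scala
import org.json4s._
import org.json4s.native.JsonMethods._ // assumption: native backend; jackson works the same way here

object SubObjectsAsStrings {
  /** Hypothetical helper: each element of "subObjects" as its own compact JSON string. */
  def subObjectStrings(text: String): Array[String] = {
    val json = parse(text)
    (json \ "subObjects") match {
      // JArray wraps a List[JValue]; render each element back to text
      case JArray(children) => children.map(child => compact(render(child))).toArray
      // Key missing or not an array: nothing to return
      case _                => Array.empty[String]
    }
  }
}
```

`SubObjectsAsStrings.subObjectStrings(text)` then gives one array element per child object, which can be written out or passed on without defining any case classes.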

mohit