5

Given the following JSON...

{
  "metadata": {
    "id": "1234",
    "type": "file",
    "length": 395
  }
}

... how do I convert it to

{
  "metadata.id": "1234",
  "metadata.type": "file",
  "metadata.length": 395
}

Thanks.

j3d

6 Answers

10

You can do this pretty concisely with Play's JSON transformers. The following is off the top of my head, and I'm sure it could be greatly improved on:

import play.api.libs.json._

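// Read the "metadata" object, prune it, then re-add each field under a dotted key.
// String path segments are used because symbol literals ('metadata) are deprecated in newer Scala versions.
val flattenMeta = (__ \ "metadata").read[JsObject].flatMap(
  _.fields.foldLeft((__ \ "metadata").json.prune) {
    case (acc, (k, v)) => acc andThen __.json.update(
      Reads.of[JsObject].map(_ + (s"metadata.$k" -> v))
    )
  }
)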

And then:

val json = Json.parse("""
  {
    "metadata": {
      "id": "1234",
      "type": "file",
      "length": 395
    }
  }
""")

And:

scala> json.transform(flattenMeta).foreach(Json.prettyPrint _ andThen println)
{
  "metadata.id" : "1234",
  "metadata.type" : "file",
  "metadata.length" : 395
}

Just change the path if you want to handle metadata fields somewhere else in the tree.


Note that using a transformer may be overkill here—see e.g. Pascal Voitot's input in this thread, where he proposes the following:

(json \ "metadata").as[JsObject].fields.foldLeft(Json.obj()) {
  case (acc, (k, v)) => acc + (s"metadata.$k" -> v)
}

It's not as composable, and you probably wouldn't want to use `as` in real code (it throws an exception if the value isn't a JsObject), but it may be all you need.
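For what it's worth, here is a minimal sketch of the same fold without `as`, assuming you would rather get an Option back than an exception. The names `MetadataFlattening` and `flattenMetadata` are made up for illustration, and it assumes Play JSON is on the classpath.

```scala
import play.api.libs.json._

object MetadataFlattening {
  // Hypothetical helper: returns None when "metadata" is missing or is not a JSON object.
  def flattenMetadata(json: JsValue): Option[JsObject] =
    (json \ "metadata").asOpt[JsObject].map { meta =>
      // Same fold as above: copy each field under a "metadata."-prefixed key.
      meta.fields.foldLeft(Json.obj()) {
        case (acc, (k, v)) => acc + (s"metadata.$k" -> v)
      }
    }
}
```

If you need to know why the lookup failed, `validate[JsObject]` returns a JsResult carrying the error instead of collapsing it to None.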

Travis Brown
  • Beautiful Json transformer :) – Julien Lafont Jun 18 '14 at 07:52
  • 1
    Thanks, @JulienLafont, but actually I don't particularly like it—I feel like the code still kind of obscures the intent. There's probably a better way to do it. – Travis Brown Jun 18 '14 at 13:10
  • Still seems tight to me when wrapped in a function that takes the parameters: `def doDotNotation(jsPath: JsPath, prefix: String): JsObject`. Useful when called recursively to add dot notation to nested `JsObjects` – jesus g_force Harris Jan 12 '20 at 00:31
9

This is definitely not trivial, but it is possible by flattening recursively. I haven't tested this thoroughly, but it works with your example and some other basic ones I've come up with using arrays:

object JsFlattener {

    def apply(js: JsValue): JsValue = flatten(js).foldLeft(JsObject(Nil))(_++_.as[JsObject])

    def flatten(js: JsValue, prefix: String = ""): Seq[JsValue] = {
        js.as[JsObject].fieldSet.toSeq.flatMap{ case (key, values) =>
            values match {
                case JsBoolean(x) => Seq(Json.obj(concat(prefix, key) -> x))
                case JsNumber(x) => Seq(Json.obj(concat(prefix, key) -> x))
                case JsString(x) => Seq(Json.obj(concat(prefix, key) -> x))
                case JsArray(seq) => seq.zipWithIndex.flatMap{ case (x, i) => flatten(x, concat(prefix, key + s"[$i]")) }  
                case x: JsObject => flatten(x, concat(prefix, key))
                case _ => Seq(Json.obj(concat(prefix, key) -> JsNull))
            }
        }
    }

    def concat(prefix: String, key: String): String = if(prefix.nonEmpty) s"$prefix.$key" else key

}

JsObject has the fieldSet method that returns a Set[(String, JsValue)], which I mapped, matched against the JsValue subclass, and continued consuming recursively from there.

You can use this example by passing a JsValue to apply:

val json = Json.parse("""
    {
      "metadata": {
        "id": "1234",
        "type": "file",
        "length": 395
      }
    }
""")
JsFlattener(json)

We'll leave it as an exercise to the reader to make the code more beautiful looking.

Michael Zajac
  • 1
    I think "js.as[JsObject]" fails when this is not passed an object. "seq.zipWithIndex.flatMap{ case (x, i) => flatten(x..." will then only work if every element of the array is an object. Any instance of "x": ["a","b"] will throw an exception as flatten(x...) is passed a JsString, not a JsObject. – ADDruid May 16 '16 at 22:42
  • I am new to Scala. How do I call this function, and which libraries do I need to import for JsValue? If we pass a DataFrame instead, what changes should we make to this code? – MD Rijwan Oct 09 '19 at 09:01
4

Here's my take on this problem, based on @Travis Brown's 2nd solution.

It recursively traverses the JSON and prefixes each key with its parent's key.

def flatten(js: JsValue, prefix: String = ""): JsObject = js.as[JsObject].fields.foldLeft(Json.obj()) {
    case (acc, (k, v: JsObject)) => {
        if(prefix.isEmpty) acc.deepMerge(flatten(v, k))
        else acc.deepMerge(flatten(v, s"$prefix.$k"))
    }
    case (acc, (k, v)) => {
        if(prefix.isEmpty) acc + (k -> v)
        else acc + (s"$prefix.$k" -> v)
    }
}

which turns this:

{
  "metadata": {
    "id": "1234",
    "type": "file",
    "length": 395
  },
  "foo": "bar",
  "person": {
    "first": "peter",
    "last": "smith",
    "address": {
      "city": "Ottawa",
      "country": "Canada"
    }
  }
}

into this:

{
  "metadata.id": "1234",
  "metadata.type": "file",
  "metadata.length": 395,
  "foo": "bar",
  "person.first": "peter",
  "person.last": "smith",
  "person.address.city": "Ottawa",
  "person.address.country": "Canada"
}
Trev
1

@Trev has the best solution here, completely generic and recursive, but it's missing a case for array support. I'd like something that works in this scenario:

Turn this:

{
  "metadata": {
    "id": "1234",
    "type": "file",
    "length": 395
  },
  "foo": "bar",
  "person": {
    "first": "peter",
    "last": "smith",
    "address": {
      "city": "Ottawa",
      "country": "Canada"
    },
    "kids": ["Bob", "Sam"]
  }
}

into this:

{
  "metadata.id": "1234",
  "metadata.type": "file",
  "metadata.length": 395,
  "foo": "bar",
  "person.first": "peter",
  "person.last": "smith",
  "person.address.city": "Ottawa",
  "person.address.country": "Canada",
  "person.kids[0]": "Bob",
  "person.kids[1]": "Sam"
}

I've arrived at this, which appears to work, but seems overly verbose. Any help in making this pretty would be appreciated.

def flatten(js: JsValue, prefix: String = ""): JsObject = js.as[JsObject].fields.foldLeft(Json.obj()) {
  case (acc, (k, v: JsObject)) => {
    val nk = if(prefix.isEmpty) k else s"$prefix.$k"
    acc.deepMerge(flatten(v, nk))
  }
  case (acc, (k, v: JsArray)) => {
    val nk = if(prefix.isEmpty) k else s"$prefix.$k"
    val arr = flattenArray(v, nk).foldLeft(Json.obj())(_++_)
    acc.deepMerge(arr)
  }
  case (acc, (k, v)) => {
    val nk = if(prefix.isEmpty) k else s"$prefix.$k"
    acc + (nk -> v)
  }
}

def flattenArray(a: JsArray, k: String = ""): Seq[JsObject] = {
  flattenSeq(a.value.zipWithIndex.map {
    case (o: JsObject, i: Int) =>
      flatten(o, s"$k[$i]")
    case (o: JsArray, i: Int) =>
      flattenArray(o, s"$k[$i]")
    case a =>
      Json.obj(s"$k[${a._2}]" -> a._1)
  })
}

def flattenSeq(s: Seq[Any], b: Seq[JsObject] = Seq()): Seq[JsObject] = {
  s.foldLeft[Seq[JsObject]](b){
    case (acc, v: JsObject) =>
      acc:+v
    case (acc, v: Seq[Any]) =>
      flattenSeq(v, acc)
  }
}
ADDruid
  • How can I call this method for a large number of JSON strings spread across multiple files? – oortcloud_domicile May 15 '18 at 19:06
  • Hey, I am also looking for a similar kind of solution, but I want my JSON to be split into N JSON documents according to the N elements. Here is my question, which is still unanswered: https://stackoverflow.com/questions/51668341/flat-nested-json-to-header-level-using-scala – Naman Agarwal Sep 27 '18 at 07:49
0

Thanks m-z, it is very helpful. (I'm not so familiar with Scala.)

I'd like to add a line so that "flatten" also works with arrays of primitives, such as {"metadata": ["aaa", "bob"]}.

  def flatten(js: JsValue, prefix: String = ""): Seq[JsValue] = {

    // Elements of a primitive JSON array can't be converted to JsObject, so emit them as leaves
    if(!js.isInstanceOf[JsObject]) return Seq(Json.obj(prefix -> js))

    js.as[JsObject].fieldSet.toSeq.flatMap{ case (key, values) =>
      values match {
        case JsBoolean(x) => Seq(Json.obj(concat(prefix, key) -> x))
        case JsNumber(x) => Seq(Json.obj(concat(prefix, key) -> x))
        case JsString(x) => Seq(Json.obj(concat(prefix, key) -> x))
        case JsArray(seq) => seq.zipWithIndex.flatMap{ case (x, i) => flatten(x, concat(prefix, key + s"[$i]")) }
        case x: JsObject => flatten(x, concat(prefix, key))
        case _ => Seq(Json.obj(concat(prefix, key) -> JsNull))
      }
    }
  }
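The same guard can also be expressed as a pattern match on the value itself, which avoids both the early return and the `as[JsObject]` call. The following is only a sketch under that assumption: the object name `JsFlattenerSketch` is made up, and it collects (key, value) pairs rather than one-field objects, which is a different shape from the code above.

```scala
import play.api.libs.json._

object JsFlattenerSketch {
  // Join a parent prefix and a child key with a dot, unless there is no prefix yet.
  private def concat(prefix: String, key: String): String =
    if (prefix.nonEmpty) s"$prefix.$key" else key

  // Collect (flattenedKey, leafValue) pairs by matching on the JSON value itself.
  def flatten(js: JsValue, prefix: String = ""): Seq[(String, JsValue)] = js match {
    case obj: JsObject =>
      obj.fields.toSeq.flatMap { case (k, v) => flatten(v, concat(prefix, k)) }
    case JsArray(items) =>
      items.toSeq.zipWithIndex.flatMap { case (v, i) => flatten(v, s"$prefix[$i]") }
    case leaf =>
      // Strings, numbers, booleans and null are kept as they are.
      Seq(prefix -> leaf)
  }

  def apply(js: JsValue): JsObject = JsObject(flatten(js))
}
```

With {"metadata": ["aaa", "bob"]} this yields the keys metadata[0] and metadata[1]. Note that empty arrays and empty objects disappear from the result, and a top-level scalar ends up under an empty key, so adjust those cases if they matter to you.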
0

Based on the previous solutions, I have tried to simplify the code a bit:

  def getNewKey(oldKey: String, newKey: String): String = {
    if (oldKey.nonEmpty) oldKey + "." + newKey else newKey
  }

  def flatten(js: JsValue, prefix: String = ""): JsObject = {
    if (!js.isInstanceOf[JsObject]) return Json.obj(prefix -> js)
    js.as[JsObject].fields.foldLeft(Json.obj()) {
      case (o, (k, value)) => {
        o.deepMerge(value match {
          case x: JsArray => x.as[Seq[JsValue]].zipWithIndex.foldLeft(o) {
            case (o, (n, i: Int)) => o.deepMerge(
              flatten(n.as[JsValue], getNewKey(prefix, k) + s"[$i]")
            )
          }
          case x: JsObject => flatten(x, getNewKey(prefix, k))
          case x => Json.obj(getNewKey(prefix, k) -> x.as[JsValue])
        })
      }
    }
  }
ghosts