
For example, take the JSON below. I took it from the Amazon website, but I think it works for the question.

{
    "player": {
        "username": "user1",
        "characteristics": {
            "race": "Human",
            "class": "Warlock",
            "subclass": "Dawnblade",
            "power": 300,
            "playercountry": "USA"
        },
        "arsenal": {
            "kinetic": {
                "name": "Sweet Business",
                "type": "Auto Rifle",
                "power": 300,
                "element": "Kinetic"
            },
            "energy": {
                "name": "MIDA Mini-Tool",
                "type": "Submachine Gun",
                "power": 300,
                "element": "Solar"
            },
            "power": {
                "name": "Play of the Game",
                "type": "Grenade Launcher",
                "power": 300,
                "element": "Arc"
            }
        },
        "armor": {
            "head": "Eye of Another World",
            "arms": "Philomath Gloves",
            "chest": "Philomath Robes",
            "leg": "Philomath Boots",
            "classitem": "Philomath Bond"
        },
        "location": {
            "map": "Titan",
            "waypoint": "The Rig"
        }
    }
} 

I want to convert this to the flattened form below and save it as Avro. I am new to Spark programming, and coming from a Java background I find the functional style a bit difficult to wrap my head around. Please at least guide me so that I can write the code myself.

{
    "player.username": "user1",
    "player.characteristics.race": "Human",
    "player.characteristics.class": "Warlock",
    "player.characteristics.subclass": "Dawnblade",
    "player.characteristics.power": 300,
    "player.characteristics.playercountry": "USA",
    "player.arsenal.kinetic.name": "Sweet Business",
    "player.arsenal.kinetic.type": "Auto Rifle",
    "player.arsenal.kinetic.power": 300,
    "player.arsenal.kinetic.element": "Kinetic",
    "player.arsenal.energy.name": "MIDA Mini-Tool",
    "player.arsenal.energy.type": "Submachine Gun",
    "player.arsenal.energy.power": 300,
    "player.arsenal.energy.element": "Solar",
    "player.arsenal.power.name": "Play of the Game",
    "player.arsenal.power.type": "Grenade Launcher",
    "player.arsenal.power.power": 300,
    "player.arsenal.power.element": "Arc",
    "player.armor.head": "Eye of Another World",
    "player.armor.arms": "Philomath Gloves",
    "player.armor.chest": "Philomath Robes",
    "player.armor.leg": "Philomath Boots",
    "player.armor.classitem": "Philomath Bond",
    "player.location.map": "Titan",
    "player.location.waypoint": "The Rig"
}
Didier Dupont
oortcloud_domicile

1 Answer


This is not really Spark-specific: you just need a function that flattens a nested JSON string into a new JSON string. Assuming you start with an RDD[String], the code would look something like this:

rdd.mapPartitions(ps => ps.map(jsonFlatten)) 

Then convert the resulting RDD to a DataFrame.
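
The two steps above could be wired together roughly as follows. This is only a sketch under some assumptions: `FlattenPipeline`, `readFlattened`, and `inputPath` are hypothetical names, the flattening function is passed in as a plain `String => String`, and each input line is assumed to hold one complete JSON document (the pretty-printed example in the question would first have to be collapsed to a single line, or read per file with `wholeTextFiles`). It also assumes Spark 2.2 or later, where `spark.read.json` accepts a `Dataset[String]`.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

object FlattenPipeline {
  // Hypothetical helper: reads JSON text, flattens each record, and lets
  // Spark infer a DataFrame schema from the flattened strings.
  def readFlattened(spark: SparkSession,
                    inputPath: String,
                    flatten: String => String): DataFrame = {
    import spark.implicits._ // provides the Encoder[String] for createDataset

    // textFile accepts a directory, a glob, or a comma-separated list of
    // paths, so several files can be read in one call.
    // Assumption: one JSON document per line.
    val lines: RDD[String] = spark.sparkContext.textFile(inputPath)

    // Same shape as the mapPartitions call in the answer.
    val flattened: RDD[String] = lines.mapPartitions(ps => ps.map(flatten))

    // Convert the RDD to a DataFrame by parsing the flattened JSON.
    spark.read.json(spark.createDataset(flattened))
  }
}
```

Because the elements here are `String` values from the start, the function passed to `mapPartitions` never sees a `Row`.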

A sample jsonFlatten can be found here:

Play [Scala]: How to flatten a JSON object
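
For orientation, a `jsonFlatten` along those lines might look like the sketch below. It is not the code from the linked question; it is a minimal version assuming the Play JSON library (`play-json`) is on the classpath. `JsonFlatten` and `flattenFields` are hypothetical names, arrays are left untouched as leaf values, and empty nested objects are dropped.

```scala
import play.api.libs.json._

object JsonFlatten {
  // Recursively collect (dotted path, leaf value) pairs from a JSON value.
  // Anything that is not an object (strings, numbers, arrays, null) is a leaf.
  def flattenFields(js: JsValue, prefix: String = ""): Seq[(String, JsValue)] =
    js match {
      case JsObject(fields) =>
        fields.toSeq.flatMap { case (key, value) =>
          val path = if (prefix.isEmpty) key else s"$prefix.$key"
          flattenFields(value, path)
        }
      case leaf =>
        Seq(prefix -> leaf)
    }

  // String => String wrapper, which is the shape ps.map(jsonFlatten) expects
  // when the RDD holds JSON strings. Assumes the top-level value is an object.
  def jsonFlatten(json: String): String =
    Json.stringify(JsObject(flattenFields(Json.parse(json))))
}
```

Applied to the JSON in the question, this should produce keys such as `player.username` and `player.arsenal.kinetic.name`.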

Binzi Cao
  • How can I use those flatten methods while reading multiple JSON strings from multiple files? – oortcloud_domicile May 15 '18 at 18:56
  • I get the error below when trying to use the method you mentioned, and I am not sure what I am doing wrong. Type mismatch, expected: Iterator[Row] => Iterator[NotInferedU], actual: Iterator[Row] => Any. Type mismatch, expected: Row => NotInferedB, actual: (JsValue, String) => JsObject. Type mismatch, expected: Row => NotInferedB, actual: (JsValue, String) => JsObject. – oortcloud_domicile May 15 '18 at 19:14
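
The type mismatch in the last comment suggests that the function was applied to `Row` values, that is, to a DataFrame that Spark had already parsed, rather than to JSON strings. If the data is already a DataFrame, a different technique from the one in the answer is to flatten the schema itself by selecting every nested leaf field under a new name. The sketch below assumes only Spark SQL; `FlattenColumns`, `leafPaths`, and `flatten` are hypothetical names, and array columns are kept as they are.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{StructField, StructType}

object FlattenColumns {
  // Walk the schema and return the path to every non-struct (leaf) field,
  // e.g. Seq("player", "arsenal", "kinetic", "name").
  def leafPaths(schema: StructType, prefix: Seq[String] = Nil): Seq[Seq[String]] =
    schema.fields.toSeq.flatMap {
      case StructField(name, inner: StructType, _, _) =>
        leafPaths(inner, prefix :+ name)
      case StructField(name, _, _, _) =>
        Seq(prefix :+ name)
    }

  // Select each leaf as a top-level column named by its joined path.
  // The separator is a parameter because, as far as I know, the Avro
  // specification only allows letters, digits, and underscores in field
  // names, so dotted names may be rejected when writing Avro.
  def flatten(df: DataFrame, separator: String = "_"): DataFrame = {
    val columns: Seq[Column] = leafPaths(df.schema).map { path =>
      // Backticks keep each path segment literal when resolving the column.
      col(path.map(p => s"`$p`").mkString(".")).alias(path.mkString(separator))
    }
    df.select(columns: _*)
  }
}
```

With Spark 2.4 or later and the external `spark-avro` module on the classpath, the result could then be saved with something like `FlattenColumns.flatten(df).write.format("avro").save(outputPath)`, where `outputPath` is a placeholder.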