
I have a JSON file with 6 records, and one of the columns (p_ccodes) contains the values below:

rec-1: "p_ccodes" : [ ],
rec-2: "p_ccodes" : [ [ "FLASHSALE" ] ],
rec-3: "p_ccodes" : [ [ "GRATISONGKIR" ] ],
rec-4: "p_ccodes" : [ [ "SAYALI13" ] ],
rec-5: "p_ccodes" : [ [ "testCappingIndo" ] ],
rec-6: "p_ccodes" : [ ],

I tried the code below:

df.withColumn("p_ccodes", explode(col("p_ccodes"))).withColumn("p_ccodes", explode(col("p_ccodes")))

The output for that column is shown below. The values are as expected, but I need all 6 records and I am getting only 4.

Output:

+--------------------+
|p_appliedcouponcodes|
+--------------------+
|           FLASHSALE|
|        GRATISONGKIR|
|            SAYALI13|
|     testCappingIndo|
+--------------------+

Please suggest how I can get all 6 records, with a null value for the other two records.
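
The two missing rows look consistent with how explode treats empty arrays: it emits one output row per array element, so a record whose array is empty contributes no row at all. The sketch below is a plain-Python model of that behaviour using the six values from the question. It is not Spark code, and `pad_empty` and `explode_keep_empty` are hypothetical helper names introduced only for illustration.

```python
# Plain-Python model of the six p_ccodes values from the question.
records = [
    [],
    [["FLASHSALE"]],
    [["GRATISONGKIR"]],
    [["SAYALI13"]],
    [["testCappingIndo"]],
    [],
]


def explode(rows):
    """Model of explode: one output row per element, so an empty or
    missing array produces no output row at all."""
    return [item for row in rows for item in (row or [])]


def pad_empty(row):
    """Hypothetical helper: swap an empty or missing array for [None]
    so that explode still emits one (null) row for it."""
    return row if row else [None]


def explode_keep_empty(rows):
    """Model of an outer explode: pad first, then explode."""
    return explode([pad_empty(row) for row in rows])


# Two plain explodes, as in the question: the two empty records vanish.
dropped = explode(explode(records))

# Padding before each explode keeps all six records.
kept = explode_keep_empty(explode_keep_empty(records))

print(len(dropped), dropped)
print(len(kept), kept)
```

In Spark, `explode_outer` provides this outer behaviour, but it was only added in version 2.2, which would explain the error on 2.1. On 2.1 the same padding idea could probably be expressed before each explode with `when`, `size`, `array` and `lit` (a null literal cast to the element type); that variant is an assumption and has not been run against the asker's data.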

Alper t. Turker
  • I checked those options but could not get them to work. I tried explode_outer but got an error, possibly because I am using version 2.1. – chalapathi p Apr 10 '18 at 12:25
  • I think your explode will work properly if you have "p_ccodes" : [[ ] ] for an empty array of arrays – Abhi Apr 10 '18 at 15:31
  • Thanks, Abhi, for the suggestion. The source file I receive has a single pair of square brackets with a space ([ ]). Can you suggest how to change that and resolve the problem? Thanks in advance. – chalapathi p Apr 11 '18 at 10:21
