I need to capture any JSON path that does not match the JSON paths hard-coded in my code. The source JSON data I receive is huge and its schema is unpredictable.
The schema varies as in the sample JSON values below.
JSON Value 1: (Known Schema)
main.pull.notify.roit.lerk
[{
  "main": {
    "pull": {
      "notify": {
        "roit": {
          "lerk": "value_a"
        }
      }
    }
  }
}]
JSON Value 2: (Unknown Schema)
main.pull.notify.roit.late_lerk
[{
  "main": {
    "pull": {
      "notify": {
        "roit": {
          "late_lerk": "value_a"
        }
      }
    }
  }
}]
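To make the dotted paths above concrete, here is a minimal sketch in plain Python (not Spark) that lists every leaf path in a parsed JSON document. The helper name leaf_paths is hypothetical, and treating list elements as sharing their parent's path is an assumption chosen to match the dotted notation used in this question.

```python
import json


def leaf_paths(node, prefix=""):
    """Yield the dotted path to every leaf value in parsed JSON."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from leaf_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        # Assumption: list elements share the parent's path (no index in the path).
        for child in node:
            yield from leaf_paths(child, prefix)
    else:
        yield prefix


# "JSON Value 2" from the question, written on one line.
sample = json.loads(
    '[{"main": {"pull": {"notify": {"roit": {"late_lerk": "value_a"}}}}}]'
)
paths = sorted(set(leaf_paths(sample)))
print(paths)
```

Running the same helper over "JSON Value 1" would yield the known path instead, so the two samples differ only in their final path segment.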
Below is the code I currently use to capture the JSON values based on the Known Schema:
from pyspark.sql.functions import when
df = df.withColumn('lerk_value', when(df.main.pull.notify.roit.lerk.isNotNull(), df["main.pull.notify.roit.lerk"]).otherwise(""))
Suppose that in today's run my code reads the data from the source and, because the source JSON is unpredictable, the Known Schema does not appear. Instead, the Unknown Schema arrives carrying the same valid data (value_a). The code then fails with a schema mismatch, because main.pull.notify.roit.lerk is no longer present.
Is it possible to build a fail-safe that captures the path of the unknown schema (for example main.pull.notify.roit.late_lerk), prints it, and then lets the code continue?
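One possible shape for such a fail-safe, as a hedged sketch rather than a definitive answer: walk the schema, collect every leaf path, and print the ones that are not in a known list. The code below works on a plain dict and assumes that dict has the layout returned by df.schema.jsonValue() in PySpark. The names KNOWN_PATHS, schema_leaf_paths, report_unknown_paths and struct_of are hypothetical, and the sample schema is hand-built so the sketch runs without Spark.

```python
# Hypothetical set of paths the job already knows how to handle.
KNOWN_PATHS = {"main.pull.notify.roit.lerk"}


def schema_leaf_paths(dtype, prefix=""):
    """Return dotted paths of all leaf fields in a schema dict.

    Assumes the layout of df.schema.jsonValue(): structs are
    {"type": "struct", "fields": [...]}, arrays carry "elementType",
    and atomic types are plain strings such as "string".
    """
    if isinstance(dtype, dict) and dtype.get("type") == "struct":
        paths = []
        for field in dtype["fields"]:
            name = f"{prefix}.{field['name']}" if prefix else field["name"]
            paths.extend(schema_leaf_paths(field["type"], name))
        return paths
    if isinstance(dtype, dict) and dtype.get("type") == "array":
        # Array elements share the parent's path.
        return schema_leaf_paths(dtype["elementType"], prefix)
    return [prefix]  # atomic type, or anything else, treated as a leaf


def report_unknown_paths(schema_dict, known=KNOWN_PATHS):
    """Print and return every leaf path that is not in the known set."""
    unknown = sorted(set(schema_leaf_paths(schema_dict)) - set(known))
    for path in unknown:
        print(f"Unknown JSON path: {path}")
    return unknown


def struct_of(name, dtype):
    """Build a one-field struct in the assumed jsonValue() layout."""
    field = {"name": name, "type": dtype, "nullable": True, "metadata": {}}
    return {"type": "struct", "fields": [field]}


# Hand-written stand-in for df.schema.jsonValue() on "JSON Value 2".
todays_schema = "string"
for part in reversed("main.pull.notify.roit.late_lerk".split(".")):
    todays_schema = struct_of(part, todays_schema)

unknown = report_unknown_paths(todays_schema)
```

In the Spark job, the same set of found paths could also guard the existing withColumn call: apply it only when main.pull.notify.roit.lerk is among the found paths, and otherwise fall back to an empty-string literal so the run continues instead of failing.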
Help is much appreciated!!
Thanks in advance.