I have a DataFrame in Spark in which one column (col3) holds JSON-like data:
z:{
k:{
q1:null,
q2:1,
q3:23,
q4:null,
q5:{v1:null, v2:wers, v3:null}
a1:['sdsad','wqeqw'],
d1:'123_23'
},
l:{
w1:wwew
w2:null
w4:123
}
}
How can I process the content of the JSON above and perform operations on it, such as splitting the value of d1 ('123_23') on '_' and adding the result as another column in the DataFrame?
How can I count how many keys inside the JSON have non-null values? And if there is an array, how can I count its elements?
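To make the counting part of the question concrete, here is a minimal plain-Python sketch of what I mean by "keys" and "non-null values". It assumes the "k" object has already been parsed into a Python dict (the sample is not strictly valid JSON, so the parsing step is left out), and the helper names `count_keys` and `count_non_null` are hypothetical, made up for illustration only.

```python
from typing import Any, Mapping


def count_keys(obj: Mapping[str, Any]) -> int:
    """Number of keys directly under obj (nested objects are not descended into)."""
    return len(obj)


def count_non_null(obj: Mapping[str, Any]) -> int:
    """Number of keys directly under obj whose value is not None."""
    return sum(1 for value in obj.values() if value is not None)


# Assumed parsed form of the "k" object from the sample above.
k = {
    "q1": None,
    "q2": 1,
    "q3": 23,
    "q4": None,
    "q5": {"v1": None, "v2": "wers", "v3": None},
    "a1": ["sdsad", "wqeqw"],
    "d1": "123_23",
}

total_keys = count_keys(k)         # keys directly under "k"
non_null_keys = count_non_null(k)  # keys under "k" with a non-null value
a1_length = len(k["a1"])           # number of elements in the array "a1"
```

A nested object such as q5 counts as one non-null value here; its own keys are counted separately by calling the same helpers on `k["q5"]`.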
Here is an example row from the DataFrame:
col1 : gf23431
col2 : 6728103
col3 : "z:{
k:{
q1:null,
q2:1,
q3:23,
q4:null,
q5:{v1:null, v2:wers, v3:null}
a1:['sdsad','wqeqw'],
d1:'123_23'
},
l:{
w1:wwew
w2:null
w4:123
}
}"
col4 : 3658
Desired Output columns:
Total keys under "k:" 7
Total non-null values under key "k:" 5 //count of keys having non-null values
Total keys under key "q5:" 3
Total non-null values under key "q5:" 1
Total values under "a1:" 2
Split the value under "d1:" and add another column 246 //multiply the 1st value by 2 and add it as another column in the DataFrame
So the output columns will be:
col5 : 7
col6 : 5
col7 : 3
col8 : 1
col9 : 2
col10: 246
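For reference, the per-row logic I am after can be sketched in plain Python as below. This is only an illustration under assumptions: col3 is taken to be already parsed into a nested dict, and `derive_columns` is a hypothetical name, not an existing Spark function.

```python
from typing import Any, Dict, Mapping


def derive_columns(doc: Mapping[str, Any]) -> Dict[str, int]:
    """Compute col5..col10 for one row from the parsed content of col3."""
    k = doc["z"]["k"]
    q5 = k["q5"]
    # Split d1 on "_" and keep the first part, e.g. "123_23" -> "123".
    first_part = k["d1"].split("_")[0]
    return {
        "col5": len(k),                                   # total keys under "k"
        "col6": sum(v is not None for v in k.values()),   # non-null values under "k"
        "col7": len(q5),                                  # total keys under "q5"
        "col8": sum(v is not None for v in q5.values()),  # non-null values under "q5"
        "col9": len(k["a1"]),                             # elements in the array "a1"
        "col10": int(first_part) * 2,                     # first part of d1 times 2
    }


# Assumed parsed form of col3 from the example row.
sample = {
    "z": {
        "k": {
            "q1": None,
            "q2": 1,
            "q3": 23,
            "q4": None,
            "q5": {"v1": None, "v2": "wers", "v3": None},
            "a1": ["sdsad", "wqeqw"],
            "d1": "123_23",
        },
        "l": {"w1": "wwew", "w2": None, "w4": 123},
    }
}

result = derive_columns(sample)
```

In Spark I assume a function like this could be wrapped in a UDF, or expressed with built-in functions such as `split` and `size` once col3 is parsed into a struct, but I am not sure which approach is appropriate.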