I am using PySpark to produce a nested JSON that looks like this:
{
  "batch_key": 1,
  "client_key": 1,
  "client_name": "ABC",
  "Claims": [
    {
      "claim_key": "A",
      "client_key": "B",
      "client_name": "ATT"
    },
    {
      "claim_key": "B",
      "client_key": "B",
      "client_name": "ATT"
    }
  ]
}
Ideally, though, it should be split into separate documents, one per claim, like this:
{
  "batch_key": 1,
  "client_key": 1,
  "client_name": "ABC",
  "Claims": [
    {
      "claim_key": "A",
      "client_key": "B",
      "client_name": "ATT"
    }
  ]
}
{
  "batch_key": 1,
  "client_key": 1,
  "client_name": "ABC",
  "Claims": [
    {
      "claim_key": "B",
      "client_key": "B",
      "client_name": "ATT"
    }
  ]
}
The actual JSON payload will be much bigger, so the split above is needed for the API to consume it properly.
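For what it's worth, I was considering something along these lines with explode(), where "batch.json" is just a placeholder for wherever the nested document lands:

from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.getOrCreate()

# multiLine=True because the file holds one pretty-printed JSON document
df = spark.read.json("batch.json", multiLine=True)

# explode() gives one row per element of the Claims array, then array()
# wraps each single claim back up so every row keeps the original schema
split_df = (
    df.withColumn("claim", F.explode("Claims"))
      .drop("Claims")
      .withColumn("Claims", F.array("claim"))
      .drop("claim")
)

# each row now serializes to one JSON document holding exactly one claim;
# collect() is only for illustration here, since the real payload is large
for doc in split_df.toJSON().collect():
    print(doc)

Is this a reasonable way to achieve the split using Spark SQL/PySpark/Python, or is there a better approach?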