I'm facing a problem with a JSON file in which the same key sometimes holds a flat value, while at other times it holds an additional nested (and, for my purposes, unnecessary) level that contains the actual value.
The file is newline-delimited and I am trying to get rid of these extra levels. So far I've only managed to do that when the nested level appears in the first branch of the tree, using
jq -c '[.] | map(.[] |= if type == "object" and (.numberLong | length) > 0 then .numberLong else . end) | .[]' mongoDB.json
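For the recursive case, jq's builtin walk/1 (available since jq 1.5) might be a better fit, since it applies a filter to every sub-value at any depth rather than only the top level. A minimal sketch on inline sample data:

```shell
# walk/1 visits every value bottom-up, so any object that has a
# "numberLong" key is replaced by that key's value, at any depth.
printf '%s\n' \
  '{"name":"John","age":{"numberLong":22}}' \
  '{"name":"Frances","details":[{"telephone_number":{"numberLong":444245523}}]}' |
jq -c 'walk(if type == "object" and has("numberLong") then .numberLong else . end)'
# {"name":"John","age":22}
# {"name":"Frances","details":[{"telephone_number":444245523}]}
```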
The example below illustrates that further. What I have initially:
{
  "name": "John",
  "age": {
    "numberLong": 22
  }
}
{
  "name": "Jane",
  "age": 24
}
{
  "name": "Dennis",
  "age": 34,
  "details": [
    {
      "telephone_number": 555124124
    }
  ]
}
{
  "name": "Frances",
  "details": [
    {
      "telephone_number": {
        "numberLong": 444245523
      }
    }
  ]
}
What my script does (the second numberLong is ignored):
{
  "name": "John",
  "age": 22
}
{
  "name": "Jane",
  "age": 24
}
{
  "name": "Dennis",
  "age": 34,
  "details": [
    {
      "telephone_number": 555124124
    }
  ]
}
{
  "name": "Frances",
  "details": [
    {
      "telephone_number": {
        "numberLong": 444245523
      }
    }
  ]
}
What I am actually hoping to achieve (recursively copy the values of all numberLong keys one level up, regardless of where they appear in the file):
[
  {
    "name": "John",
    "age": 22
  },
  {
    "name": "Jane",
    "age": 24
  },
  {
    "name": "Dennis",
    "age": 34,
    "details": [
      {
        "telephone_number": 555124124
      }
    ]
  },
  {
    "name": "Frances",
    "details": [
      {
        "telephone_number": 444245523
      }
    ]
  }
]
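To get the records collected into a single array like that, slurping the newline-delimited input with -s before applying a recursive unwrap might work. A sketch with inline sample data (walk/1 requires jq 1.5+):

```shell
# -s reads all records into one array; walk/1 then unwraps any object
# carrying a "numberLong" key, wherever it sits in the structure.
printf '%s\n' \
  '{"name":"Jane","age":24}' \
  '{"name":"John","age":{"numberLong":22}}' |
jq -sc 'walk(if type == "object" and has("numberLong") then .numberLong else . end)'
# [{"name":"Jane","age":24},{"name":"John","age":22}]
```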
This transformation is part of a daily pipeline and is applied to several files up to 70 GB in size, so traversal speed could become an issue. The problem stems from MongoDB's distinct numeric types (see: MongoDB differences between NumberLong and simple Integer?).
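Because the input is newline-delimited, the transformation can also be done one record at a time in constant memory, which may matter at the 70 GB scale. A hedged Python sketch of the same unwrapping (function names are my own, not from any library):

```python
import json

def unwrap(value):
    """Recursively replace objects that carry a "numberLong" key
    with that key's value, mirroring the jq filter."""
    if isinstance(value, dict):
        if "numberLong" in value:
            return unwrap(value["numberLong"])
        return {k: unwrap(v) for k, v in value.items()}
    if isinstance(value, list):
        return [unwrap(v) for v in value]
    return value

def flatten_ndjson(lines):
    """Process newline-delimited JSON record by record, so memory use
    stays bounded by the largest single record, not the file size."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.dumps(unwrap(json.loads(line)), separators=(",", ":"))
```

Iterating over a file object feeds `flatten_ndjson` one line at a time, so the whole file is never loaded at once.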
Thanks!