I was trying to achieve the same goal.
I use Python 3.x and PyMongo.
My solution works for both nested dicts and nested lists (amazing, huh?).
The function returns a sorted list of strings containing every field path.
Let's take this nested Mongo document:
    {
        "_id" : ObjectId("5d5166ab7773870c1e57c638"),
        "eventId" : NumberLong(1),
        "estimatedCharacterization" : {
            "snrVerticalAxis" : 5.0,
            "snrHorizontalAxis" : 5.0,
            "distance" : 0.0,
            "magnitude" : 0.0
        },
        "definition" : {
            "requestType" : "DUST_DEVIL",
            "requestedChannels" : [
                {
                    "startTime" : {
                        "utc" : ISODate("2003-06-25T22:35:37.000Z"),
                        "utcString" : "2003-176T22:35:37.000",
                        "lmst" : "-5482T10:06:09.956",
                        "sclk" : NumberLong(109852537000),
                        "lobt" : NumberLong(112493764823),
                        "aobt" : NumberLong(109857192936)
                    },
                    "endTime" : {
                        "utc" : ISODate("2003-06-26T05:27:27.000Z"),
                        "utcString" : "2003-177T05:27:27.000",
                        "lmst" : "-5482T16:46:58.822",
                        "sclk" : NumberLong(109877247000),
                        "lobt" : NumberLong(112519067586),
                        "aobt" : NumberLong(109881902665)
                    },
                    "minDwnSamplRate" : 0.05
                }
            ]
        },
    }
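For this document, my code returns the following paths (plus `_id`; the actual return value is a sorted list, so the order differs from the listing here):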
eventId
estimatedCharacterization.snrVerticalAxis
estimatedCharacterization.snrHorizontalAxis
estimatedCharacterization.distance
estimatedCharacterization.magnitude
definition.requestType
definition.requestedChannels.XX.minDwnSamplRate
definition.requestedChannels.XX.startTime.utc
definition.requestedChannels.XX.startTime.utcString
definition.requestedChannels.XX.startTime.lmst
definition.requestedChannels.XX.startTime.sclk
definition.requestedChannels.XX.startTime.lobt
definition.requestedChannels.XX.startTime.aobt
definition.requestedChannels.XX.endTime.utc
definition.requestedChannels.XX.endTime.utcString
definition.requestedChannels.XX.endTime.lmst
definition.requestedChannels.XX.endTime.sclk
definition.requestedChannels.XX.endTime.lobt
definition.requestedChannels.XX.endTime.aobt
Here is the code (based on the code from another answer, which was very clear):
    def find_all_keys(collection):
        """Return the sorted, de-duplicated field paths of all documents.

        `collection` must be an iterable of documents, e.g. the cursor from
        `collection.find()`; a PyMongo Collection itself is not iterable.
        """
        def find_keys_in_doc(doc, pre=""):
            found_keys = []
            if isinstance(doc, dict):
                for k, v in doc.items():
                    if isinstance(v, (dict, list)):
                        # Recurse into sub-documents and arrays, extending the prefix.
                        found_keys += find_keys_in_doc(v, pre=pre + k + ".")
                    else:
                        found_keys.append(pre + k)
            elif isinstance(doc, list):
                for v in doc:
                    if isinstance(v, (dict, list)):
                        # Array positions are replaced by the "XX" placeholder.
                        found_keys += find_keys_in_doc(v, pre=pre + "XX" + ".")
                    else:
                        # Scalar array items are recorded by value, e.g. "tags.red".
                        found_keys.append(pre + str(v))
            return found_keys

        all_keys = []
        for doc in collection:
            all_keys += find_keys_in_doc(doc)
        return sorted(set(all_keys))
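One thing to be aware of: in the function above, scalar items inside a list are appended by value, so a field like `"tags": ["a", "b"]` produces `tags.a` and `tags.b`. If you would rather get just `tags`, here is a possible variant, only a sketch and not tested against a real database. The names `iter_paths`, `find_all_paths` and the `placeholder` parameter are my own inventions for illustration, and it assumes you pass any iterable of plain dicts (for example a cursor from `find()`).

```python
def iter_paths(value, path="", placeholder="XX"):
    """Yield the dotted path of every leaf field in a nested dict/list."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from iter_paths(child, f"{path}.{key}" if path else key, placeholder)
    elif isinstance(value, list):
        for child in value:
            if isinstance(child, (dict, list)):
                # Array positions are replaced by the placeholder.
                child_path = f"{path}.{placeholder}" if path else placeholder
                yield from iter_paths(child, child_path, placeholder)
            else:
                # Assumption: a scalar list item is reported as the list field itself.
                yield path
    else:
        yield path


def find_all_paths(docs, placeholder="XX"):
    """Collect the unique paths of all documents, sorted."""
    return sorted({p for doc in docs for p in iter_paths(doc, placeholder=placeholder)})


# In-memory documents standing in for a query result.
docs = [
    {"_id": 1, "tags": ["a", "b"],
     "definition": {"requestedChannels": [{"minDwnSamplRate": 0.05}]}},
    {"_id": 2, "eventId": 7},
]
paths = find_all_paths(docs)
print(paths)
# ['_id', 'definition.requestedChannels.XX.minDwnSamplRate', 'eventId', 'tags']
```

Using a generator plus a set avoids building intermediate lists for every sub-document, which may matter on large collections.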
Have fun :)