
I have a document in the following format:

{
"glossary": {
    "title": "example glossary",
    "GlossDiv": {
        "title": "S",
        "GlossList": {
            "GlossEntry": {
                "ID": "SGML",
                "SortAs": "SGML",
                "GlossTerm": "Standard Generalized Markup Language",
                "Acronym": "SGML",
                "Abbrev": "ISO 8879:1986",
                "GlossDef": {
                    "para": "A meta-markup language, used to create markup languages such as DocBook.",
                    "GlossSeeAlso": ["GML", "XML"]
                },
                "GlossSee": "markup"
            }
        }
    }
}
}

I need to get all the keys, including nested keys, but I am only able to get the keys at the first level, e.g. glossary.

Can someone tell me whether there is any way to retrieve all the keys?

Swar
  • Have a look at [this](http://stackoverflow.com/questions/2997004/using-map-reduce-for-mapping-the-properties-in-a-collection). – thegreenogre May 06 '15 at 09:27
  • I don't want to retrieve documents from a collection. This is done before inserting the document into MongoDB. – Swar May 06 '15 at 10:46

2 Answers


You can use a recursive function to dig through every layer and print each key.

def recurse_keys(document):
    for key in document.keys():
        print(str(key))
        # Descend into nested documents
        if isinstance(document[key], dict):
            recurse_keys(document[key])

UPDATE: For Nested Format

def recurse_keys(document, parent=""):
    for key in document.keys():
        # Build the dotted path, e.g. glossary.GlossDiv.title
        path = parent + '.' + str(key) if parent != "" else str(key)
        print(path)
        # Descend into nested documents, passing the path as the new parent
        if isinstance(document[key], dict):
            recurse_keys(document[key], path)
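
The same idea can also be sketched as a generator that returns the dotted paths instead of printing them, and that descends into lists of sub-documents as well. The name `iter_key_paths`, the sample `doc` (a made-up variation of the question's document with a list of entries) and the choice to leave list indices out of the path are assumptions for illustration only.

```python
def iter_key_paths(document, parent=""):
    """Yield dotted key paths for every key in a nested dict (hypothetical helper)."""
    if isinstance(document, dict):
        for key, value in document.items():
            path = parent + "." + str(key) if parent else str(key)
            yield path
            # Recurse into the value; scalars yield nothing further.
            for sub_path in iter_key_paths(value, path):
                yield sub_path
    elif isinstance(document, list):
        # Assumption: list indices are not part of the path, so keys of
        # dicts inside a list are reported under the list's own key.
        for item in document:
            for sub_path in iter_key_paths(item, parent):
                yield sub_path


# Made-up sample: GlossList is a list here to exercise the list branch.
doc = {"glossary": {"title": "example glossary",
                    "GlossDiv": {"title": "S",
                                 "GlossList": [{"ID": "SGML"}, {"ID": "XML"}]}}}
paths = list(iter_key_paths(doc))
print(paths)
```

Because list indices are dropped, a key that appears in several list items is reported once per item; wrap the result in `set(...)` if only distinct paths are needed.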
thegreenogre
  • This works fine, thanks very much! But is there any way to get the keys in a nested format? – Swar May 06 '15 at 13:30
  • Yes, you can just pass the parent key as another argument and prepend it before printing the key. I will update my answer to reflect it. – thegreenogre May 06 '15 at 13:35
from re import findall
input_dict = {
"glossary": {
    "title": "example glossary",
    "GlossDiv": {
        "title": "S",
        "GlossList": {
            "GlossEntry": {
                "ID": "SGML",
                "SortAs": "SGML",
                "GlossTerm": "Standard Generalized Markup Language",
                "Acronym": "SGML",
                "Abbrev": "ISO 8879:1986",
                "GlossDef": {
                    "para": "A meta-markup language, used to create markup languages such as DocBook.",
                    "GlossSeeAlso": ["GML", "XML"]
                },
                "GlossSee": "markup"
            }
        }
    }
}
}
dict1 = str(input_dict)
# Match every single-quoted token that is immediately followed by a colon, i.e. a key
pattern = r"'([A-Za-z0-9_\./\\-]*)':"
m = findall(pattern, dict1)
print(m)

m is (the order of the keys depends on the dict ordering of the Python version used): ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'GlossSeeAlso', 'para', 'GlossSee', 'Acronym', 'GlossTerm', 'Abbrev', 'SortAs', 'ID', 'title', 'title']

This works fine if you simply want all the keys, but if you want them in nested form, recursion is the better way to go. I will provide the recursion code once it is done.

Please suggest any improvements to my current solution.
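
One possible improvement, sketched under assumptions: collect the same flat list of keys by walking the dict directly rather than matching its string form. As far as I can tell, the regular expression above skips keys containing characters outside its character class (a space, for example) and reports a false key when a string value happens to contain text like `'e':`. The name `collect_keys` and the sample data are hypothetical.

```python
def collect_keys(document):
    """Return every key in a nested dict/list structure, in traversal order."""
    keys = []
    if isinstance(document, dict):
        for key, value in document.items():
            keys.append(key)
            # Keys of nested values follow their parent key.
            keys.extend(collect_keys(value))
    elif isinstance(document, list):
        # Lists have no keys of their own, but may hold dicts that do.
        for item in document:
            keys.extend(collect_keys(item))
    return keys


# Made-up sample with a key containing a space and a value that looks like a key.
sample = {"a": {"b c": 1, "d": "looks like 'e': a key"}, "f": [{"g": 2}]}
found = collect_keys(sample)
print(found)
```

This keeps duplicates (such as the two `title` keys in the question's document), matching what the regex version returns; use `set(found)` if only distinct names matter.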