I'm writing a Python script that first parses n XML files and builds a dict of dicts, where each nested dict holds the key-value pairs taken from one file's XML attributes. Now I want to group these nested dicts to find out which XML files are identical and belong in the same group. I'm looking for a Pythonic way to group equal dicts, given that every dict has the same keys.
- I tried walking over each dict and building a string from its values. I store this string in another dict, where key = string and value = list of XML names. When I move on to the next dict and build its string, if that string already exists as a key, I simply append the XML name to that key's list.
- I think there may be a better method based on groupby() or something else.
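The string-key approach from the first bullet can be sketched roughly as follows. This is only an illustration: the function name `group_by_value_string` and the `"|"` separator are assumptions, and the keys are sorted so that the string does not depend on the order in which attributes were parsed.

```python
from collections import defaultdict


def group_by_value_string(dict_xml, sep="|"):
    """Group XML names whose dicts yield the same joined-value string.

    `sep` is a hypothetical separator; a value that itself contains it
    could make two different dicts produce the same string.
    """
    groups = defaultdict(list)
    for name, attrs in dict_xml.items():
        # Join the values in sorted-key order to get a stable string key.
        key = sep.join(str(attrs[k]) for k in sorted(attrs))
        groups[key].append(name)
    return list(groups.values())


# Sample data mirroring the dicts described in the question.
dict_xml = {
    "a.xml": {"config": "4", "location": "C:\\xyz", "Group": "amcat"},
    "b.xml": {"config": "4", "location": "C:\\xyz", "Group": "amcat"},
    "c.xml": {"config": "5", "location": "C:\\mno", "Group": "alien"},
    "d.xml": {"config": "5", "location": "C:\\mno", "Group": "alien"},
}
groups = group_by_value_string(dict_xml)
```

The weak point of this sketch is the string itself: it only works as long as no value contains the separator.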
list_of_xmls = ["a.xml", "b.xml", "c.xml", "d.xml"]
dictXml = dict()
for xml in list_of_xmls:
    dictXml[xml] = parseXml(xml)  # Returns a dict of the XML's attributes (key-value)
# parseXml(xml)
# It parses the XML and returns a dict like:
# a.xml -> {"config": "4", "location": "C:\\xyz", "Group": "amcat"}
# b.xml -> {"config": "4", "location": "C:\\xyz", "Group": "amcat"}
# c.xml -> {"config": "5", "location": "C:\\mno", "Group": "alien"}
# d.xml -> {"config": "5", "location": "C:\\mno", "Group": "alien"}
# Suppose a.xml and b.xml have the same values for all keys
# Same for c.xml and d.xml
# So, I should have two groups (a.xml, b.xml) and (c.xml, d.xml)
###########Some processing on the above dict ######
finalOutput = [["a.xml", "b.xml"], ["c.xml", "d.xml"]]
The output should be a list of the groups that can be clubbed together (basically a list of lists).
Also, dictXml could be some other data structure, such as a list of dicts. Any thoughts?
Basically, the whole idea is: given a list of XML files, I need to figure out which ones are the same based on the key-values inside them, put identical files into one list, and then do some processing on each group.
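To make the groupby() idea concrete, here is a minimal sketch of one way it might look. It assumes every value is hashable and that the values are mutually comparable (plain strings, as in the sample); `group_identical` and `signature` are illustrative names, not part of any library. Since itertools.groupby only merges adjacent items, the names are sorted by the same key first.

```python
from itertools import groupby


def group_identical(dict_xml):
    """Return lists of XML names whose attribute dicts are equal."""

    def signature(name):
        # A sorted tuple of (key, value) pairs is comparable and
        # independent of the dict's insertion order.
        return tuple(sorted(dict_xml[name].items()))

    # groupby only groups consecutive items, so sort by the same key first.
    names = sorted(dict_xml, key=signature)
    return [list(group) for _, group in groupby(names, key=signature)]


# Sample data mirroring the dicts described in the question.
dict_xml = {
    "a.xml": {"config": "4", "location": "C:\\xyz", "Group": "amcat"},
    "b.xml": {"config": "4", "location": "C:\\xyz", "Group": "amcat"},
    "c.xml": {"config": "5", "location": "C:\\mno", "Group": "alien"},
    "d.xml": {"config": "5", "location": "C:\\mno", "Group": "alien"},
}
final_output = group_identical(dict_xml)
```

The groups come out ordered by signature rather than by file name, so the result should be treated as an unordered collection of groups. If the extra sort is unwanted, the same signature could presumably be used as a key in a collections.defaultdict(list) instead.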