I'd do this with a defaultdict that produces defaultdict(list) instances as default values.
Demo
>>> from collections import defaultdict
>>>
>>> d = defaultdict(lambda: defaultdict(list))
>>> data = [('schema1', 'table1', 'column_name1'), ('schema1', 'table1', 'column_name2'), ('schema1', 'table2', 'column_name3'), ('schema2', 'table3', 'column_name4')]
>>>
>>> for k1, k2, v in data:
...     d[k1][k2].append(v)
...
>>> d
defaultdict(<function __main__.<lambda>()>,
{'schema1': defaultdict(list,
{'table1': ['column_name1', 'column_name2'],
'table2': ['column_name3']}),
'schema2': defaultdict(list, {'table3': ['column_name4']})})
To match your desired output exactly (although I don't see much reason to), build a regular dictionary from d with tuple values.
>>> d = {k1: {k2: tuple(v2) for k2, v2 in v1.items()} for k1, v1 in d.items()}
>>> d
{'schema1': {'table1': ('column_name1', 'column_name2'),
'table2': ('column_name3',)},
'schema2': {'table3': ('column_name4',)}}
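The two steps above can also be wrapped in one small function. This is only a sketch: the name group_columns and its parameter names are made up for illustration, and it assumes every input row is a 3-tuple of (schema, table, column).

```python
from collections import defaultdict


def group_columns(rows):
    """Group (schema, table, column) triples into {schema: {table: (columns, ...)}}."""
    # Outer level: schema -> inner defaultdict; inner level: table -> list of columns.
    nested = defaultdict(lambda: defaultdict(list))
    for schema, table, column in rows:
        nested[schema][table].append(column)
    # Convert to plain dicts with tuple values so later lookups
    # no longer create keys as a side effect.
    return {
        schema: {table: tuple(columns) for table, columns in tables.items()}
        for schema, tables in nested.items()
    }


data = [
    ('schema1', 'table1', 'column_name1'),
    ('schema1', 'table1', 'column_name2'),
    ('schema1', 'table2', 'column_name3'),
    ('schema2', 'table3', 'column_name4'),
]
result = group_columns(data)
print(result)
```

Returning plain dicts means a typo in a key raises KeyError instead of silently adding an empty entry.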
Explanation
The defaultdict initializer accepts a callable (in this example, an anonymous lambda function). Whenever a key is missing, that callable is called with no arguments, and its return value is stored under the missing key and returned.
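To see when the callable actually runs, here is a minimal sketch with a named factory that records each call. The names make_default and calls are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

calls = []  # records every time the factory is invoked


def make_default():
    # Hypothetical factory: note that it takes no arguments.
    calls.append('called')
    return []


d = defaultdict(make_default)

d['a'].append(1)  # 'a' is missing: the factory runs and its list is stored under 'a'
d['a'].append(2)  # 'a' now exists: the factory is not called again

number_of_calls = len(calls)
stored = d['a']
print(number_of_calls, stored)
```

The factory runs once per missing key, not once per lookup, which is why both appends end up in the same list.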
The line
d = defaultdict(lambda: defaultdict(list))
creates a defaultdict that creates another defaultdict when a key is missing. The inner defaultdict in turn creates a list when one of its keys is missing.
>>> d = defaultdict(lambda: defaultdict(list))
>>> d['bogus']
defaultdict(list, {})
>>> d['hokus']['pokus']
[]
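One consequence worth knowing: a plain lookup like the ones above inserts the missing keys. The sketch below shows that, and that the in operator and .get() can be used to probe without inserting anything. The variable names are made up for illustration.

```python
from collections import defaultdict

d = defaultdict(lambda: defaultdict(list))

d['hokus']['pokus']  # a plain lookup creates both levels as a side effect
created = {k: dict(v) for k, v in d.items()}

has_bogus = 'bogus' in d  # membership test does not call the factory
probe = d.get('bogus')    # .get() does not call the factory either
keys_after = list(d)      # 'bogus' was never inserted

print(created, has_bogus, probe, keys_after)
```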