I have a nested dictionary that contains paths grouped by category, and I want to create another dictionary with the same structure. The difference is that the second dictionary should contain the files inside each path.

original dictionary:

dic_paths={
    'folder1':{'data':['C:/Users/my_user/Desktop/Insumos1','C:/Users/my_user/Desktop/Insumos2']},
    'folder2':{'other_data':{'cat1':['C:/Users/my_user/Desktop/DATOS/to_share'],
                        'cat2':['C:/Users/my_user/Desktop/DATOS/others']},
             'other_other_data':{'f2sub-subgroup1':['C:/Users/my_user/Desktop/DATOS/graphs']}}
}

expected result:

dic_files={
    'folder1':{'data':['list of all files in two paths']},
    'folder2':{'other_data':{'cat1':['list of all files'],
                        'cat2':['list of all files']},
             'other_other_data':{'f2sub-subgroup1':['list of all files']}}
}

current result:

dic_files={
    'folder1':'folder1',
    'data':['all files in two paths'],
    'folder2':'folder2',
    'other_data':'other_data',
    'cat1':['list of files'],
    ...
}

This is the function that I'm using; I took the original function from here. Also, how can I move data_dic={} inside the function so that it doesn't reset on each recursive call? Thanks for the help.

import os

data_dic={}
def myprint(d,data_dic):    
    for k, v in d.items():
        if isinstance(v, dict):
            data_dic[k]=k
            myprint(v,data_dic)
        else:
            file_list=[]
            for path in v:
                if type(path)!=list:                    
                    for file in os.listdir(path):
                        if '~$' not in file:
                            file_list.append(file)
                    data_dic[k]=file_list
    return data_dic
Luis Medina

1 Answer

This is a perfect case for recursion. To iterate over each folder I used Path.iterdir() and checked each item with Path.is_file(), so subdirectories are skipped.

Code:

from pathlib import Path

def func(data):
    if isinstance(data, dict):
        return {k: func(v) for k, v in data.items()}  # recursion happens here
    elif isinstance(data, (list, tuple, set, frozenset)):
        return [str(p) for i in data for p in Path(i).iterdir() if p.is_file()]
    else:
        return data  # alternatively you can raise an exception
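The last branch mentions raising an exception instead of returning the value unchanged. A minimal sketch of that stricter variant follows; the name list_files_strict and the choice of TypeError are assumptions made for illustration, not part of the code above.

```python
from pathlib import Path

def list_files_strict(data):
    """Hypothetical strict variant of func: reject unexpected value types."""
    if isinstance(data, dict):
        # Rebuild the same keys, recursing into every value.
        return {k: list_files_strict(v) for k, v in data.items()}
    if isinstance(data, (list, tuple, set, frozenset)):
        # Collect the files (not subdirectories) found in every folder.
        return [str(p) for i in data for p in Path(i).iterdir() if p.is_file()]
    # Assumption: anything else is a mistake in the input dictionary.
    raise TypeError(
        f"expected a dict or a collection of paths, got {type(data).__name__}"
    )
```

With this version a stray string or number in the input fails loudly instead of being copied silently into the result.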

Usage:

dic_paths = {
    'folder1': {
        'data': [
            'C:/Users/my_user/Desktop/Insumos1',
            'C:/Users/my_user/Desktop/Insumos2'
        ]
    },
    'folder2': {
        'other_data': {
            'cat1': ['C:/Users/my_user/Desktop/DATOS/to_share'],
            'cat2':['C:/Users/my_user/Desktop/DATOS/others']
        },
        'other_other_data': {
            'f2sub-subgroup1': ['C:/Users/my_user/Desktop/DATOS/graphs']
        }
    }
}

dic_files = func(dic_paths)
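
On the second part of the question (keeping data_dic inside the function): the flat result comes from writing every key into one shared dictionary. A sketch that stays close to the original os.listdir loop is shown below; it creates a new dictionary at each recursion level and stores the nested result under its key, so no outer variable is needed. The name list_files is an assumption for illustration.

```python
import os

def list_files(d):
    """Return a dict shaped like d, with each list of folders replaced
    by the names found inside those folders."""
    result = {}  # a fresh dict per recursion level; nothing shared to reset
    for k, v in d.items():
        if isinstance(v, dict):
            # Store the nested result under its key instead of flattening.
            result[k] = list_files(v)
        else:
            result[k] = [
                name
                for path in v
                for name in os.listdir(path)
                if '~$' not in name  # same filter as in the question
            ]
    return result
```

Like the code in the question, os.listdir returns the names of subdirectories as well as files, and only bare names rather than full paths; func above filters with Path.is_file() and returns full paths.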
Olvin Roght