The Input
I'm receiving input from an external JSON source that contains paths like these:
datalake-dev/facial_recognition/
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic0.jpg
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic1.jpg
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic10.png
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic11.jpg
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic12.png
datalake-dev/facial_recognition/curation/google-search-images/this_is_a_dir.png/pic13.jpg
datalake-dev/facial_recognition/landing/input-images/
datalake-dev/facial_recognition/landing/input-images/this_is_a_dir.png
The Help
From this, I need to pass it on in an API / JSON / dictionary format for further processing. So far I've been through four related threads, but none of them led to a solution.
The Required Output
From the paths, I need to get a dictionary / JSON in the following format:
{
    "curation": {
        "google-search-images": [
            {
                "name": "pic0"
            },
            {
                "name": "pic1"
            }
        ]
    },
    "derived": {
        "recognition-matches": [
            {
                "name": "img2"
            }
        ],
        "errors": [
            {
                "name": "foo"
            }
        ]
    }
}
In the above dictionary / JSON, the names curation, google-search-images and this_is_a_dir.png are all directories. I need something that recursively puts them into a dictionary, nesting one level per path component, however long the path is.
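For illustration, here is a minimal sketch of the kind of structure being asked for, not a definitive solution. It assumes that a key ending in `/` is a directory placeholder and that any other key is a file. The function name `build_tree`, the `strip_prefix` parameter and the reserved `_files` key are hypothetical names introduced only for this example; `_files` is used because a directory can hold both sub-directories and files, so its value cannot be a plain list.

```python
import posixpath


def build_tree(keys, strip_prefix=""):
    """Build a nested dict from '/'-separated keys (hypothetical helper).

    Assumption: a key ending in '/' is a directory; anything else is a file.
    Files are collected, without their extension, under the reserved
    '_files' key of their parent directory.
    """
    tree = {}
    for key in keys:
        # Optionally drop a common leading prefix such as the bucket root.
        if strip_prefix and key.startswith(strip_prefix):
            key = key[len(strip_prefix):]
        is_dir = key.endswith("/")
        parts = [p for p in key.split("/") if p]
        if not parts:
            continue  # the key was exactly the stripped prefix
        dir_parts = parts if is_dir else parts[:-1]
        node = tree
        for name in dir_parts:
            # Walk down one level per component, creating dicts as needed.
            node = node.setdefault(name, {})
        if not is_dir:
            stem, _ext = posixpath.splitext(parts[-1])
            node.setdefault("_files", []).append({"name": stem})
    return tree
```

Called with the sample keys above and `strip_prefix="datalake-dev/facial_recognition/"`, this yields top-level entries `curation` and `landing`. Unlike the required output, it keeps `this_is_a_dir.png` as its own level, since the paths alone do not say which directories should be collapsed.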
My Trial
all_paths = []  # every key, split into its components
api = {}        # The object 'api' holds the dictionary expected.

for contents in result['Contents']:
    # Kept separately (with any trailing '') to identify whether the path points to a file or a directory
    directory_or_file_list = contents['Key'].split('/')
    path = contents['Key']
    splitted_path = path.split('/')
    # e.g. ['datalake-dev', 'facial_recognition', 'landing', 'input-images', 'this_is_a_dir.png', 'pic0.jpg']
    if '' in splitted_path:
        splitted_path.pop()  # drop the empty string left by a trailing '/'
    all_paths.append(splitted_path)
    # Only the first two levels are stored here; deeper levels are what I cannot get right.
    api[splitted_path[0]] = splitted_path[1]
    # Goal: api[splitted_path[0]] = {splitted_path[1]: {splitted_path[2]: [append_all_elements_under_this]}}
    if directory_or_file_list[-1].split('.')[-1] in ['jpg', 'jpeg', 'png', 'tiff']:
        print(path)  # looks like a file
    else:
        print(path)  # looks like a directory
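As the sample input shows, the extension check above cannot be relied on, because `this_is_a_dir.png` is a directory in one of the paths. The following sketch shows one alternative under stated assumptions: a key is treated as a directory if it ends in `/` or if some other key lies beneath it. The function name `classify_keys` is hypothetical and not part of any library.

```python
def classify_keys(keys):
    """Split keys into directories and files without looking at extensions.

    Assumption: a key is a directory if it ends in '/' or if it is a proper
    ancestor of another key; every remaining key is treated as a file.
    """
    key_list = list(keys)
    dirs = set()
    for key in key_list:
        parts = key.rstrip("/").split("/")
        # Every proper ancestor of a key must be a directory.
        for i in range(1, len(parts)):
            dirs.add("/".join(parts[:i]))
        if key.endswith("/"):
            dirs.add(key.rstrip("/"))
    files = [k for k in key_list if not k.endswith("/") and k not in dirs]
    return dirs, files
```

With this approach, `.../google-search-images/this_is_a_dir.png` is classified as a directory because `pic0.jpg` lies under it, while `.../input-images/this_is_a_dir.png` is classified as a file because nothing in the listing lies beneath it.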
Note: Perhaps there is a way to hard-code this, but I wouldn't be posting here if that were acceptable. Also, os.walk() is not an option. I've already tried it, and this isn't an OS file system.
Any help beyond my code is welcome!