This is where Python's collections.defaultdict becomes useful.
The problem with your loop is that you needed to collect the values for the same key into a list, but you kept resetting the value of my_dict[img_path] to just the most recently parsed img_name. With a defaultdict, if the key doesn't exist yet, it is automatically initialized with a default value, which in this case can be an empty list. You then just keep .append-ing to that list whenever you encounter the same key.
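To make "automatically initialize" concrete, here is a rough sketch of the bookkeeping you would otherwise write by hand with a plain dict. The sample strings are made up for illustration and only assume that each entry looks like a path followed by "/" and an image name; they are not read from your file.

```python
# Plain-dict version of the grouping, for comparison with defaultdict(list).
# The entries below are illustrative sample data, not the real input file.
entries = [
    "folder/1/img/file/1.mp3/4.jpg",
    "folder/1/img/file/1.mp3/8.jpg",
    "folder/6/img/file/6.mp3/6.jpg",
]

grouped = {}
for entry in entries:
    img_path, img_name = entry.rsplit("/", maxsplit=1)
    if img_path not in grouped:
        # First time this key is seen: create the empty list by hand.
        # defaultdict(list) does this step for you on first access.
        grouped[img_path] = []
    grouped[img_path].append(img_name)

print(grouped)
```

The explicit membership check is the part that defaultdict(list) makes unnecessary.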
Demo:
>>> from collections import defaultdict
>>>
>>> dd = defaultdict(list)
>>> dict(dd)
{}
>>> dd["folder/1/img/file/1.mp3"].append("4.jpg")
>>> dd["folder/1/img/file/1.mp3"].append("8.jpg")
>>>
>>> dd["folder/1/img/file/6.mp3"].append("6.jpg")
>>>
>>> dict(dd)
{'folder/1/img/file/1.mp3': ['4.jpg', '8.jpg'], 'folder/1/img/file/6.mp3': ['6.jpg']}
Putting that into your code:
import csv
from collections import defaultdict
from pprint import pprint

# Every new key starts out with an empty list
my_dict = defaultdict(list)

with open("my_csv.csv", 'r') as file:
    csvreader = csv.reader(file)
    for row in csvreader:
        for rw in row:
            # Split only at the last "/": everything before it is the key
            img_path, img_name = rw.rsplit("/", maxsplit=1)
            my_dict[img_path].append(img_name)

pprint(dict(my_dict))
{'folder/1/img/file/1.mp3': ['4.jpg', '8.jpg'],
 'folder/3/img/file/3.mp3': ['1.jpg', '5.jpg'],
 'folder/6/img/file/6.mp3': ['6.jpg', '8.jpg'],
 'folder/7/img/file/7.mp3': ['9.jpg']}
Note that if you strictly need a regular dict type, you can convert a defaultdict to a dict by calling dict(...) on it. But for most purposes, a defaultdict behaves like a dict.
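One difference that may matter when deciding whether to convert: looking up a missing key on a defaultdict silently creates it, while a regular dict raises KeyError. A small sketch of that behavior, using made-up keys:

```python
from collections import defaultdict

dd = defaultdict(list)
dd["folder/1/img/file/1.mp3"].append("4.jpg")

# Merely reading a missing key on a defaultdict inserts it with an empty list.
missing = dd["no/such/key"]
print(missing)              # []
print("no/such/key" in dd)  # True

# After conversion, a missing key raises KeyError as with any regular dict.
plain = dict(dd)
try:
    plain["another/missing/key"]
except KeyError:
    print("KeyError from the plain dict")
```

If accidental lookups could pollute your results, converting with dict(...) once the grouping is done is a reasonable precaution.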
Notice that I also changed how the img_path and img_name are parsed on each line. Since those lines don't look like valid paths anyway, and you are not really using them in any file I/O operations, there is no point in using os.path. You can simply use str.rsplit.
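As a quick illustration of what str.rsplit does here, assuming a line shaped like the keys and values in the output above (the sample string is illustrative):

```python
# Assumed line format: "<path>/<image name>"
line = "folder/1/img/file/1.mp3/4.jpg"

# maxsplit=1 splits only once, starting from the right-hand side.
img_path, img_name = line.rsplit("/", maxsplit=1)
print(img_path)  # folder/1/img/file/1.mp3
print(img_name)  # 4.jpg

# A string without any "/" comes back as a single-element list, so the
# two-name unpacking above would raise ValueError for such a line.
no_slash = "4.jpg".rsplit("/", maxsplit=1)
print(no_slash)  # ['4.jpg']
```

If your file might contain lines without a "/", you would want to check for that before unpacking.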
Lastly, you say that your input file is a CSV and you are using csv.reader, but the contents aren't really CSV-formatted, in the sense that it's basically one input string per line. You can make do with regular iteration over each line:
from collections import defaultdict
from pprint import pprint

my_dict = defaultdict(list)

with open("my_csv.csv", 'r') as file:
    for line in file:
        # rstrip() drops the trailing newline before splitting
        img_path, img_name = line.rstrip().rsplit("/", maxsplit=1)
        my_dict[img_path].append(img_name)

pprint(dict(my_dict))
...which yields the same result:
{'folder/1/img/file/1.mp3': ['4.jpg', '8.jpg'],
 'folder/3/img/file/3.mp3': ['1.jpg', '5.jpg'],
 'folder/6/img/file/6.mp3': ['6.jpg', '8.jpg'],
 'folder/7/img/file/7.mp3': ['9.jpg']}