I have two columns ("Name" and "Value") in Excel.
There are duplicates (e.g. "xxa", "xxf") in the Value column, and the Python script needs to find which cell values are duplicated and, for each one, collect the matching names into an array.
The output should be:
{
"xxa": ["aaa","bbb","ccc","hhh"],
"xxf": ["fff","jjj"]
}
How can I improve the current script? At the moment it only prints a flat list of the names, not grouped by value.
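import csv

value_col = []
name_value_col = []

# Read the CSV and skip the header row
with open('columnData.csv', newline='') as file:
    csvreader = csv.reader(file)
    next(csvreader)
    for row in csvreader:
        name = row[0]
        value = row[1]
        value_col.append(value)
        name_value_col.append(name + "," + value)

# Count every value that appears more than once
count = {}
names = []
for item in value_col:
    if value_col.count(item) > 1:
        count[item] = value_col.count(item)

# Despite its name, "names" holds the duplicated values (the keys of count)
for name, value in count.items():
    names.append(name)

# Collect the names of all rows whose value is duplicated
total = []
for item in name_value_col:
    item_name = item.split(",")
    if item_name[1] in names:
        total.append(item_name[0])
print(total)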
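
One possible way to get the grouped output is sketched below. It is a minimal sketch, not the only approach: it groups names by value in a single pass with `collections.defaultdict` and then keeps only the values shared by more than one row. The helper name `group_duplicates` and the in-memory sample data (standing in for `columnData.csv`) are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict


def group_duplicates(rows):
    """Map each value that occurs more than once to the names that carry it.

    `rows` is any iterable of [name, value] pairs (hypothetical helper).
    """
    grouped = defaultdict(list)
    for row in rows:
        name, value = row[0], row[1]
        grouped[value].append(name)
    # Keep only the values that are shared by at least two rows
    return {value: names for value, names in grouped.items() if len(names) > 1}


# Hypothetical sample data standing in for columnData.csv
sample = io.StringIO(
    "Name,Value\n"
    "aaa,xxa\n"
    "bbb,xxa\n"
    "ccc,xxa\n"
    "ddd,xxb\n"
    "eee,xxc\n"
    "fff,xxf\n"
    "ggg,xxd\n"
    "hhh,xxa\n"
    "iii,xxe\n"
    "jjj,xxf\n"
)

reader = csv.reader(sample)
next(reader)  # skip the header row
result = group_duplicates(reader)
print(result)
```

To run this against the real file, the `io.StringIO` sample could be replaced with `open('columnData.csv', newline='')` inside a `with` block. Compared with calling `value_col.count(item)` inside a loop, which rescans the whole list for every row, this reads each row only once.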