2

I have a CSV where the data is structured like so:

Key | Value
A | 1, 2, 3
B | 4, 5
C | 1, 3, 5

I want to convert it into a CSV that looks like this:

Key | Value
1 | A, C
2 | A
3 | A, C
4 | B
5 | B, C

The current values are strings (not numbers) separated by commas. The keys are not unique - a key can appear multiple times, with different values. I've only done this previously where there was one value per line, not multiple.

morais12
  • 33
  • 1
  • 6
  • Does this answer your question? [Make a dictionary with duplicate keys in Python](https://stackoverflow.com/questions/10664856/make-a-dictionary-with-duplicate-keys-in-python) – buran Mar 04 '20 at 15:11
  • If the file actually delimits the leftmost column with the vertical pipe symbol `|` I wouldn't really call that a csv. – Caleb McNevin Mar 04 '20 at 15:14
  • Apologies, I did this as a visual marker for me so as not to confuse the multiple commas - but yes, it is a csv – morais12 Mar 04 '20 at 15:16

1 Answers1

3

Assuming that the input has been loaded into a dictionary of lists, this should work for making the conversion:

from collections import defaultdict

input_dict = { 'A': [1, 2, 3], 'B': [4, 5], 'C': [1, 3, 5] }
output_dict = defaultdict(list)

for k, v in input_dict.items():
    for e in v:
        output_dict[e].append(k)

The result is as expected:

output_dict
=> defaultdict(<type 'list'>, {1: ['A', 'C'], 2: ['A'], 3: ['A', 'C'], 4: ['B'], 5: ['C', 'B']})

Now you can write the output to a CSV file with the format you want.

Óscar López
  • 232,561
  • 37
  • 312
  • 386