I have a CSV file (out.txt) with the following format:

red,green,blue
banana,apple,orange

I am trying to generate all two-element combinations of each line and write them to output.csv, like this:

[red,green][red,blue][green,blue]
[banana,apple][banana,orange][apple,orange]

My code, which works for a single flat list, is:

import csv

with open('out.txt', newline='') as csvfile:
    csvdata = list(csv.reader(csvfile))

print(csvdata)

r = 2; 
n = len(csvdata); 
print(n)

def printCombination(csvdata, n, r): 
    data = [0]*r; 
    print (data)

    combinationUtil(csvdata, data, 0,  
                    n - 1, 0, r); 

def combinationUtil(csvdata, data, start,  
                    end, index, r): 

    if (index == r): 
        for j in range(r): 
            print(data[j], end = " "); 
        print(); 
        return; 

    i = start;  
    while(i <= end and end - i + 1 >= r - index): 
        data[index] = csvdata[i]; 
        combinationUtil(csvdata, data, i + 1,  
                        end, index + 1, r); 
        i += 1; 

printCombination(csvdata, n, r); 

The csvdata prints as

[['red', 'green', 'blue'], ['banana', 'apple', 'orange']]

However, if I manually define a flat array like this:

[1,2,3]

it returns the correct answer. How do I do this with a list of lists?

Also, how would I write the output to a CSV file?

Patrick Artner
Sam

1 Answer

You need to:

  • read each line separately
  • get the combinations for each line (I am using itertools.combinations below)
  • write the str representation of each combination from that itertools generator onto one line of your output file (removing the ' characters from it)
  • add a newline
  • do that for all lines

To match your output format, I had to remove the ' quote characters that mark the strings:

with open ("data.txt","w") as f:
    f.write("red,green,blue\nbanana,apple,orange")

# store each line separately as a list of words
lines = []
with open("data.txt") as f:
    for l in f:
        l = l.strip() # remove \n
        if l:
            lines.append( list(w.strip() for w in l.split(",")))

print(lines) # [['red', 'green', 'blue'], ['banana', 'apple', 'orange']]

from itertools import combinations

with open("result.txt", "w") as f:
    for l in lines:
        for c in combinations(l,2):
            f.write(str(list(c)).replace("'","")) # remove the ' of strings
        f.write("\n")

print(open("result.txt").read())

Output:

# what was read into `lines` from the file
[['red', 'green', 'blue'], ['banana', 'apple', 'orange']]

# output for 'result.txt' 
[red, green][red, blue][green, blue]
[banana, apple][banana, orange][apple, orange]
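
Note that `str(list(c))` leaves a space after each comma. If the file has to match `[red,green]` exactly, without spaces, one option is to build each group with `str.join` instead. This is only a minimal sketch; the helper name `format_row` is my own and not part of the code above:

```python
from itertools import combinations

def format_row(words, r=2):
    """Return all r-combinations of words as '[a,b][a,c]...' without spaces."""
    return "".join("[" + ",".join(c) + "]" for c in combinations(words, r))

rows = [["red", "green", "blue"], ["banana", "apple", "orange"]]
for row in rows:
    print(format_row(row))
# [red,green][red,blue][green,blue]
# [banana,apple][banana,orange][apple,orange]
```

Writing `format_row(row) + "\n"` for each row, in place of the inner loop, should give the layout shown in the question.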
Patrick Artner
  • Wow, this is what I wanted. – Sam Apr 05 '20 at 13:02
  • One more question: if result.txt contained some duplicates, like [red, green][red, blue][green, blue] [apple, orange][red, green][red, blue][red, purple] (note that [red, green] is duplicated), how would I remove the duplicate [red, green] and keep only the unique values? – Sam Apr 05 '20 at 13:04
  • @Sam With the code above you would not get any duplicates unless the same values appear in your inputs. The code does not take other input lines into account. Adjusting it to remove duplicate values from the inputs could be done using sets or something similar, but then you would also remove the second red in "red,green,blue,red". You would have to remove the duplicated values from the inner lists of `lines` somehow. There are plenty of posts about removing duplicates from lists, e.g. [this](https://stackoverflow.com/questions/2213923/removing-duplicates-from-a-list-of-lists) – Patrick Artner Apr 05 '20 at 13:21
  • You would have to test multiple lines against each other, which makes it more difficult. Try your best, or ask a new question that concentrates on that problem and shows what you managed to achieve yourself after researching it – Patrick Artner Apr 05 '20 at 13:23
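
As a rough sketch of the idea from the comments: to drop combinations that already appeared on an earlier line, you could keep a `set` of the pairs seen so far. This assumes that [red, green] and [green, red] count as the same pair (hence the `frozenset` key), which may not be what is wanted; `unique_combinations` is a hypothetical helper name, not something from the answer.

```python
from itertools import combinations

def unique_combinations(lines, r=2):
    """For each line, yield the r-combinations not produced before,
    ignoring the order of the words inside a combination."""
    seen = set()
    for words in lines:
        kept = []
        for c in combinations(words, r):
            key = frozenset(c)  # order-insensitive key (assumption, see above)
            if key not in seen:
                seen.add(key)
                kept.append(c)
        yield kept

lines = [["red", "green", "blue"], ["apple", "orange", "red", "green"]]
for kept in unique_combinations(lines):
    # same formatting as in the answer: str() of the list, quotes removed
    print("".join(str(list(c)).replace("'", "") for c in kept))
```

Here the second line no longer contains [red, green], because the first line already produced it.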