I've split my data into CSV files, and for each file I need all rows of the "Sequence" column combined into one string.
Each CSV looks something like this:
1773.csv
ID Order Sequence
1773 1 'AAGG'
1773 2 'TTGG'
1773 3 'GGAA'
1775.csv
ID Order Sequence
1775 1 'GGTT'
1775 2 'AAGT'
1775 3 'TGAA'
1331.csv
ID Order Sequence
1331 1 'CCGT'
1331 2 'CATT'
1331 3 'GTTA'
For each CSV, I need the Sequence rows merged into a single value, like this:
ID Sequence
1773 'AAGGTTGGGGAA'
Then make a master CSV of all the combined sequences from each CSV file.
Something like this:
ID Sequence
1773 'AAGGTTGGGGAA'
1775 'GGTTAAGTTGAA'
1331 'CCGTCATTGTTA'
I wouldn't worry too much about the Order column, since the rows are already in order. Also, each CSV in the folder is named after its ID.
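To make the goal concrete, here is a rough sketch of the behaviour I'm after. It assumes the files are comma-delimited with a header row of ID, Order, Sequence, that the single quotes shown above may or may not actually be stored in the files, and that the master file is written somewhere outside the input folder. The names `combine_sequences` and `build_master` are just illustrative.

```python
import csv
from pathlib import Path


def combine_sequences(csv_path):
    """Join every value in the 'Sequence' column of one file, in row order."""
    with open(csv_path, newline="") as infile:
        reader = csv.DictReader(infile)
        # strip("'") drops surrounding single quotes if they are stored in the file.
        return "".join(row["Sequence"].strip().strip("'") for row in reader)


def build_master(folder, out_path):
    """Write one (ID, Sequence) row per CSV file found in `folder`."""
    # The ID is taken from the file name (e.g. 1773.csv -> 1773).
    # Assumption: out_path is not inside `folder`, so it is never re-read as input.
    rows = [
        (csv_path.stem, combine_sequences(csv_path))
        for csv_path in sorted(Path(folder).glob("*.csv"))
    ]
    with open(out_path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(["ID", "Sequence"])
        writer.writerows(rows)
```

It would be called as `build_master('C:\\Users\\CAAVR\\Desktop\\res_csv', 'master.csv')`, where `master.csv` is a placeholder output name. The rows come out sorted by file name rather than in any particular ID order.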
I've found this, but it seems to combine all the data from all the CSV files into a single cell/value:
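import csv
import os

def return_contents(file_name):
    # Read one CSV file and return all of its rows (header included) as lists.
    with open(file_name) as infile:
        reader = csv.reader(infile)
        return list(reader)

all_files = os.listdir('C:\\Users\\CAAVR\\Desktop\\res_csv')
combined_output = []
for file in all_files:
    data = return_contents('C:\\Users\\CAAVR\\Desktop\\res_csv\\{}'.format(file))
    for row in data:
        # extend() appends every cell of every row to the same flat list.
        combined_output.extend(row)

with open('csv_out.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    # A single writerow() call puts the whole flat list on one output row.
    writer.writerow(combined_output)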
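Here is a small in-memory reproduction of that behaviour, using a couple of made-up sample rows instead of my real files. As far as I can tell, every cell (headers included) ends up on one output row because `extend` flattens each row into the same list and `writerow` is only called once.

```python
import csv
import io

# Hypothetical sample rows standing in for the contents of one CSV file.
rows = [
    ["ID", "Order", "Sequence"],
    ["1773", "1", "AAGG"],
    ["1773", "2", "TTGG"],
]

combined_output = []
for row in rows:
    combined_output.extend(row)  # flattens each row into one long list

buffer = io.StringIO()
csv.writer(buffer).writerow(combined_output)  # one call -> one output row
flattened = buffer.getvalue()
print(flattened)
```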
Thanks ahead of time and let me know if you need more info.