I have a dataset with one column and several rows per data item (the number of rows per data item varies). The data items are separated by a line '------------------------------- '
I want to transpose the data into three columns. The data should be split at the line '------------------------------- '
Ideally, the first two columns should hold the ids, and the rest of the text, however many rows it spans for a given data item, should map to one column, like id | id | text
I have tried different approaches suggested on SO but still couldn't get the desired output.
import csv
import sys

inp_fname = 'Comments.csv'
out_fname = 'Columned-Data.csv'

def rez(row, size):
    rowx = [''] * size
    for i in range(0, len(row)):
        rowx[i] = row[i]
    return rowx

MATCH = "-------------------------------\n"
cols = []
glob = []

with open(inp_fname, 'r', newline='') as in_csvfile, open(out_fname, 'w', newline='') as out_csvfile:
    reader = csv.reader(in_csvfile)
    writer = csv.writer(out_csvfile)
    for line in reader:
        if line == MATCH:
            glob.append(list(cols))
            cols = []
        else:
            cols.append(line)
    MAX = max(map(lambda x: len(x), glob))
    #output = list(map(lambda x: rez(x, MAX), glob))
    #writer.writerow(output)
    print(list(map(lambda x: rez(x, MAX), glob)))
I need to remove the lines '------------------------------- ' and end up with only 3 columns (id, id, text) for each data item.
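For illustration, here is a minimal sketch of the grouping described above. It rests on assumptions not confirmed anywhere in the question: that the first two non-empty lines of each data item are the ids, that a separator is any line consisting only of hyphens, and that the remaining lines should be joined with a single space. The names `is_separator`, `to_row`, `group_records` and `convert` are hypothetical helpers. It reads the input as plain lines rather than through `csv.reader`, because `csv.reader` yields each row as a list, so a row never compares equal to a string such as `MATCH`.

```python
import csv


def is_separator(line):
    # Assumption: a separator is a line made up only of hyphens
    # (at least five), ignoring surrounding whitespace.
    stripped = line.strip()
    return len(stripped) >= 5 and set(stripped) == {"-"}


def to_row(block):
    # Assumption: the first two lines of a data item are the ids;
    # everything after them is text, joined with single spaces.
    first = block[0]
    second = block[1] if len(block) > 1 else ""
    text = " ".join(block[2:])
    return [first, second, text]


def group_records(lines):
    """Yield one [id, id, text] row per separator-delimited data item."""
    block = []
    for raw in lines:
        line = raw.strip()
        if is_separator(line):
            if block:
                yield to_row(block)
            block = []
        elif line:
            # Blank lines are skipped; all other lines belong to the item.
            block.append(line)
    if block:
        # The last data item may not be followed by a separator.
        yield to_row(block)


def convert(in_file, out_file):
    # Write the grouped rows as CSV; the separator lines are dropped.
    writer = csv.writer(out_file)
    writer.writerows(group_records(in_file))
```

Calling `convert(src, dst)` with the two files opened as in the script above (`Comments.csv` for reading, `Columned-Data.csv` for writing, both with `newline=''`) would write one three-column row per data item. If the ids are not the first two lines of each item, `to_row` is the part that would need to change.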