I'm trying to pivot a csv file while keeping a column (dont_pivot) that's not supposed to be pivoted. I have managed to pivot my two columns, but I'm struggling to retain the dont_pivot column. I would then like to output the result to a csv instead of a csv string (so unlike this example: Pivot a CSV string using python without using pandas or any similar library).
In a second step, I need to extract the number that sits between the two underscores from the dont_pivot column. This isn't a problem - it just means that these values are not unique.
The requirement is to only use the standard library.
Input:
dont_pivot,key,value
a_9_bc,x,1
a_9_bc,y,2
a_9_bc,z,3
a_9_bc,p,4
a_9_bc,q,5
b_9_bc,x,11
b_9_bc,y,21
b_9_bc,z,31
b_9_bc,p,41
b_9_bc,q,51
Desired output:
dont_pivot_num,x,y,z,p,q
a_9_bc,1,2,3,4,5
b_9_bc,11,21,31,41,51
I'm happy to then extract 9 and a/b in a second step instead of doing regex within my pivot code:
dont_pivot_letter,dont_pivot_num,x,y,z,p,q
a,9,1,2,3,4,5
a,9,11,21,31,41,51
Current output (as string, but I don't need one single string, but a csv file):
x,y,z,p,q
1,2,3,4,5
11,21,31,41,51
My code:
import csv
import re
with open("myfile.csv", "r") as f:
content = csv.reader(f)
next(content)
#### dont_pivot_num ####
dont_pivot_num = []
lines = []
for row in content:
dont_pivot_num.append(re.search(r"(\d)", row[0]).group(1)) # Can be an extra step once I have my desired csv format
dont_pivot_char.append(re.search(r"\b(\w)", row[0]).group(1)) # Can be an extra step once I have my desired csv format
lines.append(",".join(row[1:]))
lines = [l.replace(" ", "") for l in lines]
#### Pivot csv file ####
cols = ["x", "y", "z", "p", "q"]
csvdata = {k: [] for k in cols}
tempcols = list(cols)
for line in lines:
key, value = line.split(",")
try:
csvdata[key].append(value)
tempcols.remove(key)
except ValueError:
for c in tempcols: # now tempcols has only "missing" attributes
csvdata[c].append("")
tempcols = [c for c in cols if c != key]
for c in tempcols:
csvdata[c].append("")
# Instead of doing this, I'd like to combine dont_pivot_num with csvdata and write individual rows to a csv file
csvfile = ""
csvfile += ",".join(csvdata.keys()) + "\n"
# print(csvfile)
for row in zip(*csvdata.values()):
csvfile += ",".join(row) + "\n"
print(csv)