I've just started using Python and I'm stuck on a problem with a dataset I'm working with.
I have the following dataset:
C1 C2 C3 C4 C5 C6
99 069 99002068 3348117 3230802 T6
99 069 99002063 4599974 178885 T4
99 069 99002063 4599974 4606066 T4
99 069 99002063 4599974 236346 T4
99 069 99002063 4599974 310114 T4
I need to transpose column C5 into multiple columns, one row per group, where the groups are defined by columns C1, C2, C3, C4 and C6.
The code I've written so far is the following:
# load libraries
import pandas as pd
# import CSV
data = pd.read_csv(
    "C:/Users/mcatuogno/Desktop/lista_collegamenti_onb.csv",
    sep=";",
    header=None,
    dtype=str,
    usecols=[0, 1, 2, 3, 4, 5],
    names=["C1", "C2", "C3", "C4", "C5", "C6"],
)
# sort values
dataSort = data.sort_values(["C1", "C2", "C3", "C4"])
# collect the C5 values of each group into a list
dataTranspose = dataSort.groupby(["C1", "C2", "C3", "C4", "C6"])["C5"].apply(list)
With the code above, the result is:
C1 C2 ... C6 C5
99 000 ... 09900000001100 [102995, 102997, 102996]
99 000 ... 09900000001135 [103042]
I don't know how to split the lists in column C5 into separate columns named CN_1, CN_2, ..., CN_x, where x is the size of the largest group.
Which Python function can I use?
Thanks in advance!
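One possible way to get there, as a minimal sketch rather than a definitive answer: number the rows inside each group with `groupby(...).cumcount()` and then pivot that number into columns with `unstack`. The sample rows are copied from the dataset above instead of being read from the CSV, and the names `keys`, `position` and `wide` are only illustrative. Groups with fewer C5 values than the largest group get NaN in the remaining CN_ columns.

```python
import pandas as pd

# Sample rows copied from the question, kept as strings (like dtype=str in read_csv)
data = pd.DataFrame(
    [
        ["99", "069", "99002068", "3348117", "3230802", "T6"],
        ["99", "069", "99002063", "4599974", "178885", "T4"],
        ["99", "069", "99002063", "4599974", "4606066", "T4"],
        ["99", "069", "99002063", "4599974", "236346", "T4"],
        ["99", "069", "99002063", "4599974", "310114", "T4"],
    ],
    columns=["C1", "C2", "C3", "C4", "C5", "C6"],
)

# Columns that define a group (illustrative name)
keys = ["C1", "C2", "C3", "C4", "C6"]

# Number the rows inside each group: 1, 2, 3, ...
position = data.groupby(keys).cumcount() + 1

# One row per group, one column per position within the group
wide = (
    data.assign(position=position)
    .set_index(keys + ["position"])["C5"]
    .unstack("position")
)

# Rename the positional columns to CN_1, CN_2, ... and restore the key columns
wide.columns = [f"CN_{i}" for i in wide.columns]
wide = wide.reset_index()

print(wide)
```

With the real file, the same steps should apply to the `data` frame returned by `read_csv`; sorting beforehand only affects the order in which the C5 values are numbered within each group.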