0

I apologise for the title, I know it isn't the most helpful. What I'm attempting to do is restructure my data so that each of a given column is given it's own row with certain values carried over from the previous dataframe.

My Data in its current form is something like this:

ColA | ColB | ColC | val1 | val2 | val3
   1 |    2 |    3 | A    | B    | C
   4 |    5 |    6 | D    | E    | F

And I want to restructure it so I get a result like this:

 ColA | ColB | ColC | val
    1 |    2 |    3 | A
    1 |    2 |    3 | B
    1 |    2 |    3 | C
    4 |    5 |    6 | D 
    4 |    5 |    6 | E 
    4 |    5 |    6 | F 

How would I do this?

I know I could go through each row, grab the relevant data and concat a dataframe but I was hoping for a much better alternative

FallingInForward
  • 285
  • 2
  • 4
  • 12
  • 1
    Does this answer your question? [How can I pivot a dataframe?](https://stackoverflow.com/questions/47152691/how-can-i-pivot-a-dataframe) – BeRT2me Jun 15 '22 at 16:42

2 Answers2

2

Given:

   ColA  ColB  ColC val1 val2 val3
0     1     2     3    A    B    C
1     4     5     6    D    E    F

Doing:

df.melt(['ColA', 'ColB', 'ColC'])

Output:

   ColA  ColB  ColC variable value
0     1     2     3     val1     A
1     4     5     6     val1     D
2     1     2     3     val2     B
3     4     5     6     val2     E
4     1     2     3     val3     C
5     4     5     6     val3     F
BeRT2me
  • 12,699
  • 2
  • 13
  • 31
0

I think the following code should solve your problem. I created an example with the sample data you provided. The code involves stacking and merging the dataframes and columns.

df_start = pd.DataFrame()

df_start = df_start.append([[1, 2, 3, "A", "B", "C"], [4, 5, 6, "D", "E", "F"]])
df_start = df_start.rename(columns={0: "ColA", 1: "ColB", 2: "ColC", 3: "val1", 4: "val2", 5: "val3"})
df_start["vals"] = df_start.values.tolist()
df_start["vals"] = df_start["vals"].apply(lambda x: x[3:])

df_ = df_start["vals"].apply(pd.Series).stack().reset_index(level=1, drop=True).to_frame('val')
result_df = pd.merge(df_, df_start, left_index=True, right_index=True)

result_df = result_df.iloc[:, [1, 2, 3, 0]]
result_df
  • `pd.read_clipboard()` is your friend for grabbing data from a question. For example: `df = pd.read_clipboard(sep='\s+\|\s+')` reads OP's original table just fine~ Then if you want to share it programmatically, you can just do `df.to_dict()` – BeRT2me Jun 15 '22 at 17:03
  • @BeRT2me That seems really useful. Thanks for letting me know! – davidmpeletz Jun 15 '22 at 17:11