How to combine multiple sets of columns in a dataframe into a single one?

Question

I have a dataframe as follows

Cycle	A_0	A_1	A_2	A_3	B_0	B_1	B_2	B_3
1	3	4	5	6		1	4	5
1	8	5	3	1	0	8	6	4
2	7	9	1	6	1	0	2	3
3	5		9	1	0	3	8	3

This dataframe has to be combined into two columns, A and B.

Expected output

Cycle	A	B
1	3
1	4	1
1	5	4
1	6	5
1	8	0
1	5	8
1	3	6
1	1	4
2	7	1
2	9	0
2	1	2
2	6	3
3	5	0
3	3
3	9	8
3	1	3

What I did

A = [f"A_{i}" for i in range(20)]
B = [f"B_{i}" for i in range(20)]

df2['A'] = df[A].bfill(axis=1).iloc[:, 0]
df2['B'] = df[B].bfill(axis=1).iloc[:, 0]

This code gives me an output dataframe that skips the NaN values. How can I get the expected output?

ADDON

Added a new column to the initial data and the expected outcome.

[Combine Columns in Pandas - Stack Overflow](https://stackoverflow.com/questions/72233876/combine-columns-in-pandas/72233966) — Ynjxsjmh, May 14 '22 at 06:06

score 1 · Answer 1 · answered May 14 '22 at 06:33

code part

import numpy as np
import pandas as pd

columns = pd.Index(['A_0', 'A_1', 'A_2', 'A_3', 'B_0', 'B_1', 'B_2', 'B_3'], dtype='string')
values = np.array([[ 3.,  4.,  5.,  6., np.nan,  1.,  4.,  5.],
                 [ 8.,  5.,  3.,  1.,  0.,  8.,  6.,  4.],
                 [ 7.,  9.,  1.,  6.,  1.,  0.,  2.,  3.],
                 [ 5., np.nan,  9.,  1.,  0.,  3.,  8.,  3.]],
                dtype=float)
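## Or retrieve them from the raw DataFrame if it already exists
## (leave out columns without a "_" suffix, such as Cycle, before splitting)
# columns = df_raw.columns.drop("Cycle")
# values = df_raw.drop(columns="Cycle").values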

## Construct MultiIndex
mi = pd.MultiIndex.from_tuples((s.split("_") for s in columns))

## Construct DataFrame
df = pd.DataFrame(values, columns=mi)

## reshape: stack level=1 (2nd row) of columns to index
df_result = df.stack(level=1)

>>> df_result
       A    B
0 0  3.0  NaN
  1  4.0  1.0
  2  5.0  4.0
  3  6.0  5.0
1 0  8.0  0.0
  1  5.0  8.0
  2  3.0  6.0
  3  1.0  4.0
2 0  7.0  1.0
  1  9.0  0.0
  2  1.0  2.0
  3  6.0  3.0
3 0  5.0  0.0
  1  NaN  3.0
  2  9.0  8.0
  3  1.0  3.0
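The code above works only on the A_*/B_* columns, so Cycle does not appear in df_result. Below is a minimal sketch of one way to carry Cycle through the same stack approach. It assumes the raw data from the question is held in a DataFrame called df_raw; the names wide and df_long are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question, including the Cycle column.
df_raw = pd.DataFrame({
    "Cycle": [1, 1, 2, 3],
    "A_0": [3, 8, 7, 5], "A_1": [4, 5, 9, np.nan],
    "A_2": [5, 3, 1, 9], "A_3": [6, 1, 6, 1],
    "B_0": [np.nan, 0, 1, 0], "B_1": [1, 8, 0, 3],
    "B_2": [4, 6, 2, 8], "B_3": [5, 4, 3, 3],
})

# Move Cycle into the index. append=True keeps the original row number,
# so the two rows with Cycle == 1 stay distinct.
wide = df_raw.set_index("Cycle", append=True)

# Split "A_0" into ("A", "0") so the suffix becomes its own column level.
wide.columns = pd.MultiIndex.from_tuples(
    [tuple(name.split("_")) for name in wide.columns]
)

# Stack the suffix level into the index, then turn Cycle back into a
# regular column and discard the remaining index levels.
df_long = (
    wide.stack(level=1)
        .reset_index(level="Cycle")
        .reset_index(drop=True)
)
print(df_long)
```

Recent pandas versions may emit a FutureWarning about the stack implementation here; the reshaped values are the same either way for this data.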

Explain

Steps:

Construct MultiIndex from flat Index

pandas provides four built-in methods to construct a MultiIndex; this answer uses from_tuples (doc: https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.from_tuples.html)
- from_arrays :: input [[x1, x2, ...], [y1, y2, ...]] (one array per level), output [(x1, y1), (x2, y2), ...]
- from_tuples :: input [(x1, y1), (x2, y2), ...], output the same
- from_frame :: turns each row of a DataFrame into one MultiIndex entry
- from_product :: input is a list of arrays like from_arrays, but the output is their Cartesian product, e.g. input [['x1', 'x2'], ['y1', 'y2', 'y3']] gives
MultiIndex([('x1', 'y1'), ('x1', 'y2'), ('x1', 'y3'), ('x2', 'y1'), ('x2', 'y2'), ('x2', 'y3')], )
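To make the four constructors concrete, here is a small sketch that builds the same kind of (name, suffix) labels with each of them. The label values and the variable names (mi_arrays, mi_tuples, frame, mi_frame, mi_product) are made up for illustration.

```python
import pandas as pd

# from_arrays: one list per level, all of the same length.
mi_arrays = pd.MultiIndex.from_arrays([["A", "A", "B"], ["0", "1", "0"]])

# from_tuples: one tuple per entry.
mi_tuples = pd.MultiIndex.from_tuples([("A", "0"), ("A", "1"), ("B", "0")])

# from_frame: one row per entry, one column per level
# (the column names become the level names).
frame = pd.DataFrame({"name": ["A", "A", "B"], "num": ["0", "1", "0"]})
mi_frame = pd.MultiIndex.from_frame(frame)

# from_product: every combination of the given level values.
mi_product = pd.MultiIndex.from_product([["A", "B"], ["0", "1"]])

print(mi_arrays, mi_tuples, mi_frame, mi_product, sep="\n")
```

The first three calls describe the same three entries in different shapes, while from_product generates all four combinations of its inputs.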

Construct new DataFrame and reshape by stack

See the User Guide section on reshaping and pivoting: https://pandas.pydata.org/docs/user_guide/reshaping.html

score 1 · Answer 2 · answered May 14 '22 at 07:41

You can use pandas.wide_to_long:

(pd.wide_to_long(df.reset_index(), stubnames=['A', 'B'], i=['index','Cycle'], j='x', sep='_')
   .droplevel(['index', 'x'])
 )

Output:

         A    B
Cycle          
1      3.0  NaN
1      4.0  1.0
1      5.0  4.0
1      6.0  5.0
1      8.0  0.0
1      5.0  8.0
1      3.0  6.0
1      1.0  4.0
2      7.0  1.0
2      9.0  0.0
2      1.0  2.0
2      6.0  3.0
3      5.0  0.0
3      NaN  3.0
3      9.0  8.0
3      1.0  3.0

answered May 14 '22 at 07:41

mozway


I got this error: *KeyError: "['index'] not in index"* – mathew May 14 '22 at 11:49
What is the full error? Does your index have a name in the real dataset? Then you need to adapt to use this name instead of "index" – mozway May 14 '22 at 11:55
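One plausible cause of that KeyError, in line with the comment above, is an index that already has a name: reset_index() then creates a column with that name and not one called "index". The sketch below avoids depending on the index name by adding an explicit row id first. The index name "sample_id", the column name "row_id", and the variable names tmp and df_long are hypothetical and chosen only for this example.

```python
import numpy as np
import pandas as pd

# Data from the question, with a hypothetical named index.
df = pd.DataFrame(
    {
        "Cycle": [1, 1, 2, 3],
        "A_0": [3, 8, 7, 5], "A_1": [4, 5, 9, np.nan],
        "A_2": [5, 3, 1, 9], "A_3": [6, 1, 6, 1],
        "B_0": [np.nan, 0, 1, 0], "B_1": [1, 8, 0, 3],
        "B_2": [4, 6, 2, 8], "B_3": [5, 4, 3, 3],
    },
    index=pd.Index([10, 11, 12, 13], name="sample_id"),
)

# Replace the index with a fresh 0..n-1 range and expose it as a column
# named "row_id", whatever the original index was called.
tmp = df.reset_index(drop=True).rename_axis("row_id").reset_index()

# row_id + Cycle uniquely identify each wide row, as wide_to_long requires.
df_long = (
    pd.wide_to_long(tmp, stubnames=["A", "B"], i=["row_id", "Cycle"], j="x", sep="_")
      .droplevel(["row_id", "x"])
)
print(df_long)
```

As in the answer, Cycle ends up as the index; calling df_long.reset_index() afterwards turns it back into a regular column.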
