0

Say I have the following 2 dimensional dataframe

+--------+-------------------+------------+
| Index, | Module/Line Item, | Is Module, |
+--------+-------------------+------------+
| 0,     | Module 1,         | True,      |
| 1,     | Line Item 1,      | False,     |
| 2,     | Line Item 2,      | False,     |
| 3,     | Module 2,         | True,      |
| 4,     | Line Item 1,      | False,     |
| 5,     | Line Item 2,      | False      |
+--------+-------------------+------------+

And I want it to turn into this:

+----------+-------------+
| Module   | Line Item   |
+----------+-------------+
| Module 1 | Line Item 1 |
|          | Line Item 2 |
| Module 2 | Line Item 1 |
|          | Line Item 2 |
+----------+-------------+

What would be the best way to accomplish that? Is there a built in way to pivot like that via pandas? I was messing with some regular python but it was very inefficient and I couldn't get it to work right.

Note there are not a set number of Line Items between Modules, and no patterns in the names. The "Is Module" column is the only indicator of whether the value is a module and should be pivoted. All Line items that appear beneath the module until the next module should belong to that module when pivoted.

This is not answered by How to pivot a dataframe because it never explains how to split a column into a hierarchy based on the values that are given in another column.

Cdhippen
  • 615
  • 1
  • 10
  • 32
  • `pd.groupby`, `pd.pivot` – G. Anderson Nov 02 '18 at 22:17
  • 1
    Possible duplicate of [How to pivot a dataframe](https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe) – G. Anderson Nov 02 '18 at 22:18
  • This is not answered by How to pivot a dataframe because it never explains how to split a column into a hierarchy based on the values that are given in another column. It only explains how to use a single column as the index instead of another one. I'm not looking to pivot per-se but split a column into a new level based on a boolean value held in column 3. – Cdhippen Nov 04 '18 at 00:54

2 Answers2

1

Your problem isn't pivoting, it's that you have two columns in one column. There may be a more programmatic way to do this, but here's a go at getting it in the right shape and order, then you can group and pivot as necessary.

Create data:

df=pd.DataFrame({'mod_lin':['m1','l1','l2','m2','l1','l2'],'is_mod':[True,False,False,True,False,False]})

    mod_lin is_mod
0   m1      True
1   l1      False
2   l2      False
3   m2      True
4   l1      False
5   l2      False

Make new column for modules, forward fill, remove modules from original column, drop True/False column, reorder columns

df['mods']=np.where(df['is_mod']==True, df['mod_lin'],np.NaN)

df['mods']=df['mods'].fillna(method='ffill')

df=df[df['is_mod']==False]

df.drop('is_mod', axis=1, inplace=True)

df=df[df.columns[::-1]]

    mods    mod_lin
1   m1      l1
2   m1      l2
4   m2      l1
5   m2      l2
G. Anderson
  • 5,815
  • 2
  • 14
  • 21
-1

You can try this:

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.pivot_table.html

With dataframe, you can pivot table.

  • This isn't working for what I want. I want to split a column based on the boolean flag and use the True values as the new index and keep the values under those that are false to the right of those values. – Cdhippen Nov 04 '18 at 00:56