pandas pivot table with multiple information in cells

Question

I am not overly familiar with pandas so this may be a dumb question. I was trying to pivot the following data:

df = pd.DataFrame({
      'Country' : ['country1', 'country2', 'country3', 'country4'],
      'Industry' : ['industry1:\$20 \n industry4:\$30', 
                    'industry10:\$100', 
                    'industry3:\$2 \n industry4:\$30 \n industry12:\$10 \n industry1:\$3',
                    'industry1:\$20 \n industry4:\$30'
                   ],})

(the \n come from an excel extraction)

I need to pivot to have the industries as indices and countries as columns. My intuition is that I need to do some kind of "data unpacking" first on the cells that contain multiple information but I am at a loss at how to do it on pandas.

Thanks all. There are some answers below that work well. I continued to search and found some other posts related to this issue (some people call this issue "exploding pandas rows"). In the thread below, someone wrote a general function explode() that is generic and performs well:

Split (explode) pandas dataframe string entry to separate rows

Load the numeric data as a dataframe, and then pass the Country and Industry into the column and index argument respectively. — Demetri Pananos, Jan 05 '18 at 18:56

jezrael · Accepted Answer · 2018-01-06T11:38:53.473

You can use:

set_index by all columns without Industry
split by regex \s+\n\s+ - \s+ is for 1 or more whitespaces
reshape by stack for Series
again split by different separator
double reset_index, first with remove first level
rename columns

df = (df.set_index(['Country'])['Industry']
        .str.split('\s+\n\s+', expand=True)
        .stack()
        .str.split(r':\\\$', expand=True)
        .reset_index(level=1, drop=True)
        .reset_index()
        .rename(columns={0:'Industry', 1:'Val'})
     )   
print (df)
    Country    Industry  Val
0  country1   industry1   20
1  country1   industry4   30
2  country2  industry10  100
3  country3   industry3    2
4  country3   industry4   30
5  country3  industry12   10
6  country3   industry1    3
7  country4   industry1   20
8  country4   industry4   30

pandas pivot table with multiple information in cells

1 Answers1