0

I am trying to wrap text at certain point e.g. | in a cell and export that to html.

An example:

import pandas as pd
df = pd.DataFrame({'EmployeeId': ['157', '292', '643', '124', '355'],
                     'City': ['Mumbai|Bangalore', 'Pune|Mumbai|Delhi', 'Mumbai|Bangalore', 'Mumbai|Pune', 'Bangalore']})

print(df)
df.to_html('test1.html')

Output:

   EmployeeId      City
0  157        Mumbai|Bangalore 
1  292        Pune|Mumbai|Delhi
2  643        Mumbai|Bangalore 
3  124        Mumbai|Pune      
4  355        Bangalore     

I would have a html file like this (expected):

Image

Output:

   EmployeeId  City
0  157        Mumbai
              Bangalore 
1  292        Pune
              Mumbai
              Delhi
2  ...         ...  

Any help will be much appreciated.

taf66
  • 1
  • 3

3 Answers3

1

Basically the idea would be to use str.split() followed by explode(). Something like the follwoing code should help.

(df.set_index(['EmployeeId']).apply(lambda x:x.str.split('|').explode()).reset_index())   

The output would be like

  EmployeeId       City
0        157     Mumbai
1        157  Bangalore
2        292       Pune
3        292     Mumbai
4        292      Delhi
5        643     Mumbai
6        643  Bangalore
7        124     Mumbai
8        124       Pune
9        355  Bangalore
Amrith Krishna
  • 2,768
  • 3
  • 31
  • 65
  • You may refer to the related question here: https://stackoverflow.com/questions/50731229/splt-cell-into-multiple-rows-in-pandas-dataframe – Amrith Krishna Apr 28 '20 at 13:28
  • (df.set_index(['EmployeeId']).apply(lambda x:x.str.split('|').explode()).reset_index()).groupby(['EmployeeId','City']).count().to_html('test1.html') I think little additional code may help with this example – GIRISH kuniyal Apr 28 '20 at 13:34
1

Let us do

yourdf=df.City.str.split('|').explode().to_frame('City').join(df[df.columns.difference(['City'])])
        City EmployeeId
0     Mumbai        157
0  Bangalore        157
1       Pune        292
1     Mumbai        292
1      Delhi        292
2     Mumbai        643
2  Bangalore        643
3     Mumbai        124
3       Pune        124
4  Bangalore        355
BENY
  • 317,841
  • 20
  • 164
  • 234
0

Many thanks for your helps!

I try this way. I am sure there are better codes.

import pandas as pd # Import the data
df = pd.DataFrame({'EmployeeId': ['157', '292', '643', '124', '355'],
                   'City': ['Mumbai|Bangalore', 'Pune|Mumbai|Delhi', 'Mumbai|Bangalore', 'Mumbai|Pune', 'Bangalore']})
#print(df)

df1 = df["City"].str.split("|", expand=True).stack().reset_index(level=1, drop=True)
#print(df1)

df2 = pd.concat([df, df1], axis=1, sort=False)
#print(df2)

df2 = df2.drop(["City"], axis=1)
#print(df2)

df2.loc[(df2["EmployeeId"].duplicated()  ), ["EmployeeId"]] = ''
df2.columns = ['EmployeeId', 'City New']
print(df2)

df2.to_html('test1.html')

Output:

  EmployeeId   City New
0        157     Mumbai
0             Bangalore
1        292       Pune
1                Mumbai
1                 Delhi
2        643     Mumbai
2             Bangalore
3        124     Mumbai
3                  Pune
4        355  Bangalore

When export it to_html i get this:

Image 1: With Grid

Is there anyway to export it like this without grid (maybe with styles)?

Image 2: Without Grid

taf66
  • 1
  • 3