I have a CSV loaded into a pandas DataFrame. One of the columns contains a semicolon-delimited list of words like

Beach holiday;Plenty of space;Pool

and I would like to turn this into an array or collection like ["Beach holiday","Plenty of space","Pool"]

Alternatively, I could create a new column derived from the original.

Thank you!

Nick Hurt

3 Answers

The recommended solution, especially if each string has the same number of ; separators, is to split the column into a DataFrame of object-dtype Series, with each element a single string:

import pandas as pd

df = pd.DataFrame({'A': ['Beach holiday;Plenty of space;Pool',
                         'Mountain holiday;Plenty of grey;Ice']})

df = df['A'].str.split(';', expand=True)

print(df)

                  0                1     2
0     Beach holiday  Plenty of space  Pool
1  Mountain holiday   Plenty of grey   Ice

Creating a series of lists, the alternative, is not recommended; it involves a nested layer of pointers.
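
For comparison, since the question asks for a list per row, here is a minimal sketch of that alternative. It assumes the same kind of sample data as above, with rows of unequal length; the column name A_list is only illustrative and does not come from the original post.

```python
import pandas as pd

df = pd.DataFrame({'A': ['Beach holiday;Plenty of space;Pool',
                         'Mountain holiday;Ice']})

# Without expand=True, str.split keeps one collection of strings per row
# instead of spreading the parts over separate columns.
df['A_list'] = df['A'].str.split(';')   # 'A_list' is an illustrative name

first = df.loc[0, 'A_list']                 # parts of the first row
lengths = df['A_list'].apply(len).tolist()  # number of parts in each row
```

This keeps the original column and handles rows with differing numbers of items, at the cost of the nested storage mentioned above.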

jpp

You can do this if you want the column headers in a list:

list(df.columns.values)

or, for the values of a single column:

df[col_name].tolist()

also, check this answer here

Galalen

You can use the converters argument of pd.read_csv to split the column while the file is being read:

from io import StringIO
import pandas as pd

TESTDATA = StringIO("""A,B
1,Beach holiday;Plenty of space;Pool
1,Beach holiday;Plenty of space;Pool
""")
# Split column B on ';' as each cell is parsed
df = pd.read_csv(TESTDATA, converters={'B': lambda x: x.split(';')})
df
Out[147]: 
    A                                       B
0   1  [Beach holiday, Plenty of space, Pool]
1   1  [Beach holiday, Plenty of space, Pool]
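
One caveat with a plain x.split(';') converter is that an empty cell arrives as an empty string, and ''.split(';') yields [''] rather than an empty list. The sketch below shows one possible way to handle that; split_tags is a hypothetical helper, not part of pandas, and the sample data is made up for illustration.

```python
from io import StringIO
import pandas as pd

def split_tags(cell):
    # Hypothetical helper: split on ';', trim whitespace, drop empty parts.
    return [part.strip() for part in cell.split(';') if part.strip()]

# Made-up sample: the second row has an empty B cell.
data = StringIO("A,B\n1,Beach holiday;Plenty of space;Pool\n2,\n")
df = pd.read_csv(data, converters={'B': split_tags})

rows = df['B'].tolist()
```

With this helper the empty cell becomes an empty list, which is usually easier to work with downstream than a list holding one empty string.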
BENY