How to remove leading spaces from strings in a dataseries/list?

Question

I am doing a network analysis with networkx and noticed that some of the nodes are being treated as different nodes just because they have extra leading spaces.

I tried to remove the spaces using the following code, but I cannot seem to turn the output back into strings.

rhedge = pd.read_csv(r"final.edge.csv")
rhedge

_________________
 source | to
 niala  | Sana, Sana
 Wacko  | Ana, Aisa

rhedge['to'][1]
'Sana, Sana'

rhedge['splitted_users2'] = rhedge['to'].apply(lambda x:x.split(','))

#I need to split them so they will be included as different nodes

The problem is with the next code

rhedge['splitted_users2'][1]
['Sana', ' Sana']

As you can see the second Sana has a leading space.

I tried to do this:

split_users = []

for i in rhedge['splitted_users2']:
    row = [x.strip() for x in i]
    split_users.append(row)

pd.Series(split_users)

But when I try to split them by "," again, it won't let me because each entry is now a list. I believe that splitting them would make networkx treat them as one node, as opposed to creating a different node for the one with a leading space.

THANK YOU

Trenton McKinney · Accepted Answer · 2020-05-06T15:31:13.890

Changing the `lambda` expression

import pandas as pd

# dataframe creation
df = pd.DataFrame({'source': ['niala', 'Wacko'], 'to': ['Sana, Sana', 'Ana, Aisa']})

# split and strip with a list comprehension
df['splitted_users2'] = df['to'].apply(lambda x:[y.strip() for y in x.split(',')])

print(df['splitted_users2'][0])

>>> ['Sana', 'Sana']

Alternatively

Option 1

Split on ', ' instead of ','

df['to'] = df['to'].str.split(', ')

Option 2

Replace ' ' with '' and then split on ','
This has the benefit of removing any whitespace on either side of each name (e.g. [' Sana, Sana', ' Ana, Aisa']), but it also removes spaces inside a name, so 'Ana Maria' would become 'AnaMaria'.

df['to'] = df['to'].str.replace(' ', '').str.split(',')
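
A further variant, sketched here on the assumption that some names may themselves contain spaces: strip the ends of each string, then split on a comma together with any whitespace around it. The sample values with extra spaces and the two-word name 'Ana Maria' are made up for illustration and are not from the question.

```python
import pandas as pd

# Sample data modeled on the question; the extra spaces and the
# two-word name are assumptions added for illustration.
df = pd.DataFrame({
    'source': ['niala', 'Wacko'],
    'to': [' Sana,  Sana', 'Ana Maria , Aisa '],
})

# Strip the ends of each string, then split on a comma plus any
# whitespace around it. Spaces inside a name are left alone.
# A pattern longer than one character is treated as a regular expression.
df['splitted_users2'] = df['to'].str.strip().str.split(r'\s*,\s*')

print(df['splitted_users2'].tolist())
# [['Sana', 'Sana'], ['Ana Maria', 'Aisa']]
```

This keeps everything in the vectorized `.str` accessor, so no `lambda` is needed.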

If you want the names split into separate columns, see SO: Pandas split column of lists into multiple columns
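
For the network use case in the question, one row per name may be more convenient than separate columns. A minimal sketch, assuming the goal is an edge list with one (source, target) pair per row; the column name `target` and the variable `edges` are hypothetical, and dropping duplicate pairs is a design choice rather than something the question asks for.

```python
import pandas as pd

# dataframe creation, same data as above
df = pd.DataFrame({'source': ['niala', 'Wacko'], 'to': ['Sana, Sana', 'Ana, Aisa']})

# split and strip with a list comprehension
df['splitted_users2'] = df['to'].apply(lambda x: [y.strip() for y in x.split(',')])

# explode turns each list element into its own row, repeating 'source'
edges = (
    df.explode('splitted_users2')
      .rename(columns={'splitted_users2': 'target'})[['source', 'target']]
      .drop_duplicates()           # 'niala' -> 'Sana' appears twice in the data
      .reset_index(drop=True)
)

print(edges)
```

If the library in use is networkx (an assumption), a frame shaped like this can be handed to `nx.from_pandas_edgelist(edges, 'source', 'target')` to build the graph.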