How can I return the first item of a split in a lambda in python?

Question

I need to split a DataFrame column and put the first item of each split into a new column, using a lambda function. I can't figure out how to do that.

df['Reason'] = df['title'].apply(lambda x: x.split(':'))

I'm getting this for now:

df['Reason'].head()

0     [EMS,  BACK PAINS/INJURY]
1    [EMS,  DIABETIC EMERGENCY]
2        [Fire,  GAS-ODOR/LEAK]
3     [EMS,  CARDIAC EMERGENCY]
4             [EMS,  DIZZINESS]

and I'd like:

df['Reason'].head()

0     [EMS]
1     [EMS]
2     [Fire]
3     [EMS]
4     [EMS]

score 2 · Answer 1 · answered Apr 11 '19 at 00:30


I am using str.findall with regex here

df.text.str.findall(r"^\w+").str[0]
0     abc
1     foo
2    test
3     NaN
Name: text, dtype: object
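Applied to the question's data, the same idea might look like the sketch below. The sample titles are reconstructed from the output shown in the question, so the exact strings are an assumption. The pattern `^\w+` grabs the leading run of word characters, which stops at the colon.

```python
import numpy as np
import pandas as pd

# Hypothetical sample modelled on the question's 'title' column.
df = pd.DataFrame({'title': ['EMS: BACK PAINS/INJURY',
                             'Fire: GAS-ODOR/LEAK',
                             np.nan]})

# findall returns a list of matches per row; .str[0] takes the first match.
# Missing values stay NaN instead of raising.
df['Reason'] = df['title'].str.findall(r'^\w+').str[0]
print(df['Reason'].tolist())
```

Note that `\w+` would cut a category short if it contained a space or a hyphen, so this only suits single-word prefixes.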

BENY


score 1 · Accepted Answer · answered Apr 11 '19 at 00:15

import numpy as np
import pandas as pd

df = pd.DataFrame({'text': ['abc xyz', 'foo bar', 'test', np.nan]})
df

      text
0  abc xyz
1  foo bar
2     test
3      NaN

Use any str method. For example, str.split:

df['text'].str.split(n=1).str[0]

0     abc
1     foo
2    test
3     NaN
Name: text, dtype: object

Or str.partition:

df['text'].str.partition(' ')[0]

0     abc
1     foo
2    test
3     NaN
Name: text, dtype: object
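For the question's colon-separated titles, the two methods above could be used as in this sketch. The sample rows are assumed from the question's output, and ':' is passed explicitly because the default separator is whitespace.

```python
import numpy as np
import pandas as pd

# Hypothetical sample modelled on the question's 'title' column.
df = pd.DataFrame({'title': ['EMS: BACK PAINS/INJURY',
                             'Fire: GAS-ODOR/LEAK',
                             np.nan]})

# Split at most once on ':' and keep the first piece.
by_split = df['title'].str.split(':', n=1).str[0]

# partition returns a three-column DataFrame (before, separator, after);
# column 0 holds the text before the first ':'.
by_partition = df['title'].str.partition(':')[0]

df['Reason'] = by_split
print(df)
```

Both variants leave the NaN row as NaN.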

The methods above make working with NaNs easy. apply will fail here:

df['text'].apply(lambda x: x.split(':')[0])
# ---------------------------------------------------------------------------
# AttributeError                            Traceback (most recent call last)
# AttributeError: 'float' object has no attribute 'split'

An isinstance check fixes this:

df['text'].apply(lambda x: x.split(None, 1)[0] if isinstance(x, str) else np.nan)

0     abc
1     foo
2    test
3     NaN
Name: text, dtype: object
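If the lambda gets hard to read, the same check can live in a small named function. This is only a sketch: `first_part` is a made-up helper name, and the sample titles are assumed from the question.

```python
import numpy as np
import pandas as pd

def first_part(value, sep=':'):
    """Return the text before the first `sep`, or NaN for non-strings."""
    if isinstance(value, str):
        return value.partition(sep)[0]
    return np.nan

# Hypothetical sample modelled on the question's 'title' column.
df = pd.DataFrame({'title': ['EMS: BACK PAINS/INJURY',
                             'Fire: GAS-ODOR/LEAK',
                             np.nan]})

df['Reason'] = df['title'].apply(first_part)
print(df['Reason'].tolist())
```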

Hi, would you mind taking a look at https://stackoverflow.com/questions/55616929/ffill-weird-behavior-when-have-the-duplicate-columns-names? I think it is a bug, but I do not know why apply works there. – BENY, Apr 11 '19 at 00:35

score 1 · Answer 3 · answered Apr 11 '19 at 00:49


If you already have a column filled with lists, take the first element directly:

df['Reason'].str[0]

or

df['Reason'].str.get(0)

Output (each value is the first element itself, not a one-element list):

0     EMS
1     EMS
2    Fire
3     EMS
4     EMS
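To see what the accessor returns on a list column, here is a minimal sketch. The `parts` column name and the sample titles are made up for illustration. Each result is the first element of the list, so it is a plain string.

```python
import pandas as pd

# Hypothetical sample modelled on the question's 'title' column.
df = pd.DataFrame({'title': ['EMS: BACK PAINS/INJURY',
                             'Fire: GAS-ODOR/LEAK']})

# Build a column of lists, as in the question's intermediate result.
df['parts'] = df['title'].str.split(':')

# Both forms index into each list element-wise.
first_a = df['parts'].str[0]
first_b = df['parts'].str.get(0)
print(first_a.tolist(), first_b.tolist())
```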


rafaelc


@Wen-Ben I guess yes. I just used the `df['Reason'].head()` as reference ;p – rafaelc Apr 11 '19 at 01:31

score 0 · Answer 4 · answered Apr 11 '19 at 00:06

Take the first item of the list returned by split():

df['Reason'] = df['title'].apply(lambda x: x.split(':')[0])

For extra credit, tell split() to split only once, so that it doesn't bother splitting off more items only to throw them away:

df['Reason'] = df['title'].apply(lambda x: x.split(':', 1)[0])

Or use partition() instead:

df['Reason'] = df['title'].apply(lambda x: x.partition(':')[0])
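The difference between the three variants shows up on a single string. The sketch below uses a made-up title with two colons, plus one with no colon at all.

```python
# Hypothetical title containing two colons.
title = 'EMS: CARDIAC EMERGENCY: FOLLOW-UP'

all_parts = title.split(':')      # splits at every colon
one_split = title.split(':', 1)   # splits at the first colon only
parted = title.partition(':')     # always a 3-tuple: (before, sep, after)

print(all_parts)   # ['EMS', ' CARDIAC EMERGENCY', ' FOLLOW-UP']
print(one_split)   # ['EMS', ' CARDIAC EMERGENCY: FOLLOW-UP']
print(parted)      # ('EMS', ':', ' CARDIAC EMERGENCY: FOLLOW-UP')

# With no separator present, all three still yield the whole string first.
no_sep = 'NO SEPARATOR'
print(no_sep.split(':')[0], '|', no_sep.partition(':')[0])
```

In every case index 0 is the text before the first colon, so the choice mainly affects how much extra work is done.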

edited Apr 11 '19 at 00:08

answered Apr 11 '19 at 00:06

kindall


`df['title'].str.split(':', n=1).str[0]` – cs95 Apr 11 '19 at 00:07
P.S.: [Avoid the use of `apply` as much as possible.](https://stackoverflow.com/questions/54432583/when-should-i-ever-want-to-use-pandas-apply-in-my-code) – cs95 Apr 11 '19 at 00:08
