5

I have a Pandas dataframe column which has data in rows such as below:

col1     
abc
ab23
2345
fgh67@
8980

I need to create 2 more columns col 2 and col 3 as such below:

 col2    col3
 abc      2345
 ab23      8980
 fgh67@   

I have used str.isnumeric(), but thats not helping me in a dataframe column. can someone kindly help?

sayan_sen
  • 345
  • 4
  • 15
  • 1
    refer to these questions https://stackoverflow.com/questions/35376387/extract-int-from-string-in-pandas https://stackoverflow.com/questions/37683558/pandas-extract-number-from-string – Inder Jul 03 '18 at 05:47

1 Answers1

4

Use str.isnumeric or to_numeric with check non NaNs for boolean mask and filter by boolean indexing:

m = df['col1'].str.isnumeric()
#alternative
#m = pd.to_numeric(df['col1'], errors='coerce').notnull()
df = pd.concat([df.loc[~m, 'col1'].reset_index(drop=True), 
                df.loc[m, 'col1'].reset_index(drop=True)], axis=1, keys=('col2','col3'))
print (df)
     col2  col3
0     abc  2345
1    ab23  8980
2  fgh67@   NaN

If want add new columns to existed DataFrame with align by indices:

df['col2'] = df.loc[~m, 'col1']
df['col3'] = df.loc[m, 'col1']
print (df)
     col1    col2  col3
0     abc     abc   NaN
1    ab23    ab23   NaN
2    2345     NaN  2345
3  fgh67@  fgh67@   NaN
4    8980     NaN  8980

Or without align:

df['col2'] = df.loc[~m, 'col1'].reset_index(drop=True)
df['col3'] = df.loc[m, 'col1'].reset_index(drop=True)
print (df)
     col1    col2  col3
0     abc     abc  2345
1    ab23    ab23  8980
2    2345  fgh67@   NaN
3  fgh67@     NaN   NaN
4    8980     NaN   NaN
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252