Please note: I want to use pandas only. I do not want to use lambda or NumPy.

I have a DataFrame as shown below:

import pandas as pd

df = pd.DataFrame({
    "first-name": ["john","peter","john","alex"],
    "height-ft": [6,5,4,6],
    "shape-type": ["null","null","null","null"]
})

I want to apply this logic:

if first-name == "john" and height-ft == 6:
    shape-type = "good"
else if height-ft == 4:
    shape-type = "bad"
else:
    shape-type = "middle"

So the final DataFrame should look like this:

df = pd.DataFrame({
    "first-name": ["john","peter","john","alex"],
    "height-ft": [6,5,4,6],
    "shape-type": ["good","middle","bad","middle"]
})
halfer
OMID Davami
  • Isn't this the same question that you asked two days ago [here](https://stackoverflow.com/questions/61965890/pandas-dataframes-if-else-condition-on-multiple-columns)? – Ben.T May 25 '20 at 23:00
  • Dear @Ben.T, thanks, but this one is different. I could not use the answers to the previous question because they all used NumPy, and the column names there had no dashes, so I could not solve the problem. – OMID Davami May 25 '20 at 23:03
  • There is no conceivable way to me that you can have a df and not be able to use numpy. – roganjosh May 25 '20 at 23:05
  • `lambda` is just an unnamed function in Python, and `numpy` is a required library for `pandas`, so these are very odd restrictions to place on answers – Randy May 25 '20 at 23:09

3 Answers

1
In [183]: df['shape-type'] = "middle"

In [184]: df.loc[(df['first-name'] == 'john') & (df['height-ft'] == 6), 'shape-type'] = "good"

In [185]: df.loc[df['height-ft'] == 4, 'shape-type'] = "bad"

In [186]: df
Out[186]:
  first-name  height-ft shape-type
0       john          6       good
1      peter          5     middle
2       john          4        bad
3       alex          6     middle
Randy
1

Without numpy, you can do this:

df.loc[(df['first-name'] == 'john') & (df['height-ft'] == 6), 'shape-type'] = 'good'
df.loc[(df['height-ft'] == 4), 'shape-type'] = 'bad'
# "middle" applies to every row that matched neither rule above
df.loc[~((df['first-name'] == 'john') & (df['height-ft'] == 6)) & (df['height-ft'] != 4), 'shape-type'] = 'middle'
print(df)

  first-name  height-ft shape-type
0       john          6       good
1      peter          5     middle
2       john          4        bad
3       alex          6     middle
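
Another pandas-only way to write the same rules, as a minimal sketch: start from the default label and overwrite it with `Series.mask`, which replaces values wherever the condition is True. The helper names `is_good`, `is_bad` and `labels` are only for illustration. The two conditions cannot both hold for one row (height 6 versus height 4), so the order of the `mask` calls does not matter here.

```python
import pandas as pd

df = pd.DataFrame({
    "first-name": ["john", "peter", "john", "alex"],
    "height-ft": [6, 5, 4, 6],
    "shape-type": ["null", "null", "null", "null"],
})

# Boolean Series, one entry per row (illustrative names)
is_good = (df["first-name"] == "john") & (df["height-ft"] == 6)
is_bad = df["height-ft"] == 4

# Every row starts as "middle"; mask() swaps in a label where its condition is True
labels = pd.Series("middle", index=df.index)
labels = labels.mask(is_bad, "bad").mask(is_good, "good")

df["shape-type"] = labels
print(df)
```

No lambda and no explicit NumPy import are needed, and the column names with dashes work because they are only ever used as string keys.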

With np.where:

import numpy as np

df['shape-type'] = np.where((df['first-name']=='john') & (df['height-ft']==6), 'good', 'middle')
df['shape-type'] = np.where((df['height-ft']==4), 'bad', df['shape-type'])

  first-name  height-ft shape-type
0       john          6       good
1      peter          5     middle
2       john          4        bad
3       alex          6     middle
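
If NumPy is acceptable after all, the two `np.where` calls can probably be folded into one `np.select` call. This is only a sketch of that alternative, not part of the approach above: `np.select` takes a list of conditions and a matching list of choices, uses the first condition that is True for each row, and falls back to `default` otherwise, which mirrors the if / else if / else in the question. The names `conditions` and `choices` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "first-name": ["john", "peter", "john", "alex"],
    "height-ft": [6, 5, 4, 6],
    "shape-type": ["null", "null", "null", "null"],
})

# Conditions are checked in order; the first True one decides the label
conditions = [
    (df["first-name"] == "john") & (df["height-ft"] == 6),
    df["height-ft"] == 4,
]
choices = ["good", "bad"]

# Rows matching no condition get the default label
df["shape-type"] = np.select(conditions, choices, default="middle")
print(df)
```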
NYC Coder
0

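You can also write a plain function and call it for each row, using iterrows() to iterate over all rows of df. Note that this example renames the columns with underscores so that they can be read as attributes (row.first_name).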

import pandas as pd

df = pd.DataFrame({
    "first_name": ["john","peter","john","alex"],
    "height_ft": [6,5,4,6],
    "shape_type": ["null","null","null","null"]
})

print(df)

def define_shape_type(first_name, height_ft):
    # Same rules as in the question, checked in order
    if first_name == 'john' and height_ft == 6:
        return "good"
    elif height_ft == 4:
        return "bad"
    else:
        return "middle"

for index, row in df.iterrows():
    # DataFrame.set_value was removed in pandas 1.0; .at sets a single cell by label
    df.at[index, "shape_type"] = define_shape_type(row.first_name, row.height_ft)

print(df)
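
A variation on the same idea, as a sketch: build the whole column in one step with a list comprehension over the two input columns instead of writing cell by cell inside the loop. It reuses the `define_shape_type` function and the underscore column names from this answer, and it needs neither lambda nor NumPy.

```python
import pandas as pd

df = pd.DataFrame({
    "first_name": ["john", "peter", "john", "alex"],
    "height_ft": [6, 5, 4, 6],
    "shape_type": ["null", "null", "null", "null"],
})

def define_shape_type(first_name, height_ft):
    # Same rules as in the question, checked in order
    if first_name == "john" and height_ft == 6:
        return "good"
    elif height_ft == 4:
        return "bad"
    else:
        return "middle"

# zip() pairs the values of the two columns row by row
df["shape_type"] = [
    define_shape_type(name, height)
    for name, height in zip(df["first_name"], df["height_ft"])
]
print(df)
```

With the dashed column names from the question, the same comprehension should work by zipping `df["first-name"]` and `df["height-ft"]` instead.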