
I have a DataFrame that holds job details, with columns for Pay, Location, Job Position, Job Description, other information, etc.

However, the value in Pay is sometimes missing and appears in Location instead, for example.

I want to loop through the Location column and use its values to update Pay, but if Pay already has data I want to keep it and only take the value from Location otherwise.

My first thought was to use apply with a lambda, like so, but I can't work out the else branch that would keep the existing value of Pay when Location is blank:

df['Pay'] = df['Location'].apply(lambda x : x if x !='' else [retain the value of Pay] )

What am I doing wrong?

Thank you for reading

Randy Chng
  • What does this mean: "However sometimes the values in pay might not be present and is available in location for example." Are you saying that sometimes the pay information is located in the wrong column? Please provide an example DataFrame. Please see [How to ask: Pandas](https://stackoverflow.com/a/20159305/6298712). – ddejohn Sep 17 '21 at 03:18

2 Answers


replace and fillna Method

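import numpy as np
import pandas as pd

data = {
    'pay': [2, 4, 6, 8],
    'location': ['', 'MA', 'CA', ''],
}
df = pd.DataFrame(data)

# Turn blank locations into missing values, then fill those gaps from pay
df['pay'] = df.location.replace({'': None}).fillna(df.pay)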

np.where Method

df['pay'] = np.where(df.location.ne(''), df.location, df.pay)

Output:

print(df['pay'])
0     2
1    MA
2    CA
3     8
Name: pay, dtype: object
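
As a quick check, here is a self-contained sketch that runs the np.where rule on the same sample data and compares it with Series.mask, which is an alternative not mentioned above. The names via_where and via_mask are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the answer above; '' marks a blank location.
df = pd.DataFrame({
    "pay": [2, 4, 6, 8],
    "location": ["", "MA", "CA", ""],
})

# Take location where it is non-blank, otherwise keep the existing pay.
via_where = np.where(df["location"].ne(""), df["location"], df["pay"])

# Same rule with Series.mask: replace pay wherever the condition holds.
# Casting to object first lets the column hold both numbers and strings.
via_mask = df["pay"].astype(object).mask(df["location"].ne(""), df["location"])

print(list(via_where))
print(list(via_mask))
```

Both variants keep pay only in the rows where location is blank.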
ashkangh
  • As an aside, are there any resources, books or otherwise, that you can point me to in order to improve my Python? Thanks – Randy Chng Sep 17 '21 at 08:28
  • There are some websites like Udemy and Coursera. They have pretty decent courses in Python, pandas, etc. There are also some introductory books like Python Crash Course for gaining some rudimentary knowledge. After finishing some of this material, I think you will be able to find your way to more sophisticated, higher-level resources. Wish you luck in this lovely journey! – ashkangh Sep 17 '21 at 14:56

If the missing Pay values are NaN, you can use loc with a boolean mask:

df.loc[df["Pay"].isnull(),'Pay'] = df['Location'] 

Otherwise, if the missing values are empty strings:

df.loc[df["Pay"] == '','Pay'] = df['Location']
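
If both kinds of missing value can occur, the two masks can be combined. The following is a minimal sketch on made-up data (the values are hypothetical, since the question does not include a sample DataFrame); it assumes that a row with missing Pay carries the pay value in Location.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Pay is missing (NaN or '') in some rows and the
# value was entered under Location instead.
df = pd.DataFrame({
    "Pay": ["50k", np.nan, "", "70k"],
    "Location": ["NY", "60k", "65k", "TX"],
})

# Treat both NaN and the empty string as "missing".
missing = df["Pay"].isna() | df["Pay"].eq("")

# Assigning a Series through .loc aligns on the index, so each masked
# row receives the Location value from that same row.
df.loc[missing, "Pay"] = df["Location"]

print(df["Pay"].tolist())
```

Rows that already have a Pay value are left untouched, which matches the "retain it" requirement in the question.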
Deven Ramani