I currently iterate through the rows of an Excel file multiple times and write "XYZ" to a new column when a row meets certain conditions.

My current code is:

 df["new_column"] = np.where(fn == True, "XYZ", "")

The issue is that when the fn == True condition is not satisfied, I want to do absolutely nothing and move on to checking the next row of the Excel file. I noticed that on each iteration, the empty string replaces the "XYZ" values that are already marked in the file. Is there a way to prevent this from happening? Is there something I can use instead of the empty string ("") to avoid overwriting?

Edit:

My dataframe is a huge financial Excel file with many columns and rows. The data set has columns like quantity, revenue, sales, etc. Basically, I have a list that contains about 50 conditionals. For each condition, I iterate through all the rows in the Excel file, and for each row that matches the condition, I want to put an "XYZ" in df["new_column"] to flag that row. df["new_column"] is a column added to the original dataframe. Then I move on to the next condition, up to the 50th conditional.

I think the problem is that the way I wrote the code replaces any existing "XYZ" with an empty string when I proceed to check the other conditionals in the list. Basically, I want to find a way to lock "XYZ" in so it can't be overwritten.

fn is a helper function that returns a boolean depending on whether the condition matches a row in the dataframe. While I iterate, if the condition matches a row, the function returns True and df["new_column"] is marked with "XYZ". The helper function takes multiple arguments to check whether the current condition matches any of the rows in the dataframe. I hope this explanation helps!

Brian Kim

2 Answers

You can try using a lambda.

First, create the function:

def checkIfTrue(FN, new):
    # Flag the row when the condition holds
    if FN:
        return "XYZ"
    # Otherwise keep whatever the column already contains
    return new

Then apply it to the new column like this:

df['new_column'] = df.apply(lambda row: checkIfTrue(row["fn"], row["new_column"]), axis=1)
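
For what it's worth, the same "keep the old value unless the condition holds" idea can also be written with np.where from the question by passing the existing column as the fallback instead of "". This is only a sketch: the sample data, the column names, and the masks list are made up for illustration, and it assumes each condition can be expressed as a boolean Series aligned with df.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({"quantity": [5, 50, 500], "revenue": [10, 100, 1000]})
df["new_column"] = ""  # start with no flags

# Two hypothetical conditions, each a boolean Series aligned with df.
masks = [df["quantity"] > 100, df["revenue"] < 50]

for mask in masks:
    # Where the mask is False, fall back to the value already in the column,
    # so flags set by earlier conditions are not wiped out.
    df["new_column"] = np.where(mask, "XYZ", df["new_column"])

print(df)
```

Rows flagged by the first mask stay flagged after the second pass, because the third argument of np.where is the current column rather than an empty string.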
Yuval Raz

IIUC you want to use .loc[]:

df.loc[fn, "new_column"] = 'XYZ'
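
As a rough sketch of how the .loc assignment might fit the multi-condition loop described in the question: the data, the column names, and the conditions list below are hypothetical, and the sketch assumes each condition yields a boolean mask with one entry per row.

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({"quantity": [5, 50, 500], "sales": [1, 2, 3]})
df["new_column"] = ""  # start with no flags

# Hypothetical stand-ins for the list of conditions: each callable
# takes the dataframe and returns a boolean Series (one value per row).
conditions = [
    lambda d: d["quantity"] > 100,
    lambda d: d["sales"] == 1,
]

for cond in conditions:
    # Only rows where the mask is True are assigned; all other rows
    # are left untouched, so earlier "XYZ" flags survive.
    df.loc[cond(df), "new_column"] = "XYZ"

print(df)
```

Because .loc with a boolean mask writes only to the selected rows, there is no "else" value to choose and nothing gets overwritten.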
MaxU - stand with Ukraine