I have been going in circles with this question. I am writing a simple Python script in Power BI. I am trying to load a DataFrame and change a particular person's Department based on a conditional column, i.e. if they have a Yes in that column, their Department value should change. I could really use some feedback on whether my code is correct and how I should change the syntax. Specifically, I am getting a syntax error on the name of the Last Name column. This is what I have:

import pandas as pd

final = pd.DataFrame(dataset.loc[:,'Department','Last Name','Employee Promotion'])

for i in pd.final:
    if i in pd.final.loc['Last Name'] = "Carter" and in pd.final.loc['Employee Promotion'] = True:
        new_department = "Admin"
        pd.final(dataset.loc[:,'Department') = new_department
        pass
    ifelse i in pd.final.loc['Last Name'] = "Litwack" and in pd.final.loc['Employee Promotion'] = True: 
        new_department1 = "OAAS"
        pd.final(dataset.loc['Department') = new_department1
    else pd.final.loc['Department']
jkr
  • Hello Paul, welcome to stackoverflow, please read this article https://stackoverflow.com/help/how-to-ask – Nicolas Martinez Jul 23 '20 at 13:53
  • Hello Paul, for future reference, you can use triple backticks to format a block of code, instead of adding single backticks around each individual line. – blackbrandt Jul 23 '20 at 13:57

1 Answer

First off, definitely go through some Python tutorials on Python syntax and if statements. Once you've done that, read some of the Pandas tutorials as well. Those will get you far.

There are several problems in the lines below. First, there is no ifelse statement in Python -- I think you meant elif. Second, = is the assignment operator. For equality, you need to use ==. Also, instead of for i in pd.final:, I think you meant for i in final. Your dataframe is final, not pd.final.

    ifelse i in pd.final.loc['Last Name'] = "Litwack" and in pd.final.loc['Employee Promotion'] = True: 
        new_department1 = "OAAS"
        pd.final(dataset.loc['Department') = new_department1
    else pd.final.loc['Department']
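
For comparison, here is a rough sketch of what a syntactically valid version of the loop could look like. It is only an illustration: the small `dataset` DataFrame is a stand-in for the one Power BI passes to the script, and it assumes that Employee Promotion holds booleans rather than "Yes"/"No" strings. Note that selecting several columns with `.loc` takes a list of names, and that the trailing `else` is simply dropped because rows that match neither condition keep their department.

```python
import pandas as pd

# Stand-in for the `dataset` DataFrame that Power BI provides (assumed columns and values).
dataset = pd.DataFrame(
    {
        "Department": ["ABC", "ABC", "ABC"],
        "Last Name": ["Carter", "Litwack", "Doe"],
        "Employee Promotion": [True, True, False],
    }
)

# Pass a list of column names to select several columns; copy so `dataset` is untouched.
final = dataset.loc[:, ["Department", "Last Name", "Employee Promotion"]].copy()

# Walk the row labels and update one cell at a time.
for idx in final.index:
    last_name = final.loc[idx, "Last Name"]
    promoted = final.loc[idx, "Employee Promotion"]
    if last_name == "Carter" and promoted:
        final.loc[idx, "Department"] = "Admin"
    elif last_name == "Litwack" and promoted:
        final.loc[idx, "Department"] = "OAAS"
    # Rows matching neither condition keep their current department.
```

This works, but it touches one row per iteration, so it gets slow on large tables.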

That should help you resolve syntax errors, but there is a better way to do what you are trying to accomplish. Part of the beauty of Pandas is that it allows for vectorized operations, meaning we don't need to use for loops. Below is how one might set a value only for rows that match a condition.

>>> import pandas as pd

>>> df = pd.DataFrame(
        [
            ["ABC", "Carter", True],
            ["ABC", "Carter", False],
            ["ABC", "Litwack", False],
            ["ABC", "Litwack", True],
            ["ABC", "Doe", False],
            ["ABC", "Doe", True]
        ],
        columns = ["Department", "Last Name", "Employee Promotion"])

>>> df
  Department Last Name  Employee Promotion
0        ABC    Carter                True
1        ABC    Carter               False
2        ABC   Litwack               False
3        ABC   Litwack                True
4        ABC       Doe               False
5        ABC       Doe                True

You can filter based on a set of criteria using boolean indexing, as in the answer https://stackoverflow.com/a/15315507/5666087.

>>> mask = (df.loc[:, "Last Name"] == "Carter") & (df.loc[:, "Employee Promotion"])
>>> print(mask)
0     True
1    False
2    False
3    False
4    False
5    False
dtype: bool

mask above is a boolean Series, where each value indicates whether that row meets our criteria. We can then use it with .loc to select the rows that meet our criteria together with the Department column, and overwrite those values with =.

df.loc[mask, "Department"] = "Admin"
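
Putting the pieces together, a minimal sketch that covers both names from the question might look like the following. It rebuilds the example DataFrame so it can run on its own, and it assumes Employee Promotion is a boolean column; if the real data holds "Yes"/"No" strings, the comparison shown in the comment would be needed instead.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["ABC", "Carter", True],
        ["ABC", "Carter", False],
        ["ABC", "Litwack", False],
        ["ABC", "Litwack", True],
        ["ABC", "Doe", False],
        ["ABC", "Doe", True],
    ],
    columns=["Department", "Last Name", "Employee Promotion"],
)

# Boolean Series marking promoted employees.
# For string data, use: promoted = df["Employee Promotion"] == "Yes"
promoted = df["Employee Promotion"]

# One vectorized assignment per rule; rows matching neither rule are left alone.
df.loc[(df["Last Name"] == "Carter") & promoted, "Department"] = "Admin"
df.loc[(df["Last Name"] == "Litwack") & promoted, "Department"] = "OAAS"

print(df)
```

Only rows 0 and 3 change here, because they are the only rows where the name matches and the promotion flag is True.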
jkr