I am working with a dataframe created by importing a .csv file I created. I want to (1) create a new column in the dataframe and (2) use values from an existing column to assign a value to the new column. This is an example of what I'm working with:
date id height gender
dd/mm/yyyy 1A 6 M
dd/mm/yyyy 2A 4 F
dd/mm/yyyy 1B 1 M
dd/mm/yyyy 2B 7 F
So I want to make a new column "side" and make that side have the value "A" or "B" based on the existing "id" column value:
date id height gender side
dd/mm/yyyy 1A 6 M A
dd/mm/yyyy 2A 4 F A
dd/mm/yyyy 1B 1 M B
dd/mm/yyyy 2B 7 F B
I have gotten to a point where I have been able to make the new column and assign a new value but when I attempt to use the .groupby
method on the "side" column it doesn't work as expected.
df = pd.read_csv("clean.csv")
df = df.drop(["Unnamed: 0"], axis=1)
df["side"] = ""
df.columns = ["date", "id", "height", "gender", "side"]
for i, row in df.iterrows():
if "A" in row["id"]:
df.at[i, row["side"]] = "A"
else:
df.at[i, row["side"]] = "B"
df["side"]
calling df["side"]
results in blank output, but calling df
by itself produces this:
So there is a value in the dataframe, but using the .groupby method treats the values in the side column as not existing. This is a real headscratcher. I'm new to Python and would appreciate if someone could explain to me what I'm doing wrong.