0

I'm pretty new to Python/Pandas with a majority of my experience being in SQL/SAS. Unfortunately i'm using an old version of Jupyter notebook which doesn't allow me to install the sql package at work.. well without luck currently.

I'm facing the following dataset and trying to do a if-then condition in-conjunction with a group by statement to sum up the groups product limit by HL and Other.

In my SQL mind i would usually do it as if product = 'HL' then limit as HL_LIMIT else limit = BL_LIMIT.

I'm facing great difficulty doing this in Pandas as i'm still relatively new to the language. Hoping someone can guide my on how to complete this problem.

enter image description here

The dummy dataset shows my excel manual calculation that i am expecting but trying to code.

I've been trying to do a for loop statement through 'group name' with a if then statement to create new fields of the HL_LIMIT / BL_LIMIT before completing it with some additional calculations required.

codester_09
  • 5,622
  • 2
  • 5
  • 27
svenvuko
  • 19
  • 6
  • I'm also new to Python and discovered NumPy for highly efficient conditional statements. This video was life-changing: https://www.youtube.com/watch?v=nxWginnBklU&ab_channel=PyGotham2019 – Rycliff Jan 04 '23 at 03:49

2 Answers2

0

import numpy as np

df['limit'] = np.where(df['group name'] == 'HL', df['HL_LIMIT'], df['BL_LIMIT'])

0

If you want output in de-normalized dataframe form, then use following. The logic is explained as comment above each line.

# Categorize values as "HL" and "BL".
df["product"] = df["product"].apply(lambda p: p if p == "HL" else "BL")

# Compute total for each group and product.
df = df.groupby(["group name", "product"]).sum("limit").reset_index()

# Compute total for each group.
df_total = df.groupby("group name").sum("limit").reset_index()
df_total["product"] = "Total"
df = pd.concat([df, df_total]).reset_index(drop=True)

# Transform by group
df = df.pivot(index="product", columns="group name", values="limit").rename_axis("", axis="columns").reset_index()

Output:

  product  company x  company y  company z
0      BL     750000      70000      80000
1      HL     130000     300000     950000
2   Total     880000     370000    1030000

If you want output extactly like your excel, then you need to do some cosmetic transformations:

# Categorize values as "HL" and "BL".
df["product"] = df["product"].apply(lambda p: p if p == "HL" else "BL")

# Compute total for each group and product.
df = df.groupby(["group name", "product"]).sum("limit").reset_index()

# Compute total for each group.
df_total = df.groupby("group name").sum("limit").reset_index()
df_total["product"] = "Total"
df = pd.concat([df, df_total]).reset_index(drop=True)

# Stack each group's block horizontally.
dfs = []
for g in df["group name"].unique():
  df_g = df[df["group name"] == g].drop("group name", axis=1).reset_index(drop=True)
  df_g.columns = [g, ""]
  dfs.append(df_g)

df = pd.concat(dfs, axis=1)

Output:

  company x         company y         company z         
0        BL  750000        BL   70000        BL    80000
1        HL  130000        HL  300000        HL   950000
2     Total  880000     Total  370000     Total  1030000
Azhar Khan
  • 3,829
  • 11
  • 26
  • 32
  • Hi Azhhar, Thank you for the help that worked well. I wanted to ask you for advice on how to tackle the following: In the same dataset, i don't want to count twice the product e.g company a product 'ef'. Is the best way to tackle this additional condition by counting the product column and then setting count = 1? df_count = df["product"].pd.count() df_count == 1 – svenvuko Jan 09 '23 at 12:33
  • I'm sorry, I don't understand the ask - if a product is counted just once then the total/sum will be different and won't match the grand total. Can you create a new question with sample input and expected output? – Azhar Khan Jan 09 '23 at 13:15
  • Hi Azhar, I've posed a new question, please below link. Thank you for your time. https://stackoverflow.com/questions/75079285/how-to-combine-python-loop-with-if-then-statement – svenvuko Jan 11 '23 at 06:56
  • Hi Azhar, posted this question if you have a chance to look at it. Thank You https://stackoverflow.com/questions/75142082/for-loop-with-conditional-statements-not-working-as-expected-pandas – svenvuko Jan 17 '23 at 06:13