Python Pandas mlutiplication witihin one column base on a condition in other column

Question

I am trying to multiply quantity defined as column ['TRGQTY'] within different rows in my DataFrame based on the condition defined for other columns. I want to take the ['TRGQTY'] where ['SUBPACKNR'] is not blank and multiply it with ['TRGQTY'] of all rows that containe in column ['BALICI_INSTRUKCE'] == 'PV%+1'(this column express the level -> PV2 = level 2 or PV3 = level 3).

import pandas as pd
df = pd.DataFrame({'DISPO': ['580', '580', '580','580', '580', '580'],
                   'MATNR_FERT': ['04440210721VL', '04440210721VL', '04440210721VL','04440210721VL', '04440210721VL', '04440210721VL'],
                   'STATUS_FERT': ['40', '40', '40','40', '40', '40'],
                   'PACKNR': ['GgZK{OS27kYc{uX22fOQNW', 'GgZK{OS27kYc{uX22fOQNW', 'GgZK{OS27kYc{uX22fOQNW','GgZK{OS27kYc{uX22fOQNW', 'GgZK{OS27kYc{uX22fOQNW', 'GgZK{OS27kYc{uX22fOQNW'],
                   'BALICI_INSTRUKCE': ['PV2', 'PV1', 'PV2','PV1', 'PV1', 'PV2'], 'ČÍSLO_MATERIÁLU': ['6000146128', '6000129044', '6099504147','6000100412', '', '04440210721VL'],
                   'SUBPACKNR': ['', '', '','', 'GgZK{OS27kYc{uX22fOQNW', ''],
                   'TRGQTY': ['1', '1', '1','1', '60', '20'],
                   'BASEUNIT': ['ST', 'ST', 'ST','ST', 'ST', 'ST'] })

data_csv = [[DISPO, MATNR_FERT, STATUS_FERT, PACKNR, BALICI_INSTRUKCE, ČÍSLO_MATERIÁLU, SUBPACKNR, TRGQTY, BASEUNIT], [580, 04440210721VL, 40 ,GgZK{OS27kYc{uX22fOQNW, PV2, 6000146128, 1, ST], [580, 04440210721VL, 40, GgZK{OS27kYc{uX3IyhQNW, PV1, 6000129044, 1, ST], [580, 04440210721VL, 40, GgZK{OS27kYc{uX22fOQNW, PV2, 6099504147, 1, ST], [580, 04440210721VL, 40, GgZK{OS27kYc{uX3IyhQNW, PV1, 6000100412, 1, ST], [580, 04440210721VL, 40, GgZK{OS27kYc{uX3IyhQNW, PV1, GgZK{OS27kYc{uX22fOQNW, 60, ST], [580, 04440210721VL, 40, GgZK{OS27kYc{uX22fOQNW, PV2, 04440210721VL, 20, ST]]

enter image description here

This is my code, but the multiplication does not work correctly.

df_cislodilu = [data_csv]

nasobky = {}


for i in pd.unique(df_cislodilu['MATNR_FERT']):
    df_subset = df_cislodilu[df_cislodilu['MATNR_FERT'] == i]
    multiplicator = df_subset[~df_subset['SUBPACKNR'].isna()] #nefunguje pro výběr násobku
    level = df_subset[df_subset['BALICI_INSTRUKCE'] == 'PV2'] # or 'PV%'+1
    final = level['TRGQTY'].prod()
    nasobky[i] = df_subset['TRGQTY'] * final
    
    nasobky = pd.DataFrame(nasobky)



print(nasobky)

This is the result of this code enter image description here

This is what I need: enter image description here

Sourcecode for more 'MATNR_FERT':

df = pd.DataFrame({ 'DISPO': ['580', '580', '580', '580', '580', '580', '580', '580', '580', '580', '580', '580'], 'MATNR_FERT': ['04440210721VL', '04440210721VL', '04440210721VL', '04440210721VL', '04440210721VL', '04440210721VL', '04440210722A5', '04440210722A5', '04440210722A5', '04440210722A5', '04440210722A5', '04440210722A5'], 'STATUS_FERT': ['40', '40', '40', '40', '40', '40', '40', '40', '40', '40', '40', '40'], 'PACKNR': ['GgZK{OS27kYc{uX22fOQNW', 'GgZK{OS27kYc{uX22fOQNW', 'GgZK{OS27kYc{uX22fOQNW', 'GgZK{OS27kYc{uX22fOQNW', 'GgZK{OS27kYc{uX22fOQNW', 'GgZK{OS27kYc{uX22fOQNW', 'GgZK{OS27kYc{uX22fOQNM', 'GgZK{OS27kYc{uX22fOQNM', 'GgZK{OS27kYc{uX22fOQNM', 'GgZK{OS27kYc{uX22fOQNM', 'GgZK{OS27kYc{uX22fOQNM', 'GgZK{OS27kYc{uX22fOQNM'], 'BALICI_INSTRUKCE': ['PV2', 'PV1', 'PV2', 'PV1', 'PV1', 'PV2', 'PV2', 'PV1', 'PV2', 'PV1', 'PV1', 'PV2'], 'ČÍSLO_MATERIÁLU': ['6000146128', '6000129044', '6099504147', '6000100412', '', '04440210721VL', '6000146128', '6000129044', '6099504147', '6000100412', '', '04440210721VL'], 'SUBPACKNR': ['', '', '', '', 'GgZK{OS27kYc{uX22fOQNW}', '', '', '', '', '', 'GgZK{OS27jcWY}lueMkCeW', ''], 'TRGQTY': ['1', '1', '1', '1', '60', '20', '1', '1', '1', '1', '60', '20'], 'BASEUNIT': ['ST', 'ST', 'ST', 'ST', 'ST', 'ST', 'ST', 'ST', 'ST', 'ST', 'ST', 'ST'] })

Code for multiplication:

enter code here
# ensure having numeric data
df['TRGQTY'] = df['TRGQTY'].astype(int)
#sort values by ['part_number']
df = df.sort_values(by=['part_number'])
#create subset table and operate within each small subset table with same PART_NUMBER
df = df['part_number'].unique()
# is BALICI_INSTRUKCE PV1?
m1 = df['BALICI_INSTRUKCE'].eq('PV1')
# is BALICI_INSTRUKCE PV2/PV3/PV4?
m2 = df['BALICI_INSTRUKCE'].eq('PV2')
m3 = df['BALICI_INSTRUKCE'].eq('PV3')
m4 = df['BALICI_INSTRUKCE'].eq('PV4')
# is SUBPACKNR not an empty string?
m5 = df['SUBPACKNR'].ne('')
# if m1 and m3 are True, get the first value
ref = df.loc[m1&m5, 'TRGQTY'].iloc[0] # 60
# multiply the PV2 by the reference
df.loc[m2|m3|m4, 'TRGQTY'] *= ref
print(df)

please don't use images of data, provide a fully reproducible input and the matching expected output — mozway, Apr 25 '23 at 08:30
agree with @mozway give us a representative input dataframe (made up data, sanitized data ...) and a piece of python that is standalone and we can use to reproduce. Pictures of data are useless : we need the actual data to help you. And of course we have no clue of what is `C:/Users/kbm2bj/Documents/DA_2022_Python/Machine Learning/course/přehození_sloupců.csv"` — LoneWanderer, Apr 25 '23 at 08:39
This is the table I want to transform with python: DISPO MATNR_FERT STATUS_FERT PACKNR BALICI_INSTRUKCE ČÍSLO_MATERIÁLU SUBPACKNR TRGQTY BASEUNIT 580 04440210721VL 40 GgZK{OS27kYc{uX22fOQNW PV2 6000146128 1 ST 580 04440210721VL 40 GgZK{OS27kYc{uX3IyhQNW PV1 6000129044 1 ST 580 04440210721VL 40 GgZK{OS27kYc{uX22fOQNW PV2 6099504147 1 ST 580 04440210721VL 40 GgZK{OS27kYc{uX3IyhQNW PV1 6000100412 1 ST 580 04440210721VL 40 GgZK{OS27kYc{uX3IyhQNW PV1 GgZK{OS27kYc{uX22fOQNW 60 ST 580 04440210721VL 40 GgZK{OS27kYc{uX22fOQNW PV2 04440210721VL 20 ST — Michaela, Apr 25 '23 at 09:11
please [edit] the question with this data, also make sure that you can reproduce the issue from thee data that you post here (if you can't we also won't be able to) — mozway, Apr 25 '23 at 09:14
This is my current code: import pandas as pd df_cislodilu = pd.read_csv("data.csv", encoding='utf-8') nasobky = {} for i in pd.unique(df_cislodilu['MATNR_FERT']): df_subset = df_cislodilu[df_cislodilu['MATNR_FERT'] == i] multiplicator = df_subset[~df_subset['SUBPACKNR'].isna()] level = df_subset[df_subset['BALICI_INSTRUKCE'] == 'PV2'] # or 'PV%'+1 final = level['TRGQTY'].prod() nasobky[i] = df_subset['TRGQTY'] * final nasobky = pd.DataFrame(nasobky) print(nasobky) — Michaela, Apr 25 '23 at 09:14
You should really read [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) — mozway, Apr 25 '23 at 09:15

mozway · Answer 1 · 2023-04-25T11:21:03.600

IIUC, you want:

# ensure having numeric data
df['TRGQTY'] = df['TRGQTY'].astype(int)

# is BALICI_INSTRUKCE PV1?
m1 = df['BALICI_INSTRUKCE'].eq('PV1')
# is BALICI_INSTRUKCE PV2?
m2 = df['BALICI_INSTRUKCE'].eq('PV2')
# is SUBPACKNR not an empty string?
m3 = df['SUBPACKNR'].ne('')

# if m1 and m3 are True, get the first value
ref = df.loc[m1&m3, 'TRGQTY'].iloc[0] # 60

# multiply the PV2 by the reference
df.loc[m2, 'TRGQTY'] *= ref

Output:

  DISPO     MATNR_FERT STATUS_FERT                  PACKNR BALICI_INSTRUKCE ČÍSLO_MATERIÁLU               SUBPACKNR  TRGQTY BASEUNIT
0   580  04440210721VL          40  GgZK{OS27kYc{uX22fOQNW              PV2      6000146128                              60       ST
1   580  04440210721VL          40  GgZK{OS27kYc{uX22fOQNW              PV1      6000129044                               1       ST
2   580  04440210721VL          40  GgZK{OS27kYc{uX22fOQNW              PV2      6099504147                              60       ST
3   580  04440210721VL          40  GgZK{OS27kYc{uX22fOQNW              PV1      6000100412                               1       ST
4   580  04440210721VL          40  GgZK{OS27kYc{uX22fOQNW              PV1                  GgZK{OS27kYc{uX22fOQNW      60       ST
5   580  04440210721VL          40  GgZK{OS27kYc{uX22fOQNW              PV2   04440210721VL                            1200       ST

@Michaela it is possible to simplify the code (if you have only PV1/PV2, then only one mask of m1/m2 is needed), but I preferred to keep it explicit — mozway, Apr 25 '23 at 11:19
Dear @mozway I am trying to perform the same multiplication with more 'MATNR_FERT' values (please see source code and python code edited in question above). I want to sort values with same 'MATNR_FERT' and perform multiplication only on subsets with same 'MATNR_FERT' in whole df. After sorting the values the multiplication is not working unfortunately. — Michaela, May 03 '23 at 06:12

Python Pandas mlutiplication witihin one column base on a condition in other column

1 Answers1