multiply 2 columns in 2 dfs if they match the column name

Question

I have 2 dfs with some similar colnames.

I tried this, it worked only when I have nonrepetitive colnames in national df.

out = {}

for col in national.columns:
    for col2 in F.columns: 
        if col == col2:
            out[col] = national[col].values * F[col2].values

I tried to use the same code on df where it has several names, but I got the following error 'shapes (26,33) and (1,26) not aligned: 33 (dim 1) != 1 (dim 0)'. Because in the second df it has 33 columns with the same name, and that needs to be multiplied elementwise with one column for the first df.

This code does not work, as there are repeated same colnames in urban.columns.

[np.matrix(urban[col].values) * np.matrix(F[col2].values) for col in urban.columns for col2 in F.columns if col == col2]

Reproducivle code

df1 = pd.DataFrame({ 
    'Col1': [1, 2, 1, 2, 3], 
    'Col2': [2, 4, 2, 4, 6],
    'Col2': [7, 4, 2, 8, 6]})

df2 = pd.DataFrame({
    'Col1': [1.5, 2.0, 3.0, 5.0, 10.0],
    'Col2': [1, 0.0, 4.0, 5.0, 7.0})

if the dataframes have different shapes, you cannot multiply across dataframes like this . Best to use `.merge` to bring the columns in from the other dataframe and multiply them together. — David Erickson, Aug 03 '21 at 21:40
Boths dfs have the same length 1 col 26 rows. The problem is in F df there is only unique cols (i.e, countries currency conversion), wile in the urban.df there is several same name cols 33 but with 26 rows (i.g local currency for each city on a given country). — Mohamed Hachaichi, Aug 03 '21 at 21:49

score 1 · Answer 1 · answered Aug 03 '21 at 21:53

1

You can't multiply two DataFrame if they have different shapes but if you want to multiply it anyway then use transpose:

out = {}

for col in national.columns:
    for col2 in F.columns: 
        if col == col2:
            out[col] = national[col].values * F[col2].T.values

answered Aug 03 '21 at 21:53

Sudhanshu

704
1
9
24

I need to multiplyelement wise F["CIV"] by Urban["CIV"]. Knowing that urban df have 33 columns named "CIV". And I need to produce 33 columns named "CIV". And each columa should be multiplied elementwise with the F["CIV"] – Mohamed Hachaichi Aug 03 '21 at 22:01

score 1 · Accepted Answer · answered Aug 03 '21 at 22:04

1

Hopefully the below working example helps. Please provided a minimum reproducible example in your question with input code and desired output like I have provided. Please see how to ask a good pandas question:

df1 = pd.DataFrame({ 
    'Product': ['AA', 'AA', 'BB', 'BB', 'BB'],
    'Col1': [1, 2, 1, 2, 3], 
    'Col2': [2, 4, 2, 4, 6]})
print(df1)
df2 = pd.DataFrame({
    'FX Rate': [1.5, 2.0, 3.0, 5.0, 10.0]})
print(df2)

df1 = df1.reset_index(drop=True)
df2 = df2.reset_index(drop=True)

for col in ['Col1', 'Col2']:
    df1[col] = df1[col] * df2['FX Rate']
df1
(df1)
  Product  Col1  Col2
0      AA     1     2
1      AA     2     4
2      BB     1     2
3      BB     2     4
4      BB     3     6

(df2)
   FX Rate
0      1.5
1      2.0
2      3.0
3      5.0
4     10.0

Out[1]: 
  Product  Col1  Col2
0      AA   1.5   3.0
1      AA   4.0   8.0
2      BB   3.0   6.0
3      BB  10.0  20.0
4      BB  30.0  60.0

answered Aug 03 '21 at 22:04

David Erickson

16,433
2
19
35

I added a reproducible code ising your base code. So basicly, I want the same df.shape as df1, but each colomn is multiplied by df2 respective column. – Mohamed Hachaichi Aug 03 '21 at 22:28
@Mohamed-IslemHachaichi just change `df1[col] * df2['FX Rate']` to `df1[col] * df2[col]`. Also make sure to manually or automatically define the columns you want to loop through in this line of code: `for col in ['Col1', 'Col2']:` OR `for col in df.columns[df.columns.str.contains('Col')]` which takes all columns that contain the word `Col`. – David Erickson Aug 03 '21 at 23:04
F.columns = ['ALB', 'DZA', 'BGD', 'BWA', 'BGR', 'CHN', 'CRI', 'CIV', 'GEO', 'MYS', 'MDV', 'MEX', 'MNE', 'MAR', 'NAM', 'ASM', 'ZAF', 'LKA', 'TUN', 'UKR', 'VNM', 'ZMB', 'ZWE', 'EGY'] urban.columns = ['CIV', 'MDV', 'CIV', 'MEX', 'DZA', 'LKA', 'CHN', 'LKA', 'MDV', 'LKA', ... 'MEX', 'ALB', 'CIV', 'CHN', 'CIV', 'NRU', 'MEX', 'CHN', 'MEX', 'NAM'] – Mohamed Hachaichi Aug 03 '21 at 23:27
see than CIV (Cote d'Ivoire) is repeated 33 times, CHN (China) is repeated ~40 times (these repetition represents cities within the host country). I can't choose .str.conaitns as all the columns that are in F or (df1) are present in urban or (df2) but some are one columns (for instance F['DZA'], and urban['DZA']), the others are several columns F['CHN'], but urban['CHN'] * 40 – Mohamed Hachaichi Aug 03 '21 at 23:27

SeaBean · Answer 3 · 2021-08-03T22:55:38.173

1

You can get the common columns of the 2 dataframes, then multiply the 2 dataframe by simple multiplication. Then, join back the only column(s) in df1 to the multiplication result, as follows:

common_cols = df1.columns.intersection(df2.columns)
df1_only_cols = df1.columns.difference(common_cols)

df1_out = df1[df1_only_cols].join(df1[common_cols] * df2[common_cols])
df1 = df1_out.reindex_like(df1)

Demo

df1 = pd.DataFrame({ 
    'Product': ['AA', 'AA', 'BB', 'BB', 'BB'],
    'Col1': [1, 2, 1, 2, 3], 
    'Col2': [2, 4, 2, 4, 6],
    'Col3': [7, 4, 2, 8, 6]})

df2 = pd.DataFrame({
    'Col1': [1.5, 2.0, 3.0, 5.0, 10.0],
    'Col2': [1, 0.0, 4.0, 5.0, 7.0})

common_cols = df1.columns.intersection(df2.columns)
df1_only_cols = df1.columns.difference(common_cols)
df1_out = df1[df1_only_cols].join(df1[common_cols] * df2[common_cols])
df1 = df1_out.reindex_like(df1)

print(df1)

  Product  Col1  Col2  Col3
0      AA   1.5   2.0     7
1      AA   4.0   0.0     4
2      BB   3.0   8.0     2
3      BB  10.0  20.0     8
4      BB  30.0  42.0     6

edited Aug 03 '21 at 22:55

answered Aug 03 '21 at 22:47

SeaBean

22,547
3
13
25

there is not 'Col3'. there are 2 cols named 'Col2', and 'Col2' – Mohamed Hachaichi Aug 03 '21 at 22:59
@Mohamed-IslemHachaichi It's not a good practice to have 2 columns with the same column labels. Running the sample code to define `df1` above also will produce only one column of `col2`. – SeaBean Aug 03 '21 at 23:02
But the col names are Counties iso3. It is the only way I can keep track of the vectors :( is there any alternative of using suffix maybe? – Mohamed Hachaichi Aug 03 '21 at 23:05
@Mohamed-IslemHachaichi If the columns have the same name, how do you distinguish which column refer to which particular set of information ? Right, one feasible way is to use suffix. – SeaBean Aug 03 '21 at 23:10
Just for clarity, each column represents a city. each column names represent the host-country iso3. because cities within the same country have the same currency, i need to multiply how much F['for a given country'] can affect each city nrow (0, 26) are actually final consumption categories. Going back to the issue, how do you suggest to use suffix please? – Mohamed Hachaichi Aug 03 '21 at 23:15

score 0 · Answer 4 · answered Aug 04 '21 at 00:04

0

A friend of mine sent this solution wich works just as i wanted.

out = urban.copy() 

for col in urban.columns:
    for col2 in F.columns:
        if col == col2:
            out.loc[:,col] = urban.loc[:,[col]].values * F.loc[:,[col2]].values

answered Aug 04 '21 at 00:04

Mohamed Hachaichi

61
6

multiply 2 columns in 2 dfs if they match the column name

4 Answers4