1

Hi I have 2 dataframes where i have to use 1 dataframe to replace the value in other. I can normally create the dictionary to replace values in whole dataframe but I have a bit different value in other dataframe so i need condition where i can tell if the part of the string is matched then it should map the dictionary. The first dataframe is like this:

enter image description here

The second dataframe is like this:

        id        cars1                     cars2

        1     $ {hQOpelText.r1.val}        BMW
        2     $ {hQOpelText.r2.val}        $ {hQOpelText.r2.val}
        3     $ {hQOpelText.r3.val}        $ {hQOpelText.r5.val}
        4     $ {hQOpelText.r4.val}        Audi
        5     $ {hQOpelText.r5.val}        Audi

And i want resulted df like this:

        id        cars1                     cars2

        1     Opel Adam                   BMW
        2     Opel Astra Estate           Opel Astra Estate
        3     Opel Astra Hatchback        Opel Grandland x 
        4     Opel Astra Saloon           Audi
        5     Opel Grandland x            Audi
s_khan92
  • 969
  • 8
  • 21
  • Does this answer your question? [Remap values in pandas column with a dict](https://stackoverflow.com/questions/20250771/remap-values-in-pandas-column-with-a-dict) – sushanth Jun 08 '20 at 08:38
  • No because i can't map directly the values as i have different values. If i create the dictionary then it will have different values as compared to second data-frame. – s_khan92 Jun 08 '20 at 08:39
  • You can transform ``$ {hQOpelText.r1.val}`` to ``hQOpelTextr1`` if they are going to be consistent, then convert to dict. – sushanth Jun 08 '20 at 08:42

2 Answers2

1

Idea is replace . to empty strings, then extract values by keys of dictionary, map and replace original if no match:

c = df.select_dtypes(object).columns
func = lambda x: (x.str.replace('.', '', regex=False)
                  .str.extract(f'({"|".join(d.keys())})', expand=False)
                  .map(d)
                  .fillna(x))
df[c] = df[c].apply(func)

print (df)
   id                 cars1              cars2
0   1             Opel Adam                BMW
1   2     Opel Astra Estate  Opel Astra Estate
2   3  Opel Astra Hatchback    Opel Grandand X
3   4     Opel Astra Saloon               Audi
4   5       Opel Grandand X               Audi
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • Thanks for the answer as always :) Can you differentiate between dataframes as you are using everywhere `df` so i am not sure where you are using what? – s_khan92 Jun 08 '20 at 09:04
  • @s_khan92 - not sure if understand, but seems only way is necessary use `df1, df2, df3` instead `df` only if processing different `DataFrame`s and need distingush them. – jezrael Jun 08 '20 at 09:06
1

We can first change all the column values of type $ {hQOpelText.r*.val} in df2 to adhere to the convention of the values used in Variable column in df1 i.e hQOpelTextr*, then we can replace those values from the corresponding values from df1:

cols = df2.select_dtypes(object).columns
df2[cols] = df2[cols].transform(
    lambda s: (
        s.str.replace(r'\$\s*\{([^\.]+).*?([^\.]+).*?\}', r'\g<1>\g<2>')
        .replace(df1.set_index('Variable')['AUS'])
    )
)

# print(df2)
   id                 cars1              cars2
0   1              OpehAdam                BMW
1   2     Opel Astra Estate  Opel Astra Estate
2   3  Opel Astra Hatchback   Opel Grandland X
3   4     Opel Astra Saloon               Audi
4   5      Opel Grandland X               Audi
Shubham Sharma
  • 68,127
  • 6
  • 24
  • 53