Python - Swap values in multiple dataframes

Question

I have a DataFrame like this

id  val1   val2
0    A      B
1    B      B
2    A      A
3    A      A

And I would like to swap the values like this:

id  val1   val2
0    B      A
1    A      A
2    B      B
3    B      B

I need to consider that the df could have other columns that I would like to keep unchanged.

cs95 · Answer 1 · 2018-10-24T16:26:25.203

Try stacking, mapping, and then unstacking:

df[['val1', 'val2']] = (
    df[['val1', 'val2']].stack().map({'B': 'A', 'A': 'B'}).unstack())

df
   id val1 val2
0   0    B    A
1   1    A    A
2   2    B    B
3   3    B    B

For a (much) faster solution, use a nested list comprehension.

mapping = {'B': 'A', 'A': 'B'}
df[['val1', 'val2']] = [
    [mapping.get(x, x) for x in row] for row in df[['val1', 'val2']].values]

df
   id val1 val2
0   0    B    A
1   1    A    A
2   2    B    B
3   3    B    B
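One difference between the two snippets only shows up when the columns hold values outside the mapping. The sketch below assumes a hypothetical frame with an extra value 'C' (not part of the question): Series.map with a dict turns unmapped values into NaN, while mapping.get(x, x) passes them through unchanged.

```python
import pandas as pd

# Hypothetical data: 'C' is not in the mapping and should stay untouched.
df = pd.DataFrame({"id": range(3), "val1": list("ABC"), "val2": list("BCA")})
mapping = {"B": "A", "A": "B"}
cols = ["val1", "val2"]

# Series.map with a dict yields NaN for keys missing from the dict.
stacked = df[cols].stack().map(mapping).unstack()

# dict.get(x, x) falls back to the original value instead.
looped = pd.DataFrame(
    [[mapping.get(x, x) for x in row] for row in df[cols].to_numpy()],
    index=df.index,
    columns=cols,
)

print(stacked)
print(looped)
```

If unmapped values may occur, the list comprehension (or a fillna with the original columns after the map) is the safer of the two.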

score 4 · Answer 2 · answered Oct 24 '18 at 16:23

You can use pd.DataFrame.applymap with a dictionary:

d = {'B': 'A', 'A': 'B'}

df = df.applymap(d.get).fillna(df)

print(df)

  id val1 val2
0  0    B    A
1  1    A    A
2  2    B    B
3  3    B    B

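For performance, in particular memory usage, you may wish to use categorical data instead. Run this on the original df rather than after the applymap call above, otherwise the values are swapped back: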

for col in df.columns[1:]:
    df[col] = df[col].astype('category')
    df[col] = df[col].cat.rename_categories(d)
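A self-contained sketch of the categorical route, starting from the question's data. The column list cols is an assumption introduced here so that only the value columns are converted; renaming categories relabels the categories themselves rather than touching every row.

```python
import pandas as pd

df = pd.DataFrame({"id": range(4), "val1": list("ABAA"), "val2": list("BBAA")})
d = {"B": "A", "A": "B"}
cols = ["val1", "val2"]  # assumed: the columns whose values should be swapped

for col in cols:
    as_cat = df[col].astype("category")
    # Categories missing from the dict are passed through unchanged.
    df[col] = as_cat.cat.rename_categories(d)

print(df)
```

Note that the renamed categories must stay unique, which holds for a pure swap like this one.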

piRSquared · Answer 3 · 2018-10-24T16:50:52.213

Use factorize and roll the corresponding values

def swaparoo(col):
  i, r = col.factorize()
  return pd.Series(r[(i + 1) % len(r)], col.index)

df[['id']].join(df[['val1', 'val2']].apply(swaparoo))

   id val1 val2
0   0    B    A
1   1    A    A
2   2    B    B
3   3    B    B

Alternative gymnastics using the same function. This incorporates the whole dataframe into the factorization.

df.set_index('id').stack().pipe(swaparoo).unstack().reset_index()
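Why factorizing the whole frame can matter: a rough sketch, using a lightly rewritten swaparoo (np.asarray is added here only to make the indexing explicit). When a column contains a single distinct value, the per-column version has nothing to roll to, whereas the stacked version sees both values.

```python
import numpy as np
import pandas as pd

def swaparoo(col):
    # codes: integer label per element; uniques: distinct values in order of appearance
    codes, uniques = col.factorize()
    # Shift every code by one (wrapping around) and look the new value up.
    rolled = np.asarray(uniques)[(codes + 1) % len(uniques)]
    return pd.Series(rolled, index=col.index)

df = pd.DataFrame({"id": range(4), "val1": list("AAAA"), "val2": list("BBBB")})

# Per column: each column has one unique value, so nothing changes.
per_column = df[["val1", "val2"]].apply(swaparoo)

# Whole frame: 'A' and 'B' are factorized together, so they swap.
whole = df.set_index("id").stack().pipe(swaparoo).unstack().reset_index()

print(per_column)
print(whole)
```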

Examples

df = pd.DataFrame(dict(id=range(4), val1=[*'ABAA'], val2=[*'BBAA']))

print(
    df,
    df.set_index('id').stack().pipe(swaparoo).unstack().reset_index(),
    sep='\n\n'
)

   id val1 val2
0   0    A    B
1   1    B    B
2   2    A    A
3   3    A    A

   id val1 val2
0   0    B    A
1   1    A    A
2   2    B    B
3   3    B    B

df = pd.DataFrame(dict(id=range(4), val1=[*'AAAA'], val2=[*'BBBB']))

print(
    df,
    df.set_index('id').stack().pipe(swaparoo).unstack().reset_index(),
    sep='\n\n'
)

   id val1 val2
0   0    A    B
1   1    A    B
2   2    A    B
3   3    A    B

   id val1 val2
0   0    B    A
1   1    B    A
2   2    B    A
3   3    B    A

df = pd.DataFrame(dict(id=range(4), val1=[*'AAAA'], val2=[*'BBBB'], val3=[*'CCCC']))

print(
    df,
    df.set_index('id').stack().pipe(swaparoo).unstack().reset_index(),
    sep='\n\n'
)

   id val1 val2 val3
0   0    A    B    C
1   1    A    B    C
2   2    A    B    C
3   3    A    B    C

   id val1 val2 val3
0   0    B    C    A
1   1    B    C    A
2   2    B    C    A
3   3    B    C    A

df = pd.DataFrame(dict(id=range(4), val1=[*'ABCD'], val2=[*'BCDA'], val3=[*'CDAB']))

print(
    df,
    df.set_index('id').stack().pipe(swaparoo).unstack().reset_index(),
    sep='\n\n'
)

   id val1 val2 val3
0   0    A    B    C
1   1    B    C    D
2   2    C    D    A
3   3    D    A    B

   id val1 val2 val3
0   0    B    C    D
1   1    C    D    A
2   2    D    A    B
3   3    A    B    C

This reminds me of this question ;-) https://stackoverflow.com/questions/52506862/is-replace-row-wise-and-will-overwrite-the-value-within-the-dict-twice – BENY Oct 24 '18 at 16:34

user3483203 · Answer 4 · 2018-10-24T16:42:26.030

You can swap two values efficiently using numpy.where. However, if there are more than two values, this method stops working.

a = df[['val1', 'val2']].values
df[['val1', 'val2']] = np.where(a=='A', 'B', 'A')

   id val1 val2
0   0    B    A
1   1    A    A
2   2    B    B
3   3    B    B

To adapt this so that other values stay the same, you can use np.select:

c1 = a=='A'
c2 = a=='B'
np.select([c1, c2], ['B', 'A'], a)
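A runnable sketch comparing the two calls, assuming a hypothetical frame that also contains 'C' (the question's data has only 'A' and 'B'). np.where sends everything that is not 'A' to 'A', while np.select falls back to the original array for values that match neither condition.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a third value 'C' that should be left alone.
df = pd.DataFrame({"id": range(3), "val1": list("ABC"), "val2": list("BCA")})
cols = ["val1", "val2"]
a = df[cols].to_numpy(dtype=object)

# Two-way switch: anything that is not 'A' becomes 'A', including 'C'.
where_result = np.where(a == "A", "B", "A")

# Conditions are checked in order; unmatched cells take the value from `a`.
select_result = np.select([a == "A", a == "B"], ["B", "A"], default=a)

df[cols] = select_result
print(df)
```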

BENY · Answer 5 · score 4 · 2018-10-24T16:51:11.017

Using replace. Why do we need a C here? Check this:

df[['val1','val2']].replace({'A':'C','B':'A','C':'B'})
Out[263]: 
  val1 val2
0    B    A
1    A    A
2    B    B
3    B    B

answered Oct 24 '18 at 16:46, edited Oct 24 '18 at 16:51

This is a funny way to go about it :). I once [looked into](https://stackoverflow.com/a/49259581/9209546) that awful `replace` function... so needlessly complex the mind boggles. The [docs](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.replace.html) have this "gem": `You are encouraged to experiment and play with this method to gain intuition about how it works.` – jpp Oct 24 '18 at 16:49
@jpp yep, I am still waiting for them to fix the bug /;-( – BENY Oct 24 '18 at 16:50
