7

I have a DataFrame like this:

id  val1   val2
0    A      B
1    B      B
2    A      A
3    A      A

I would like to swap the values A and B in val1 and val2, so that the result is:

id  val1   val2
0    B      A
1    A      A
2    B      B
3    B      B

The DataFrame may also have other columns, which should stay unchanged.

EGM8686

5 Answers

4

Try stacking, mapping, and then unstacking:

df[['val1', 'val2']] = (
    df[['val1', 'val2']].stack().map({'B': 'A', 'A': 'B'}).unstack())

df
   id val1 val2
0   0    B    A
1   1    A    A
2   2    B    B
3   3    B    B

For a (much) faster solution, use a nested list comprehension.

mapping = {'B': 'A', 'A': 'B'}
df[['val1', 'val2']] = [
    [mapping.get(x, x) for x in row] for row in df[['val1', 'val2']].values]

df
   id val1 val2
0   0    B    A
1   1    A    A
2   2    B    B
3   3    B    B
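
One difference between the two snippets above is worth spelling out: Series.map with a dict turns any value that is not a key into NaN, while mapping.get(x, x) falls back to the original value. A minimal sketch, assuming a hypothetical extra value 'C' in val1 that should be left alone:

```python
import pandas as pd

# Sample frame with one value ('C') that is not part of the swap (assumed for illustration)
df = pd.DataFrame({
    "id": range(4),
    "val1": list("ABAC"),
    "val2": list("BBAA"),
})
mapping = {"A": "B", "B": "A"}
cols = ["val1", "val2"]

# stack/map/unstack: values missing from the dict come back as NaN
mapped = df[cols].stack().map(mapping).unstack()

# list comprehension: dict.get(x, x) keeps values missing from the dict
out = df.copy()
out[cols] = [[mapping.get(x, x) for x in row] for row in df[cols].to_numpy()]
```

If every value in the target columns is a key of the mapping, both versions should give the same result.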
cs95
4

You can use pd.DataFrame.applymap with a dictionary (since pandas 2.1, applymap is deprecated in favor of DataFrame.map, which is called the same way):

d = {'B': 'A', 'A': 'B'}

df = df.applymap(d.get).fillna(df)

print(df)

  id val1 val2
0  0    B    A
1  1    A    A
2  2    B    B
3  3    B    B

For performance, in particular memory usage, you may wish to use categorical data:

for col in df.columns[1:]:
    df[col] = df[col].astype('category')
    df[col] = df[col].cat.rename_categories(d)
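
A self-contained sketch of the categorical approach, assuming the sample frame from the question. The point it tries to show is that rename_categories only relabels the categories and leaves the per-row integer codes as they are:

```python
import pandas as pd

df = pd.DataFrame({"id": range(4), "val1": list("ABAA"), "val2": list("BBAA")})
d = {"A": "B", "B": "A"}

codes_before = {}
codes_after = {}
for col in ["val1", "val2"]:
    cat = df[col].astype("category")
    codes_before[col] = cat.cat.codes.tolist()
    # Relabel the categories; keys of d that are not categories are ignored
    df[col] = cat.cat.rename_categories(d)
    codes_after[col] = df[col].cat.codes.tolist()
```

Because only the category labels change, the cost of the swap depends on the number of distinct values rather than on the number of rows.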
jpp
4

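Use factorize and roll the corresponding values:

def swaparoo(col):
    # i: integer code per value, r: unique values in order of first appearance
    i, r = col.factorize()
    # shift every code by one, wrapping around, to pick the "next" unique value
    return pd.Series(r[(i + 1) % len(r)], col.index)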

df[['id']].join(df[['val1', 'val2']].apply(swaparoo))

   id val1 val2
0   0    B    A
1   1    A    A
2   2    B    B
3   3    B    B
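
To make the rolling step concrete, here is a small sketch of what factorize returns for a single column and how shifting the codes swaps the values. The variable names are illustrative only. It also shows a limitation of applying the function per column: a column with a single distinct value maps back to itself.

```python
import pandas as pd

col = pd.Series(list("ABAA"))

# codes labels each value by order of first appearance; uniques holds the distinct values
codes, uniques = col.factorize()

# Shift each code by one position (wrapping around) and look up the new value
swapped = pd.Series(uniques[(codes + 1) % len(uniques)], index=col.index)

# With one distinct value, (code + 1) % 1 is always 0, so nothing changes
constant = pd.Series(list("AAAA"))
c_codes, c_uniques = constant.factorize()
unchanged = pd.Series(c_uniques[(c_codes + 1) % len(c_uniques)], index=constant.index)
```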

Alternative gymnastics using the same function. This incorporates the whole dataframe into the factorization.

df.set_index('id').stack().pipe(swaparoo).unstack().reset_index()

Examples

df = pd.DataFrame(dict(id=range(4), val1=[*'ABAA'], val2=[*'BBAA']))

print(
    df,
    df.set_index('id').stack().pipe(swaparoo).unstack().reset_index(),
    sep='\n\n'
)

   id val1 val2
0   0    A    B
1   1    B    B
2   2    A    A
3   3    A    A

   id val1 val2
0   0    B    A
1   1    A    A
2   2    B    B
3   3    B    B

df = pd.DataFrame(dict(id=range(4), val1=[*'AAAA'], val2=[*'BBBB']))

print(
    df,
    df.set_index('id').stack().pipe(swaparoo).unstack().reset_index(),
    sep='\n\n'
)

   id val1 val2
0   0    A    B
1   1    A    B
2   2    A    B
3   3    A    B

   id val1 val2
0   0    B    A
1   1    B    A
2   2    B    A
3   3    B    A

df = pd.DataFrame(dict(id=range(4), val1=[*'AAAA'], val2=[*'BBBB'], val3=[*'CCCC']))

print(
    df,
    df.set_index('id').stack().pipe(swaparoo).unstack().reset_index(),
    sep='\n\n'
)

   id val1 val2 val3
0   0    A    B    C
1   1    A    B    C
2   2    A    B    C
3   3    A    B    C

   id val1 val2 val3
0   0    B    C    A
1   1    B    C    A
2   2    B    C    A
3   3    B    C    A

df = pd.DataFrame(dict(id=range(4), val1=[*'ABCD'], val2=[*'BCDA'], val3=[*'CDAB']))

print(
    df,
    df.set_index('id').stack().pipe(swaparoo).unstack().reset_index(),
    sep='\n\n'
)

   id val1 val2 val3
0   0    A    B    C
1   1    B    C    D
2   2    C    D    A
3   3    D    A    B

   id val1 val2 val3
0   0    B    C    D
1   1    C    D    A
2   2    D    A    B
3   3    A    B    C
piRSquared
4

You can swap two values efficiently using numpy.where. However, if there are more than two values, this method stops working.

import numpy as np

a = df[['val1', 'val2']].values
df[['val1', 'val2']] = np.where(a == 'A', 'B', 'A')

   id val1 val2
0   0    B    A
1   1    A    A
2   2    B    B
3   3    B    B

To adapt this so that other values stay unchanged, you can use np.select and assign the result back:

c1 = a == 'A'
c2 = a == 'B'
df[['val1', 'val2']] = np.select([c1, c2], ['B', 'A'], a)
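
A runnable sketch comparing the two calls, assuming a hypothetical third value 'C' in val1. With np.where every value that is not 'A' becomes 'A', whereas np.select falls back to the original array wherever no condition matches:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id": range(4), "val1": list("ABAC"), "val2": list("BBAA")})
cols = ["val1", "val2"]
a = df[cols].to_numpy(dtype=object)

# np.where: anything that is not 'A' (including 'C') is turned into 'A'
naive = np.where(a == "A", "B", "A")

# np.select: conditions are checked in order; unmatched cells take the default
result = np.select([a == "A", a == "B"], ["B", "A"], default=a)
df[cols] = result
```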
user3483203
4

Using replace (for why an intermediate value C is needed here, check this):

df[['val1','val2']].replace({'A':'C','B':'A','C':'B'})
Out[263]: 
  val1 val2
0    B    A
1    A    A
2    B    B
3    B    B
BENY
  • 3
    This is a funny way to go about it :). I once [looked into](https://stackoverflow.com/a/49259581/9209546) that awful `replace` function... so needlessly complex the mind boggles. The [docs](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.replace.html) have this "gem": `You are encouraged to experiment and play with this method to gain intuition about how it works.` – jpp Oct 24 '18 at 16:49
  • @jpp yep, I am still waiting for them to fix the bug /;-( – BENY Oct 24 '18 at 16:50