Pandas: change between mean/std and plus/minus notations

Question

Let's assume I have a pandas DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randint(0, 100, size=(10, 4)), columns=('A', 'DA', 'B', 'DB'))

which outputs:

    A  DA   B  DB
0  62  87  10  39
1  56   7  81  12
2  37  26  21  44
3  56  26  42  32
4  29  45  11   9
5  11  85   4  79
6  87  31  61  90
7   5  55  26  47
8  55  94  20  84
9  52  26  72  19

I would like to convert that to this:

       A      B
0  62±87  10±39
1   56±7  81±12
2  37±26  21±44
3  56±26  42±32
4  29±45   11±9
5  11±85   4±79
6  87±31  61±90
7   5±55  26±47
8  55±94  20±84
9  52±26  72±19

and vice versa.

I could do this "by hand", but I was hoping for an elegant way using pandas built-ins, one that could eventually be converted just as elegantly to LaTeX (i.e. 62±87 becomes $62 \pm 87$).

I was looking into .apply() from "Converting a column within pandas dataframe from int to string", but it is not clear to me how to use it for this purpose.

EDIT

The suggested answers do NOT seem to cover the reverse direction, i.e. converting from the A±DA notation back to two columns A and DA.

Possible duplicate of [Combine two columns of text in dataframe in pandas/python](https://stackoverflow.com/questions/19377969/combine-two-columns-of-text-in-dataframe-in-pandas-python) — Anton vBR, Oct 05 '17 at 11:53
Thanks for pointing to that, but it does not seem to cover going from the `±` notation back to two columns. — norok2, Oct 05 '17 at 11:58

Zero · Accepted Answer · 2017-10-05T12:12:44.517

Here's one way

In [1336]: (df.groupby(df.columns.str[-1], axis=1)
              .apply(lambda x: x.astype(str).apply('±'.join, 1)))
Out[1336]:
       A      B
0  62±87  10±39
1   56±7  81±12
2  37±26  21±44
3  56±26  42±32
4  29±45   11±9
5  11±85   4±79
6  87±31  61±90
7   5±55  26±47
8  55±94  20±84
9  52±26  72±19

Another way

In [1351]: pd.DataFrame({c: df.filter(like=c).astype(str).apply('±'.join, 1) 
                         for c in df.columns.str[-1].unique()})
Out[1351]:
       A      B
0  62±87  10±39
1   56±7  81±12
2  37±26  21±44
3  56±26  42±32
4  29±45   11±9
5  11±85   4±79
6  87±31  61±90
7   5±55  26±47
8  55±94  20±84
9  52±26  72±19

Or, also as

In [1386]: pd.DataFrame({c: ['±'.join(v) for v in df.filter(like=c).astype(str).values]
      ...:               for c in df.columns.str[-1].unique()})

And the opposite direction, assuming dff is your string-joined DataFrame

In [1357]: pd.concat([dff[c].str.split('±', expand=True).rename(columns={0:c, 1:'D'+c})
                      for c in dff.columns], axis=1)
Out[1357]:
    A  DA   B  DB
0  62  87  10  39
1  56   7  81  12
2  37  26  21  44
3  56  26  42  32
4  29  45  11   9
5  11  85   4  79
6  87  31  61  90
7   5  55  26  47
8  55  94  20  84
9  52  26  72  19
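
Note that `str.split` returns the pieces as strings, so the columns of the result above hold text rather than numbers. A minimal sketch of the same round trip with a cast back to integers follows; the two-row `dff` and the name `df_back` are made up for illustration, and the cast assumes every cell contains exactly one `±` with integers on both sides.

```python
import pandas as pd

# Hypothetical string-joined frame with the same layout as dff above
dff = pd.DataFrame({'A': ['62±87', '56±7'], 'B': ['10±39', '81±12']})

parts = []
for c in dff.columns:
    # expand=True yields one column per piece, labelled 0 and 1
    split = dff[c].str.split('±', expand=True)
    split.columns = [c, 'D' + c]
    # The pieces are strings; cast them back to integers
    parts.append(split.astype(int))

df_back = pd.concat(parts, axis=1)
print(df_back)
```

For non-integer values, `astype(float)` would be the analogous cast.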

Details

In [1358]: df
Out[1358]:
    A  DA   B  DB
0  62  87  10  39
1  56   7  81  12
2  37  26  21  44
3  56  26  42  32
4  29  45  11   9
5  11  85   4  79
6  87  31  61  90
7   5  55  26  47
8  55  94  20  84
9  52  26  72  19

In [1359]: dff
Out[1359]:
       A      B
0  62±87  10±39
1   56±7  81±12
2  37±26  21±44
3  56±26  42±32
4  29±45   11±9
5  11±85   4±79
6  87±31  61±90
7   5±55  26±47
8  55±94  20±84
9  52±26  72±19

Helpers

In [1377]: df.columns.str[-1]
Out[1377]: Index([u'A', u'A', u'B', u'B'], dtype='object')

In [1378]: df.columns.str[-1].unique()
Out[1378]: Index([u'A', u'B'], dtype='object')
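
In these helpers, `df.columns.str[-1]` takes the last character of each column name, so `A` and `DA` both map to the key `A`. That only pairs columns correctly when the value columns have single-character names. The following is a rough sketch of pairing by prefix instead, assuming each uncertainty column is named `'D'` plus the name of its value column; the column names `mass` and `len` and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical frame with multi-character column names
df = pd.DataFrame({'mass': [62, 56], 'Dmass': [87, 7],
                   'len': [10, 81], 'Dlen': [39, 12]})

prefix = 'D'  # assumed marker for the uncertainty columns
# A value column is one that has a matching prefixed partner
value_cols = [c for c in df.columns if prefix + c in df.columns]

combined = pd.DataFrame(
    {c: df[c].astype(str) + '±' + df[prefix + c].astype(str)
     for c in value_cols}
)
print(combined)
```

Changing `prefix` to `'Δ'` would handle columns named with that marker in the same way.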

This solves it neatly. If I understood correctly `df.columns.str[-1].unique()` picks up all the columns ending with the different characters. It would be nice to explain that. Eventually "grouping" columns that do not start with `D` or even `Δ` is probably more useful. — norok2, Oct 05 '17 at 12:11

Anton vBR · Answer 2 · 2017-10-05T12:48:48.940

I found a bunch of approaches here, so this is possibly a duplicate: Combine two columns of text in dataframe in pandas/python

This one convinced me the most:

import io
import pandas as pd

string = """A,DA,B,DB
62,87,10,39"""

df = pd.read_csv(io.StringIO(string),sep=",")

cols = [i for i in df.columns if len(i) == 1]

for i in cols:
    df[i] = df[i].astype(str)+ "±" + df["D"+i].astype(str)

df[cols]
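
The same column-wise concatenation can be adapted to the LaTeX form mentioned in the question. This is only a sketch under the assumption that every value column has a partner named `'D'` plus its own name; the two-row data and the name `latex_df` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [62, 56], 'DA': [87, 7],
                   'B': [10, 81], 'DB': [39, 12]})

# Value columns are those with a 'D'-prefixed partner (assumed convention)
cols = [c for c in df.columns if 'D' + c in df.columns]

# Build strings such as '$62 \pm 87$' for each pair of columns
latex_df = pd.DataFrame(
    {c: '$' + df[c].astype(str) + r' \pm ' + df['D' + c].astype(str) + '$'
     for c in cols}
)
print(latex_df)
```

When exporting such a frame with `DataFrame.to_latex`, passing `escape=False` should keep the dollar signs and backslashes from being escaped.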

edited Oct 05 '17 at 12:48

answered Oct 05 '17 at 11:53


This does not extend easily to many columns that need to be combined, though. – norok2 Oct 05 '17 at 12:12
@norok2 that wasn't the original question, but yes.. you had a point, however with some small modifications I made it work. Keep it simple I'd say :). The solution above works too ofc. – Anton vBR Oct 05 '17 at 12:49
