81

I have data frames with column names (coming from .csv files) containing ( and ) and I'd like to replace them with _.

How can I do that in place for all columns?

Cedric H.

3 Answers

134

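Use str.replace with a regex character class. Pass regex=True explicitly, because newer pandas versions treat the pattern as a literal string by default:

df.columns = df.columns.str.replace(r"[()]", "_", regex=True)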

Sample:

import pandas as pd

df = pd.DataFrame({'(A)':[1,2,3],
                   '(B)':[4,5,6],
                   'C)':[7,8,9]})

print (df)
   (A)  (B)  C)
0    1    4   7
1    2    5   8
2    3    6   9

df.columns = df.columns.str.replace(r"[()]", "_", regex=True)
print (df)
   _A_  _B_  C_
0    1    4   7
1    2    5   8
2    3    6   9
jezrael
  • `AttributeError: Can only use .str accessor with string values (i.e. inferred_type is 'string', 'unicode' or 'mixed')` – Seymour Apr 18 '18 at 12:46
  • 3
    @Seymour It means some or all of the columns are numeric, so you need `df.columns = df.columns.astype(str).str.replace("[()]", "_", regex=True)` – jezrael Apr 18 '18 at 13:02
  • you are right. Actually the columns are output of `PeriodIndex` and their type is `pandas.tseries.period.PeriodIndex`. Any idea of how to obtain the same column name but in a string? – Seymour Apr 18 '18 at 13:03
  • 1
    @Seymour - I think need check [this solutions](https://stackoverflow.com/q/34800343). – jezrael Apr 19 '18 at 05:02
  • 2
    Great answer. Thanks. Just curious about why the `[ ]` in the `"[()]"` part? I tried it and it doesn't work. Could you tell me what `[ ]` does in this case please? – Bowen Liu Oct 24 '18 at 20:00
  • 1
    @BowenLiu - It is a [regex](https://regexone.com/lesson/matching_characters) character class that matches only `(` or `)` – jezrael Oct 25 '18 at 05:18
  • 1
    Thanks a lot. I've seen people writing short and elegant regex that can perform complex tasks. I am trying to learn it. However, there are so many tutorials out there and I got confused. Is the link you gave a good source to read up on the topic? – Bowen Liu Oct 25 '18 at 12:48
  • I find it odd that you don't have to explicitly specify that you're supplying a regex. Seems like that could complicate things for beginners, though I guess there is a `regex` parameter that you can explicitly set to `False` if you're attempting simple pattern replacements. – Charlie G Sep 15 '20 at 14:04
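
The two points raised in the comments above (non-string column labels and whether the pattern is treated as a regex) can be sketched roughly as follows. The frame and its labels are made up for illustration, and `regex` is passed explicitly in both calls so the behaviour does not depend on the installed pandas version's default.

```python
import pandas as pd

# Hypothetical frame mixing a numeric label with string labels
df = pd.DataFrame([[1, 2, 3]], columns=[2018, "(A)", "B)"])

# Cast the labels to str first, then replace either parenthesis
# using an explicit regex character class
df.columns = df.columns.astype(str).str.replace(r"[()]", "_", regex=True)
print(list(df.columns))

# With regex=False the pattern is a literal substring, so the
# five-character text "[()]" is searched for and nothing is replaced
literal = pd.Index(["(A)"]).str.replace("[()]", "_", regex=False)
print(list(literal))
```

If the labels come from something like a `PeriodIndex`, the string conversion step may need more care than a plain `astype(str)`; the linked question in the comments covers that case.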
3

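Older pandas versions don't work with the accepted answer above. A list comprehension works instead. Note that Python's built-in str.replace does not interpret regular expressions, so re.sub is needed to match either parenthesis:

import re
df.columns = [re.sub(r"[()]", "_", c) for c in df.columns]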
JamesR
0

The square brackets define a character class: the pattern matches any single one of the characters listed inside them. For example:

r"[Nn]ational"

matches both "National" and "national", i.e. the first character can be either N or n.
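
To see the character class in action, here is a minimal sketch using only the standard `re` module. The sample sentence and the `(A)` label are made up for illustration.

```python
import re

# [Nn] matches exactly one character: either "N" or "n"
pattern = re.compile(r"[Nn]ational")
text = "National parks and national forests"
matches = pattern.findall(text)
print(matches)

# The same idea applied to a column label: [()] matches "(" or ")"
cleaned = re.sub(r"[()]", "_", "(A)")
print(cleaned)
```

Inside the brackets, `(` and `)` lose their special grouping meaning, which is why they do not need to be escaped in `[()]`.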