
I have this column of data in a DataFrame (just shown here in a CSV file):

What I am trying to do is extract everything inside the square brackets, brackets included, so that each row looks something like this:

[GO:0005524],[GO:0000287],[GO:0004709],[GO:0004674],etc...

This is the code I have so far, but I always end up with a blank column:

df['go_molecular_function'] = df['go_molecular_function'].str.extract(r"\((A-Za-z+)\)", expand=False)
accdias
Inan Khan
Please avoid [posting images of text](https://unix.meta.stackexchange.com/questions/4086/psa-please-dont-post-images-of-text). It is better practice to transcribe them instead. – accdias Dec 15 '21 at 14:52

1 Answer


I figured it out. I used this code:

df['go_molecular_function'] = df['go_molecular_function'].str.findall(r"(?<=\[)([^]]+)(?=\])")

and ended up with this, which I am happy with
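For anyone who wants to try this without the original file, here is a minimal sketch. The sample rows are made up and the layout of the real column is an assumption; only the GO IDs come from the question. It runs the pattern above, which returns the IDs without their brackets, next to a variant that keeps the brackets and joins them with commas, as the question asked for.

```python
import pandas as pd

# Hypothetical sample rows: the real column's layout is assumed, not taken from the question.
df = pd.DataFrame({
    "go_molecular_function": [
        "ATP binding [GO:0005524]; magnesium ion binding [GO:0000287]",
        "MAP kinase kinase kinase activity [GO:0004709]; "
        "protein serine/threonine kinase activity [GO:0004674]",
    ]
})

# Pattern from the answer: the lookarounds keep the brackets out of each match,
# so every row becomes a list of bare IDs.
ids = df["go_molecular_function"].str.findall(r"(?<=\[)([^]]+)(?=\])")

# Variant that keeps the brackets, then joins each row's matches with commas.
bracketed = df["go_molecular_function"].str.findall(r"\[[^\]]+\]").str.join(",")

print(ids.tolist())
print(bracketed.tolist())
```

`str.findall` returns every match in a row as a list, whereas `str.extract` returns only the first match of each capture group, so `findall` is the better fit when a row holds several IDs.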

Inan Khan