How do I find text between upper-case characters in DataFrame rows?

Question

I have a DataFrame whose rows contain strings of mixed upper- and lower-case characters, and I need to extract only the lower-case text that sits between two runs of 3 upper-case characters.

I'm using Python and pandas to do this but have been unsuccessful. This is what the data looks like:

afklajrwouoivWERvalueineedREWkfjdsl

I think you forgot to include the code you wrote that doesn't produce the correct output. — dfundako, Aug 19 '19 at 18:57
Have a look at [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and provide a [mcve] including sample input and output, and code for what you've tried so far — G. Anderson, Aug 19 '19 at 19:02

score 2 · Answer 1 · answered Aug 19 '19 at 19:03


Let's try this:

import pandas as pd

df = pd.DataFrame({'text': ['afklajrwouoivWERvalueineedREWkfjdsl']}, index=[0])

df['text'].str.extract('[A-Z]{3}(.+?)[A-Z]{3}')

Output:

valueineed

Note that this captures all characters (not only lower-case ones) between two runs of 3 upper-case letters.
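Since the question asks for lower-case text only, the capture group can be narrowed from `.+?` to `[a-z]+`. The following is a minimal sketch under that assumption; the second sample row and the `extracted` column name are made up for illustration. Rows with no match come back as missing values rather than raising an error.

```python
import pandas as pd

# The second row is a hypothetical example of a string with no match.
df = pd.DataFrame({'text': ['afklajrwouoivWERvalueineedREWkfjdsl',
                            'noUppercaseHere']})

# [a-z]+ restricts the captured group to lower-case letters only.
# expand=False returns a Series instead of a one-column DataFrame.
df['extracted'] = df['text'].str.extract(r'[A-Z]{3}([a-z]+)[A-Z]{3}',
                                         expand=False)

print(df)
```

With `expand=False` the result can be assigned straight to a new column; the default `expand=True` returns a DataFrame with one column per capture group.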

answered Aug 19 '19 at 19:03

Scott Boston


vlemaistre · Accepted Answer · 2019-08-20T07:36:24.783


You can also use the re package with the same regex:

import re

s = 'afklajrwouoivWERvalueineedREWkfjdsl'
re.search('[A-Z]{3}(.+?)[A-Z]{3}', s).group(1)

Output:

valueineed
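One caveat when applying this row by row: `re.search` returns `None` when nothing matches, so calling `.group` on the result raises `AttributeError`. Below is a rough sketch of a guarded version applied to a DataFrame column; the helper name `between_caps`, the `extracted` column, and the second sample row are hypothetical.

```python
import re
import pandas as pd

# Compile once so the pattern is not re-parsed for every row.
PATTERN = re.compile(r'[A-Z]{3}(.+?)[A-Z]{3}')

def between_caps(s):
    """Return the text between two runs of three capitals, or None."""
    m = PATTERN.search(s)
    return m.group(1) if m else None

# The second row is a hypothetical example of a string with no match.
df = pd.DataFrame({'text': ['afklajrwouoivWERvalueineedREWkfjdsl',
                            'nomatch']})
df['extracted'] = df['text'].apply(between_caps)

print(df)
```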

If there are several occurrences, you should instead use:

matches = re.finditer('[A-Z]{3}(.+?)[A-Z]{3}', s)
results = [match.group(1) for match in matches]
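One thing to watch with several occurrences: `re.finditer` returns non-overlapping matches, so if one run of capitals closes one value and opens the next, the second value is skipped. A possible workaround, sketched here on a made-up sample string, is to check the closing delimiter with a lookahead so it is not consumed.

```python
import re

# Hypothetical input where DEF both closes "first" and opens "second".
s = 'xxABCfirstDEFsecondGHIyy'

# Consuming pattern: DEF is used up as the closing delimiter of "first",
# so "second" is not found.
consuming = [m.group(1)
             for m in re.finditer(r'[A-Z]{3}(.+?)[A-Z]{3}', s)]

# Lookahead variant: the closing delimiter is checked but not consumed,
# so it can also serve as the opening delimiter of the next match.
shared = [m.group(1)
          for m in re.finditer(r'[A-Z]{3}(.+?)(?=[A-Z]{3})', s)]

print(consuming)
print(shared)
```

Which behavior is right depends on the data: if every value has its own pair of delimiters, the original pattern is fine as it is.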

edited Aug 20 '19 at 07:36

answered Aug 19 '19 at 19:16

vlemaistre


thank you, please up vote my question so i can use the site again. – sullymon54 Aug 19 '19 at 19:23
Don't forget to accept an answer if one of them solved your problem. It will also give you a little bit of reputation ;) – vlemaistre Aug 20 '19 at 07:40
