-2

Hello, I'm playing around with regular expressions in pythex and I'm having trouble. I'm trying to take the following string:

"RANDOM 1 ABBBABBBA SDFSBSBS WBWBSBW WBWBWBWB 10 EBEBEBEB EHRHSHSD EBWBBSHSHSB //"

and grab all the non-numeric characters between RANDOM and the forward slashes. How do I do this with regular expressions?

Sean Breckenridge
Rdrr
  • So you want to ignore the numerical characters but still grab everything else between RANDOM and //? Is it possible for you to use two regex? – xtj7 Mar 04 '18 at 22:55
  • Two regexes? As in two regex commands? I don't see why not. – Rdrr Mar 04 '18 at 23:28

2 Answers

1

This is a possible solution:

import re

s = 'RANDOM 1 ABBBABBBA SDFSBSBS WBWBSBW WBWBWBWB 10 EBEBEBEB EHRHSHSD EBWBBSHSHSB //'

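# Lazily match everything after RANDOM and before the first //
pattern = r'(?<=RANDOM).*?(?=//)'
match = re.search(pattern, s)
textBetween = match.group(0)
# Remove every digit from the extracted text
notNumeric = re.sub(r'\d', '', textBetween)

print(notNumeric)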
  • (?<=RANDOM): looks for text preceded by RANDOM (lookbehind assertion).
  • (?=//): looks for text followed by // (lookahead assertion).
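For reuse, the same two-step idea (extract the text between the markers, then strip the digits) could be wrapped in a small function. This is only a sketch: the name non_digits_between and its start/end parameters are made up for illustration, it uses a capture group instead of lookarounds, and it assumes you also want the leftover whitespace collapsed and None returned when the markers are missing.

```python
import re

def non_digits_between(s, start="RANDOM", end="//"):
    """Return the text between start and end with digits removed, or None."""
    # re.escape keeps the markers literal even if they contain regex metacharacters
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    match = re.search(pattern, s)
    if match is None:
        return None  # one of the markers was not found
    without_digits = re.sub(r"\d", "", match.group(1))
    # Collapse the runs of spaces left behind where the numbers used to be
    return " ".join(without_digits.split())

s = 'RANDOM 1 ABBBABBBA SDFSBSBS WBWBSBW WBWBWBWB 10 EBEBEBEB EHRHSHSD EBWBBSHSHSB //'
print(non_digits_between(s))
```

Without the whitespace step, the result keeps the double spaces where "1" and "10" were removed.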
0
import re
text = 'RANDOM 1 ABBBABBBA SDFSBSBS WBWBSBW WBWBWBWB 10 EBEBEBEB EHRHSHSD EBWBBSHSHSB //'

for between_text in re.findall(r'(?<=RANDOM)(.+?)(?=\/\/)', text):
    for word_match in re.findall(r'\b[^\d\W]+\b', between_text):
        print(word_match)

Output:

ABBBABBBA
SDFSBSBS
WBWBSBW
WBWBWBWB
EBEBEBEB
EHRHSHSD
EBWBBSHSHSB

(?<=RANDOM)(.+?)(?=\/\/) :

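(?<=RANDOM) is a positive lookbehind: it asserts that RANDOM comes immediately before the match without including it in the result. (.+?) lazily matches all the text in between, and (?=\/\/) is a positive lookahead, which asserts that // follows without consuming it. (The backslashes before the slashes are optional in Python, since / is not a special character there.)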

\b[^\d\W]+\b :

\b matches word boundaries, and [^\d\W]+ is a negated set that excludes digits and non-word characters, so it matches only word characters that are not digits (letters and the underscore); the + means it matches one or more such characters.
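To see what the word pattern accepts and rejects, here is a small check. The sample string is made up for illustration and is not from the question.

```python
import re

word_pattern = re.compile(r'\b[^\d\W]+\b')

# Hypothetical sample: pure numbers, a plain word, a word with an
# underscore, and a token that mixes letters and digits
sample = ' 1 ABBBABBBA 10 foo_bar AB12 '
words = word_pattern.findall(sample)
print(words)  # ['ABBBABBBA', 'foo_bar']
```

The pure numbers are skipped, the underscore counts as a word character, and AB12 produces no match at all because there is no word boundary between the letters and the digits.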

Sean Breckenridge