1

I am using Python regular expressions (regex) to find all matches at every position in a string, including overlapping ones (for "GgGAT", frame 1 starts with "Gg" and frame 2 starts with "gG"). I need to use the re.finditer method. My code below gives me only "Gg", while I need both "Gg" and "gG":

import re

data = "ACGTGgGTT"
for match in re.finditer(r'GG|gg|Gg|gG', data):
    print(match)
Behmah

3 Answers

2

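You can use a regex lookahead with the (?=...) syntax. A lookahead is zero-width, so it consumes no characters and overlapping matches are found; the matched pair is available in capture group 1: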

re.finditer(r"(?=(GG|gg|Gg|gG))", data)
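For completeness, here is a minimal runnable sketch of the lookahead pattern applied to the question's string. The variable name `pairs` is only illustrative; the rest uses the standard `re` module as in the question.

```python
import re

data = "ACGTGgGTT"

# The lookahead (?=...) consumes no characters, so the scan advances one
# position at a time and overlapping pairs are all reported.
# match.group(0) is an empty string; the pair itself is in capture group 1.
pairs = [(m.start(), m.group(1)) for m in re.finditer(r"(?=(GG|gg|Gg|gG))", data)]

print(pairs)  # [(4, 'Gg'), (5, 'gG')]
```

Printing the match object directly would show an empty match, which is why the example reads `group(1)` instead.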
Gilad Green
-1

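Another option is to walk through each match one position at a time and print every two-character window that is itself a G/g pair:

import re

data = "ACGTGgGTTgGg"
for match in re.finditer(r'GG|gg|Gg|gG', data):
    # Check the window starting at each position inside the match.
    for i in range(match.start(), match.end()):
        pair = data[i:i + 2]  # slicing avoids an IndexError at the end of the string
        if re.fullmatch(r'[Gg]{2}', pair):  # skip windows such as "GT"
            print(pair)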

Output:

Gg
gG
gG
Gg
Bhagyesh Dudhediya
-1
You can use the re.IGNORECASE flag instead of listing every case combination:

import re

data = "ACGTGgGTgGTGGgg"
matches = re.findall(r'G{2}', data, re.IGNORECASE)  # or re.I
print(matches)

Output:

['Gg', 'gG', 'GG', 'gg']

EDIT

import re
data = "ACGTGgGTgGTGGgg"
for match in re.finditer(r'G{2}', data, re.IGNORECASE):
    print(match)

Output:

<re.Match object; span=(4, 6), match='Gg'>
<re.Match object; span=(8, 10), match='gG'>
<re.Match object; span=(11, 13), match='GG'>
<re.Match object; span=(13, 15), match='gg'>
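
Note that the spans above do not overlap, so the 'gG' starting at index 5 and the 'Gg' starting at index 12 are not reported. A minimal sketch, assuming overlapping pairs are wanted as in the question, is to wrap the same case-insensitive pattern in a lookahead. The helper name `overlapping_pairs` is hypothetical and used here only for illustration.

```python
import re

def overlapping_pairs(seq):
    """Return (start, pair) for every two-character G/g pair, overlaps included."""
    # The capture group sits inside a zero-width lookahead, so the regex
    # engine retries at every position instead of skipping past a match.
    pattern = re.compile(r"(?=(G{2}))", re.IGNORECASE)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]

result = overlapping_pairs("ACGTGgGTgGTGGgg")
print(result)
# [(4, 'Gg'), (5, 'gG'), (8, 'gG'), (11, 'GG'), (12, 'Gg'), (13, 'gg')]
```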
Rinshan Kolayil