
I read about regex at https://medium.com/tech-tajawal/regular-expressions-the-last-guide-6800283ac034 but am having trouble trying to do something very simple.

s = re.compile('(norm|conv)[.][0-9]')

for k,v in densenet_state_dict.items():
    print(k)
    print(s.findall(k))

It is supposed to print something like norm.2, but my output contains only norm or conv, with neither the period nor the digit.

module.features.denseblock4.denselayer16.norm.2.running_mean
['norm']
module.features.denseblock4.denselayer16.norm.2.running_var
['norm']

I even tried '(norm|conv)\.[0-9]'. Am I missing something very important?


EDIT: A minimal working example:

import re

module_type = re.compile(r'(norm|conv)\.[0-9]')
module_name = "module.features.denseblock4.denselayer16.conv.2.weight"
print(module_name)
print(module_type.findall(module_name))

prints

module.features.denseblock4.denselayer16.conv.2.weight
['conv']
Wiktor Stribiżew
Kong
  • This could use some additional clarification. What is `densenet_state_dict`? If it has nothing to do with the problem, I suggest removing it to produce a [mcve]. BTW, your second attempt looks pretty good--that should work! – ggorlen Jun 09 '19 at 19:53
  • Well your second try with `(norm|conv)\.[0-9]` is the correct regex. – ALFA Jun 09 '19 at 19:54

2 Answers


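Your second regex is correct; the surprise comes from `re.findall`. When a pattern contains exactly one capturing group, `findall` returns the text of that group rather than the whole match, which is why only `norm` or `conv` is printed. To get the whole match, try: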

r'((?:norm|conv)\.[0-9])'

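to capture the entire thing. `(?:...)` is a non-capturing group, so the outer parentheses form the only capturing group and `findall` returns their full contents. Here's an example: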

import re

s = """module.features.denseblock4.denselayer16.norm.2.running_mean
module.features.denseblock4.denselayer16.norm.2.running_var
"""

print(re.findall(r'((?:norm|conv)\.[0-9])', s)) # => ['norm.2', 'norm.2']
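The difference comes from how `re.findall` treats capturing groups. A minimal sketch comparing the three common variants, using the module name from the question (the variable names are illustrative only):

```python
import re

name = "module.features.denseblock4.denselayer16.conv.2.weight"

# One capturing group: findall returns only the text of that group.
with_group = re.findall(r'(norm|conv)\.[0-9]', name)

# Non-capturing group: findall returns the whole match.
without_group = re.findall(r'(?:norm|conv)\.[0-9]', name)

# finditer yields match objects; group(0) is always the whole match,
# regardless of how many capturing groups the pattern has.
whole_matches = [m.group(0) for m in re.finditer(r'(norm|conv)\.[0-9]', name)]

print(with_group)     # ['conv']
print(without_group)  # ['conv.2']
print(whole_matches)  # ['conv.2']
```

If the original pattern has to stay as it is, the `finditer` variant is one way to get the full match without editing the regex.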
ggorlen

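Another option is to put each full alternative, including the dot and the digits, inside a single capturing group. Using `[0-9]+` instead of `[0-9]` also matches multi-digit indices: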

(norm\.[0-9]+|conv\.[0-9]+)


Test

import re

regex = r"(norm\.[0-9]+|conv\.[0-9]+)"

test_str = ("module.features.denseblock4.denselayer16.norm.2.running_mean\n"
    "module.features.denseblock4.denselayer16.norm.2.running_var\n"
    "module.features.denseblock4.denselayer16.conv.2121.running_mean\n"
    "module.features.denseblock4.denselayer16.conv.21341.running_var")

# finditer yields one match object per non-overlapping match.
for match_num, match in enumerate(re.finditer(regex, test_str), start=1):

    print(f"Match {match_num} was found at {match.start()}-{match.end()}: {match.group()}")

    # Capturing groups are numbered from 1; group 0 is the whole match.
    for group_num in range(1, len(match.groups()) + 1):

        print(f"Group {group_num} found at {match.start(group_num)}-{match.end(group_num)}: {match.group(group_num)}")
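For the loop in the question, a sketch along the following lines might be enough. Here `state_dict_keys` is a hypothetical stand-in for the keys of `densenet_state_dict` (the real contents are not shown in the question), and `find_module_type` is an illustrative helper name:

```python
import re

# Non-capturing group so the whole "norm.N" / "conv.N" text is the match.
module_type = re.compile(r"(?:norm|conv)\.[0-9]+")

# Hypothetical keys standing in for densenet_state_dict.keys().
state_dict_keys = [
    "module.features.denseblock4.denselayer16.norm.2.running_mean",
    "module.features.denseblock4.denselayer16.conv.2.weight",
    "module.classifier.weight",
]

def find_module_type(key):
    """Return the first 'norm.N' or 'conv.N' in key, or None if absent."""
    m = module_type.search(key)
    return m.group(0) if m else None

results = {key: find_module_type(key) for key in state_dict_keys}
for key, found in results.items():
    print(key, "->", found)
```

Using `search` with `group(0)` returns a single string per key instead of a list, which may be more convenient when each key contains at most one such component.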


Emma