I have multiple strings like the following:

"Conv2D(filters=8, kernel_size=(2, 2), strides=(1,1), padding='valid', data_format='channels_last', activation='relu', use_bias=True, kernel_initializer='zeros', bias_initializer='zeros', kernel_regularizer=regularizers.l1_l2(l1=0.01,l2=0.01), bias_regularizer=regularizers.l1_l2(l1=0.01,l2=0.01), activity_regularizer=regularizers.l1_l2(l1=0.01,l2=0.01), kernel_constraint=max_norm(2.), bias_constraint=max_norm(2.), input_shape=(28,28,1))"

I want to extract the value of kernel_size from the string, so I tried the following:

match = re.search(i+'(.+?), (.+?) ',value)

where i = 'kernel_size' and value is the string defined above.

When I run this, I get

<regex.Match object; span=(18, 38), match='kernel_size=(2, 2), '>

I also ran the following to get the value from the match:

filters = match.group(1).split("=")[1].strip()

but I get this:

kernel_size (2

How can I get something like this instead?

kernel_size (2,2)
Emma
Ashutosh Mishra

2 Answers

This expression should match the kernel_size value:

kernel_size\s*=\s*\(\s*(\d+)\s*,\s*(\d+)\s*\)

It extracts the two numbers using two capturing groups, which we can then assemble into whatever output format we want, such as kernel_size (2,2).

Test with re.findall

import re

regex = r"kernel_size\s*=\s*\(\s*(\d+)\s*,\s*(\d+)\s*\)"

test_str = ("Conv2D(filters=8, kernel_size=(2, 2), strides=(1,1), padding='valid',\n"
    "Conv2D(filters=8, kernel_size=( 10  , 20 ), strides=(1,1), padding='valid',")

matches = re.findall(regex, test_str, re.IGNORECASE)

for match in matches:
    print('kernel_size ('+ match[0]+','+match[1]+')')

Output

kernel_size (2,2)
kernel_size (10,20)
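
The same pattern can be wrapped in a small helper that works for any argument whose value is a pair of integers. This is only a sketch: the function name extract_int_pair is hypothetical (it is not part of the answer above), and it assumes the value is always two non-negative integers in parentheses.

```python
import re

def extract_int_pair(key, text):
    """Return (a, b) for the first `key=(a, b)` found in text, or None."""
    # re.escape guards against regex metacharacters in the key name.
    pattern = re.escape(key) + r"\s*=\s*\(\s*(\d+)\s*,\s*(\d+)\s*\)"
    match = re.search(pattern, text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

value = "Conv2D(filters=8, kernel_size=(2, 2), strides=(1,1), padding='valid')"
print(extract_int_pair("kernel_size", value))  # (2, 2)
print(extract_int_pair("strides", value))      # (1, 1)
print(extract_int_pair("pool_size", value))    # None
```

Returning integers instead of a formatted string leaves the output format up to the caller, for example 'kernel_size (%d,%d)' % pair.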

The expression is explained on the top right panel of this demo if you wish to explore/simplify/modify it.

Emma

re is generally slower than plain string operations in Python when all you need is a fixed-substring search (see "What's a faster operation, re.match/search or str.find?" for an example).

If you only need to get one value from the string, it is faster and probably simpler to use str.find:

s = '<your string>'

pattern = 'kernel_size=('
p = s.find(pattern)
if p != -1:
    p += len(pattern)
    print('kernel_size (%s)' % s[p:s.find(')', p)])
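
As a rough sketch, the same str.find approach can be turned into a reusable function that also handles a missing closing parenthesis and strips the inner spaces, so the result reads (2,2) rather than (2, 2). The name find_paren_value is hypothetical, and the function assumes there is no whitespace around the = sign and no nested parentheses inside the value.

```python
def find_paren_value(text, key):
    """Return the text between `key=(` and the next `)`, or None."""
    prefix = key + "=("
    start = text.find(prefix)
    if start == -1:
        return None  # key not present
    start += len(prefix)
    end = text.find(")", start)
    if end == -1:
        return None  # no closing parenthesis after the key
    # Drop spaces so "(2, 2)" and "(2,2)" give the same result.
    return text[start:end].replace(" ", "")

s = "Conv2D(filters=8, kernel_size=(2, 2), strides=(1,1), padding='valid')"
print("kernel_size (%s)" % find_paren_value(s, "kernel_size"))  # kernel_size (2,2)
```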
Okto