-2

I know there is plenty of info about regex available but I can’t figure it out somehow.

I have array1 = ['\n 1.979 \n, \n 1.799 \n']. The numbers vary but are always in this format, so I use regex = re.compile(r'\d.\d\d\d'). This matches perfectly in Notepad++ but doesn't seem to work in Python.

import re 
regex = re.compile(r'\d.\d\d\d')
filteredarray= [i for i in array1 if regex.match(i)]

print(filteredarray)

What am I missing? Thanks in advance.

J L

2 Answers

0

You can use re.findall(pattern, string) to find the required values; it returns them as a list of strings.

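In your pattern the unescaped . matches any character, so escape it to match a literal dot: \d\.\d{3} (equivalently \d\.\d\d\d). The variant \d\.?\d{3} also works on this data, but it makes the dot optional, so it would also match four consecutive digits such as 1979.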

import re

array1 = ['\n 1.979   \n, \n 1.799   \n']

filteredarray = []
for i in array1:
    # A raw string keeps the backslashes intact for the regex engine.
    filteredarray.extend(re.findall(r'\d\.?\d{3}', i))

print(filteredarray)  # ['1.979', '1.799']
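To illustrate how the three pattern variants differ, here is a minimal sketch. The sample strings are made up for illustration and are not from the question, and fullmatch is used only to make the comparison strict.

```python
import re

# Hypothetical sample strings, chosen to expose the differences.
samples = ["1.979", "1x979", "1979"]

patterns = {
    "loose": re.compile(r"\d.\d{3}"),       # unescaped dot: any character except newline
    "optional": re.compile(r"\d\.?\d{3}"),  # literal dot, but optional
    "strict": re.compile(r"\d\.\d{3}"),     # literal dot required
}

# For each pattern, collect the samples it matches in full.
results = {
    name: [s for s in samples if pattern.fullmatch(s)]
    for name, pattern in patterns.items()
}

for name, matched in results.items():
    print(name, matched)
```

Only the strict pattern accepts "1.979" alone, which is probably what is wanted if the input could ever contain other digit runs.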
Dinesh
0

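I think the problem is that re.match only matches at the start of the string, and \n 1.979 \n, \n 1.799 \n does not start with your pattern \d.\d\d\d (it starts with a newline and a space). Replace \d.\d\d\d with ^[\s\S]+\d.\d\d\d so that the pattern can consume the leading characters first.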

Details:

  • ^: start of string
  • [\s\S]+: matches one or more of any character, including line breaks.

I also tested this in Python.

import re
array1 = ['\n 1.979   \n, \n 1.799   \n']
regex = re.compile(r'^[\s\S]+\d.\d\d\d')

filteredarray= [i for i in array1 if regex.match(i)]

print(filteredarray)

Result:

['\n 1.979   \n, \n 1.799   \n']
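As an alternative sketch, re.search and re.findall scan the whole string, so they avoid the need for the ^[\s\S]+ prefix. This assumes the goal is either to keep the elements that contain such a number or to extract the numbers themselves. The escaped dot and the conversion to float are additions for illustration, not part of the code above.

```python
import re

array1 = ['\n 1.979   \n, \n 1.799   \n']
regex = re.compile(r'\d\.\d{3}')  # escaped dot: matches a literal period only

# re.match only tries position 0, where the string has '\n', so nothing is kept.
with_match = [s for s in array1 if regex.match(s)]

# re.search scans the whole string, so the element is kept.
with_search = [s for s in array1 if regex.search(s)]

# re.findall returns every match, which can then be converted to numbers.
numbers = [float(n) for s in array1 for n in regex.findall(s)]

print(with_match)   # []
print(with_search)
print(numbers)      # [1.979, 1.799]
```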
Thân LƯƠNG Đình