Regex array python

Question

I know there is plenty of info about regex available but I can’t figure it out somehow.

I have array1 = ['\n 1.979 \n, \n 1.799 \n']. The numbers vary but are always in this format, so I use regex = re.compile(r'\d.\d\d\d'). It matches perfectly in Notepad++ but doesn’t seem to work in Python.

import re

array1 = ['\n 1.979 \n, \n 1.799 \n']
regex = re.compile(r'\d.\d\d\d')
filteredarray = [i for i in array1 if regex.match(i)]

print(filteredarray)

What am I missing? Thanks in advance.

That is a single string in a list (of length 1). The numbers are not of the format you have posted as there are spaces and newlines in the string. Are you sure you have Notepad++ in Regex mode and not Extended mode? — inspectorG4dget, Sep 29 '20 at 12:19
Use [re.search](https://docs.python.org/3/library/re.html#re.search) instead. — The fourth bird, Sep 29 '20 at 12:23
To create an array of the number strings: `regex.findall(array1[0])` — DarrylG, Sep 29 '20 at 12:29
@Thefourthbird TypeError: search() missing 1 required positional argument: 'string' — J L, Sep 29 '20 at 12:34
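
The three suggestions in the comments above can be compared side by side. This is a minimal sketch that uses the sample string from the question with the dot escaped. The last part assumes the TypeError came from calling the module-level `re.search` with only the string, which is a guess about the asker's code rather than something the thread confirms.

```python
import re

array1 = ['\n 1.979 \n, \n 1.799 \n']  # one string inside a one-element list
regex = re.compile(r'\d\.\d\d\d')      # dot escaped so it matches a literal "."

text = array1[0]

# match() only tries at position 0, where the string has "\n", so it fails.
print(regex.match(text))            # None

# search() scans forward and stops at the first hit.
print(regex.search(text).group())   # 1.979

# findall() returns every non-overlapping hit as a list of strings.
numbers = regex.findall(text)
print(numbers)                      # ['1.979', '1.799']

# The module-level function takes the pattern as its first argument.
# Calling re.search(text) with a single argument raises the TypeError
# quoted above (assumed cause, not confirmed by the asker).
first = re.search(r'\d\.\d\d\d', text).group()
```

With a compiled pattern, `regex.search(text)` needs only the string; with the module-level function, the pattern comes first.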

Dinesh · Answer 1 · score 0 · 2020-09-29T12:40:13.023

You can use re.findall(expression, string) to find the required values and collect them into a list.

A regular expression that fits your requirement is \d\.\d{3}, or equivalently \d\.\d\d\d. The dot is escaped so that it matches only a literal "." rather than any character.

import re

array1 = ['\n 1.979   \n, \n 1.799   \n']

filteredarray = []
for i in array1:
    # findall returns every match in the string; the raw string keeps the backslashes intact
    filteredarray.extend(re.findall(r'\d\.\d{3}', i))

print(filteredarray)

edited Sep 29 '20 at 12:40

answered Sep 29 '20 at 12:34


Thank you, but why was my regex wrong? It also matched fully on regex101. – J L Sep 29 '20 at 12:37
Yours is also right, but the `[i for i in array1 if regex.match(i)]` part is wrong. The comprehension keeps the element `i` itself, not the matched text, so whenever the pattern matches you get the original string from array1 back. – Dinesh Sep 29 '20 at 12:42
@Dinesh Sounds like you already forgot why you escaped their dot. – Kelly Bundy Sep 29 '20 at 13:09
I've not escaped the dot present in the string. – Dinesh Sep 29 '20 at 13:13
Yes you did (in the regex string). – Kelly Bundy Sep 29 '20 at 13:14
Yeah, sorry. I escaped the dot because a literal `.` must be present in the decimal format. – Dinesh Sep 29 '20 at 13:23
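
The point about the escaped dot can be checked directly. The sketch below compares the two patterns on a made-up test string (`sample` is hypothetical, not from the question) to show what the unescaped `.` lets through.

```python
import re

unescaped = re.compile(r'\d.\d\d\d')   # "." matches any character except a newline
escaped = re.compile(r'\d\.\d\d\d')    # "\." matches only a literal dot

sample = '1.979 1x979 12345'  # hypothetical test string, not from the question

loose = unescaped.findall(sample)
strict = escaped.findall(sample)

print(loose)   # ['1.979', '1x979', '12345']
print(strict)  # ['1.979']
```

On the data in the question both patterns find the same numbers, which is likely why the original pattern looked correct on regex101.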

score 0 · Answer 2 · answered Sep 29 '20 at 14:43

I think the problem is that your pattern \d.\d\d\d does not match at the start of the string \n 1.979 \n, \n 1.799 \n, and re.match only matches at the start of a string. You can replace \d.\d\d\d with ^[\s\S]+\d.\d\d\d.

Details:

^: start of string
[\s\S]+: matches any characters, including line breaks.

I also tested this in Python.

import re
array1 = ['\n 1.979   \n, \n 1.799   \n']
regex = re.compile(r'^[\s\S]+\d.\d\d\d')

filteredarray= [i for i in array1 if regex.match(i)]

print(filteredarray)

Result.

['\n 1.979   \n, \n 1.799   \n']
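
For comparison, here is a hedged sketch of the same filter written with `search()` and the original pattern instead of the `^[\s\S]+` prefix. The second and third list items are hypothetical additions used to show where the two approaches differ. Either way the filter keeps whole strings, not the individual numbers.

```python
import re

# First item is from the question; the other two are hypothetical test cases.
array1 = ['\n 1.979   \n, \n 1.799   \n', 'no numbers here', '1.979']

anchored = re.compile(r'^[\s\S]+\d.\d\d\d')  # pattern from this answer, used with match()
plain = re.compile(r'\d.\d\d\d')             # original pattern, used with search()

kept_by_match = [s for s in array1 if anchored.match(s)]
kept_by_search = [s for s in array1 if plain.search(s)]

print(kept_by_match)   # only the first string
print(kept_by_search)  # the first string and '1.979'
```

Because `[\s\S]+` requires at least one character before the number, a string that begins directly with the number passes the `search()` filter but not the anchored `match()` one.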
