Using Python regular expressions, how can I find all instances of a pattern and print each one on a new line?

Example:

import re

x = "a=123,b=123,c=123,d=123,a=456,b=456...etc"
y = re.search('a=(.*?),', x)
print(y)

Trying to get:

123
456
JSimonsen

2 Answers

The regular expression

First of all, your regular expression is looser than it needs to be. a=(.*?), matches a= followed by as few characters as possible up to the next comma. The ? makes * lazy, so it does capture 123 here, but a greedy a=(.*), would run all the way to the last comma in the string. What you are actually trying to find is one or more letters, an equals sign, and then one or more digits.

[A-Za-z]+=(\d+)  Regular Expression
        +        At least one
[A-Za-z]         (English) letter
         =       An equals sign
          (   )  Group 1
             +   At least one
           \d    digit

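Also, use re.findall, not re.search: re.search stops at the first match and returns a Match object (which is what print(y) shows), whereas re.findall returns every match.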

Then re.findall(r"[A-Za-z]+=(\d+)", x) returns a list of the captured digit strings for every key, which you can print or parse as needed. To keep only the values for a, replace [A-Za-z]+ with a.
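
As a minimal sketch of the findall approach on the string from the question: the a-only pattern and the \b word boundary are assumptions about what is wanted, not part of the original pattern.

```python
import re

x = "a=123,b=123,c=123,d=123,a=456,b=456...etc"

# Values for every key, using the pattern from this answer.
all_values = re.findall(r"[A-Za-z]+=(\d+)", x)

# Values for the key "a" only. The \b is an added assumption: it keeps
# a longer key that merely ends in "a" (e.g. "ba=1") from matching.
a_values = re.findall(r"\ba=(\d+)", x)

# Print each match on its own line.
print("\n".join(a_values))
```

Because the pattern contains exactly one group, re.findall returns the group contents rather than the whole match.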

Also, there might be a better way of doing this: if the data is formatted exactly as you show it, you can just use regular string operations:

a = "a=123,b=456,c=789"
b = a.split(",") # gets ["a=123", "b=456", "c=789"]
c = [E.split("=") for E in b] # gets [["a", "123"], ["b", "456"], ["c", "789"]]

Then, if you want to turn this into a dictionary, you can use dict(c). If you want to print the values, do for E in c: print(E[1]). Etc.
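
One caveat with dict(c): the string in the question repeats keys (a appears twice), and a plain dict keeps only the last value for each key. A possible sketch that collects every value per key instead, assuming the input has no trailing "...etc" text (the name values_by_key is made up for illustration):

```python
from collections import defaultdict

a = "a=123,b=123,c=123,d=123,a=456,b=456"

# Map each key to the list of all values seen for it.
values_by_key = defaultdict(list)
for pair in a.split(","):
    key, value = pair.split("=", 1)  # split on the first "=" only
    values_by_key[key].append(value)

# Print every value recorded for "a", one per line.
for value in values_by_key["a"]:
    print(value)
```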

hyper-neutrino

Just use re.findall:

import re
x = "a=123,b=123,c=123,d=123,a=456,b=456...etc"
final_data = re.findall(r"(?<=a\=)\d+", x)  # raw string avoids invalid-escape warnings
for i in final_data:
    print(i)

Output:

123
456

This regular expression uses a positive lookbehind to make sure that the digits are immediately preceded by a=:

\d+: matches one or more consecutive digits, stopping at the first non-digit character (in this case the comma before the next key=value pair).

(?<=a\=): a positive lookbehind that requires a= directly before the digits without including it in the match, so it acts as an anchor for \d+.
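
For comparison, here is a small sketch showing that the lookbehind and a plain capture group give the same list on this input. The re.finditer loop is an optional alternative, not part of the answer above.

```python
import re

x = "a=123,b=123,c=123,d=123,a=456,b=456...etc"

# Lookbehind: the match itself is only the digits.
lookbehind = re.findall(r"(?<=a\=)\d+", x)

# Capture group: the match includes "a=", but findall returns the group.
captured = re.findall(r"a=(\d+)", x)

# finditer yields Match objects one at a time, which is handy when the
# match position (match.start()) is needed as well as the text.
for match in re.finditer(r"a=(\d+)", x):
    print(match.group(1))
```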

Ajax1234