I am attempting to select part of a Python string using re.match:

revenue = "Revenue;;Item,Johnver,Vanston,Danbree,Vansey,Mundyke;Tea,190,140,1926,14,143;Coffee,325,19,293,1491,162;Water,682,14,852,56,659;Milk,829,140,609,120,87;;Expenses;;Item,Johnver,Vanston,Danbree,Vansey,Mundyke;Tea,120,65,890,54,430;Coffee,300,10,23,802,235;Water,50,299,1290,12,145;Milk,67,254,89,129,76;;"
revenue = re.match(r"(?<=Revenue;;).*(?=;E)", revenue)
print(revenue)

but it returns None.

I tested the regular expression on regex101.com, and it gave me the desired match, the text following Revenue;; and preceding ;Expenses:

Item,Johnver,Vanston,Danbree,Vansey,Mundyke;Tea,190,140,1926,14,143;Coffee,325,19,293,1491,162;Water,682,14,852,56,659;Milk,829,140,609,120,87;

I therefore assume something is wrong with my Python implementation, but I couldn't find anything in the Python regex documentation that helped. I tried with both Python 2 and 3.

What could I be doing wrong?

3 Answers

Use re.search instead of re.match.

Example:

import re

revenue = "Revenue;;Item,Johnver,Vanston,Danbree,Vansey,Mundyke;Tea,190,140,1926,14,143;Coffee,325,19,293,1491,162;Water,682,14,852,56,659;Milk,829,140,609,120,87;;Expenses;;Item,Johnver,Vanston,Danbree,Vansey,Mundyke;Tea,120,65,890,54,430;Coffee,300,10,23,802,235;Water,50,299,1290,12,145;Milk,67,254,89,129,76;;"
revenue = re.search(r"(?<=Revenue;;).*(?=;E)", revenue)
print(revenue.group())

Output:

Item,Johnver,Vanston,Danbree,Vansey,Mundyke;Tea,190,140,1926,14,143;Coffee,325,19,293,1491,162;Water,682,14,852,56,659;Milk,829,140,609,120,87;
Rakesh

Per the Python documentation:

re.match(pattern, string, flags=0)

If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern; note that this is different from a zero-length match.

re.match only tries to match at the beginning of the string, so you should probably use either re.search or re.findall.
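To illustrate the difference between the two suggested functions, here is a minimal sketch. The string `text` is a shortened, made-up stand-in for the one in the question, and the variable names are only for illustration.

```python
import re

# Shortened, hypothetical stand-in for the string in the question.
text = "Revenue;;Item,A,B;Tea,1,2;;Expenses;;Item,A,B;Tea,3,4;;"
pattern = r"(?<=Revenue;;).*(?=;E)"

# re.search scans the string and returns a match object for the
# first position where the pattern matches (or None if there is none).
found = re.search(pattern, text)
print(found.group())  # Item,A,B;Tea,1,2;

# re.findall returns a list of all non-overlapping matches as strings.
all_found = re.findall(pattern, text)
print(all_found)  # ['Item,A,B;Tea,1,2;']
```

re.search is probably the better fit when only one section is wanted, since it gives back a match object with position information; re.findall is handy when the same pattern may occur several times.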

Mia

re.match only attempts a match at the very beginning of the string. A look-behind such as (?<=Revenue;;) needs that text to appear before the current position, and nothing comes before position 0, so it can never succeed there. As @Rakesh has mentioned, use re.search.
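A small sketch of that behaviour, using a shortened, made-up version of the string from the question. The second half shows one possible alternative that is not taken from the answers above: if re.match must be used, the prefix can be matched literally and the wanted part taken from a capturing group instead of relying on look-arounds.

```python
import re

# Shortened, hypothetical stand-in for the string in the question.
text = "Revenue;;Item,A,B;Tea,1,2;;Expenses;;Item,A,B;Tea,3,4;;"

# At position 0 there is no preceding "Revenue;;", so the look-behind
# fails and re.match gives up without trying any later position.
lookbehind_result = re.match(r"(?<=Revenue;;).*(?=;E)", text)
print(lookbehind_result)  # None

# Alternative (an assumption, not from the answers): consume the prefix
# as part of the match and capture only the section in between.
m = re.match(r"Revenue;;(.*);E", text)
captured = m.group(1) if m else None
print(captured)  # Item,A,B;Tea,1,2;
```

Checking the result for None before calling group() avoids an AttributeError when the pattern does not match.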

Sunitha