-2

In this Python string pattern matching, I want to filter out s1. A matching string should look like *\2017-01-23\ , that is, a date string followed by a '\'. Any idea?

s1="historyData\xx\n3_1010366372_2017-01-25_1126807";
s2="historyData\xx\2017-01-23\n3_1010366372_2017-01-25_1126807";
date_reg_exp = re.compile('\d{4}[-/]\d{2}[-/]\d{2}\\');

 mat = re.match(date_reg_exp, s)
      if mat is not None:
        print("not matched")
      else:
        print("matched")
Syam Pillai
user1615666

2 Answers

1

You will have to use a raw string instead of a regular string literal, because \xx is not a valid escape sequence.

a = "\xx" raises ValueError: invalid \x escape in Python 2 (Python 3 rejects it with a SyntaxError instead).
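To illustrate why the raw prefix matters, here is a minimal sketch. The variable names are hypothetical and only serve the comparison; it shows that a raw literal keeps the backslash as a character, while a regular literal turns \n into a newline.

```python
# Illustrative sketch; the variable names are hypothetical.
plain = "a\\n3"   # regular literal with an escaped backslash
raw = r"a\n3"     # raw literal: the backslash is kept as-is
newline = "a\n3"  # regular literal: \n becomes a newline character

print(plain == raw)  # True: both hold a, backslash, n, 3
print(len(raw))      # 4
print(len(newline))  # 3
print("\n" in raw)   # False: the raw string has no newline
```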

You can try like so:

import re

s1 = r"historyData\xx\n3_1010366372_2017-01-25_1126807"
s2 = r"historyData\xx\2017-01-23\n3_1010366372_2017-01-25_1126807"

s = r"(?:.*?\\)(\d+-\d+-\d+)(?:\\.*)$"
reg = re.compile(s)

print(re.match(reg, s1))
print(re.match(reg, s2).group(1))

Output:

None
2017-01-23
Mohammad Yusuf
1

You have to use search instead of match.

Here is what the documentation says:

Python offers two different primitive operations based on regular expressions: re.match() checks for a match only at the beginning of the string, while re.search() checks for a match anywhere in the string (this is what Perl does by default).
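The difference can be seen in a small sketch. The sample path is a shortened version of s2 from the question, and the names path, date_then_backslash, anchored and anywhere are hypothetical; the pattern assumes the question's requirement of a date followed by a backslash.

```python
import re

# Shortened sample path, written as a raw string so backslashes stay literal.
path = r"historyData\xx\2017-01-23\n3_1010366372"

# A date (with - or / separators) followed by a literal backslash.
date_then_backslash = re.compile(r"\d{4}[-/]\d{2}[-/]\d{2}\\")

anchored = date_then_backslash.match(path)   # only tries at the start of the string
anywhere = date_then_backslash.search(path)  # scans the whole string

print(anchored)                # None, because the path starts with "historyData"
print(anywhere is not None)    # True, the date is found further along
```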

The strings provided contain an invalid \x escape. To use them as raw strings, write them as r"string". The s1 and s2 variables can be written as:

s1=r"historyData\xx\n3_1010366372_2017-01-25_1126807"
s2=r"historyData\xx\2017-01-23\n3_1010366372_2017-01-25_1126807"

You may re-write the function as follows.

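import re

def containsDate(s):
    # Date followed by a literal backslash, as the question requires
    date_reg_exp = re.compile(r'\d{4}[-/]\d{2}[-/]\d{2}\\')
    # search() scans the whole string; match() only tries the start
    mat = date_reg_exp.search(s)
    return mat is not None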

The function can then be used as follows:

s1=r"historyData\xx\n3_1010366372_2017-01-25_1126807"
s2=r"historyData\xx\2017-01-23\n3_1010366372_2017-01-25_1126807"

if containsDate(s1):
    print("match")
else:
    print("no match")