I'm somewhat new to Python, and for this assignment we were asked to use a single regular expression to solve each prompt. I've finished prompts A-C, but now I'm stuck on prompt D. Here's the prompt:
d. A substitution, using a regular expression, that converts a date in either the format “May 29, 2019” or “May 29 2019” to “29 May 19”.
A valid date format to match has these elements:
• The month must be the common three letter month abbreviation beginning with a capital letter followed by two lower case letters: Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec.
• The day may be one or two digits. It is not necessary to check for a valid day, and dates with leading zeros such as 03 are acceptable.
• The year is exactly four digits.
• The month and day are separated by one or more spaces. The day and year are also separated by one or more spaces, but an optional comma immediately after the day is permitted (no spaces between the day and comma are permitted).
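For illustration, the rules above could be encoded in one pattern along these lines. This is only a minimal sketch, not necessarily what the assignment expects: the names DATE_RE and convert_date are hypothetical, and it uses literal spaces and named groups, which are assumptions on my part rather than anything required by the prompt.

```python
import re

# Hypothetical names (DATE_RE, convert_date), chosen for illustration only.
DATE_RE = re.compile(
    r"\b(?P<month>Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"  # three-letter month
    r" +(?P<day>\d{1,2})"    # one or more spaces, then a one- or two-digit day
    r",? +"                  # optional comma directly after the day, then spaces
    r"\d{2}(?P<yy>\d{2})\b"  # four-digit year; only the last two digits are captured
)


def convert_date(text):
    """Rewrite matching dates as 'day month yy'; leave other text unchanged."""
    return DATE_RE.sub(r"\g<day> \g<month> \g<yy>", text)


print(convert_date("May 29, 2019"))
print(convert_date("May 29 2019"))
print(convert_date("May 29 19"))
```

The first two calls should be rewritten to the "29 May 19" form from the prompt, while the third has only a two-digit year, so it does not match and is returned as-is.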
What I'm stuck on: I'm not sure what to put in the r"..." statement (see the code below). With what I have now I get the error "re.error: bad escape \w at position 0". If we could fix the error, or find another way to do it while keeping the subStr = r"..." line, I would really appreciate it! Thank you!
Note: My re.compile code works just fine: before I changed the substitution string to alter the output, it accepted the test case. It just didn't convert it, since I had not written the conversion string yet. Also, the way I'm processing dates at the moment isn't very conventional; I plan on working on that after getting something that works.
Code:
import re
d = re.compile(r"^((Jan)\s+[1-31],\s+\d{4})$|"
r"^((Jan)\s+[1-31]\s+\d{4})$|"
r"^((Feb)\s+[1-28],\s+\d{4})$|"
r"^((Feb)\s+[1-28]\s+\d{4})$|"
r"^((Feb)\s+[1-29],\s+\d{4})$|" #ask prof about leap years
r"^((Feb)\s+[1-29]\s+\d{4})$|" #ask prof about leap years
r"^((Mar)\s+[1-31],\s+\d{4})$|"
r"^((Mar)\s+[1-31]\s+\d{4})$|"
r"^((Apr)\s+[1-30],\s+\d{4})$|"
r"^((Apr)\s+[1-30]\s+\d{4})$|"
r"^((May)\s+[1-31],\s+\d{4})$|"
r"^((May)\s+[1-31]\s+\d{4})$|"
r"^((Jun)\s+[1-30],\s+\d{4})$|"
r"^((Jun)\s+[1-30]\s+\d{4})$|"
r"^((Jul)\s+[1-31],\s+\d{4})$|"
r"^((Jul)\s+[1-31]\s+\d{4})$|"
r"^((Aug)\s+[1-31],\s+\d{4})$|"
r"^((Aug)\s+[1-31]\s+\d{4})$|"
r"^((Sep)\s+[1-30],\s+\d{4})$|"
r"^((Sep)\s+[1-30]\s+\d{4})$|"
r"^((Oct)\s+[1-31],\s+\d{4})$|"
r"^((Oct)\s+[1-31]\s+\d{4})$|"
r"^((Nov)\s+[1-30],\s+\d{4})$|"
r"^((Nov)\s+[1-30]\s+\d{4})$|"
r"^((Dec)\s+[1-31],\s+\d{4})$|"
r"^((Dec)\s+[1-31]\s+\d{4})$")
subStr = r"\w\s\d{1,2}\s\d{4}"
print("----Part d tests that match (and should change):")
print(d.sub(subStr, "May 29, 2019"))
print("----Part d tests that match (and should remain unchanged):")
print(d.sub(subStr, "May 29 19"))
Expected output:
----Part d tests that match (and should change):
May 29 19
----Part d tests that match (and should remain unchanged):
May 29 19
Actual output (first with the substring left blank, then with the substring as it currently is):
Blank:
----Part d tests that match (and should change):
May 29, 2019
----Part d tests that match (and should remain unchanged):
May 29 19
--------------------------------
Current:
----Part d tests that match (and should change):
this = chr(ESCAPES[this][1])
KeyError: '\\w'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Xavier/PycharmProjects/hw7/hw7.py", line 101, in <module>
print(d.sub(subStr, "May 29, 2019"))
File "C:\Users\Xavier\AppData\Local\Programs\Python\Python37\lib\re.py", line 309, in _subx
template = _compile_repl(template, pattern)
File "C:\Users\Xavier\AppData\Local\Programs\Python\Python37\lib\re.py", line 300, in _compile_repl
return sre_parse.parse_template(repl, pattern)
File "C:\Users\Xavier\AppData\Local\Programs\Python\Python37\lib\sre_parse.py", line 1024, in parse_template
raise s.error('bad escape %s' % this, len(this))
re.error: bad escape \w at position 0
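A note on the traceback above: the second argument to sub is a replacement template rather than a pattern, so it accepts group references such as \1 or \g<name> but not character classes such as \w or \d. Since Python 3.7, an unknown backslash-letter escape in the template raises re.error. A small sketch of the difference, using a hypothetical two-group pattern rather than the date pattern from the code above:

```python
import re

# Hypothetical two-group pattern, used only to contrast the two templates.
pattern = re.compile(r"(\w+) (\d+)")

# A template that reuses regex classes is rejected (Python 3.7 and later).
try:
    pattern.sub(r"\w \d", "May 29")
    raised = False
except re.error as exc:
    raised = True
    print("re.error:", exc)

# A template that refers to the captured groups is accepted.
swapped = pattern.sub(r"\2 \1", "May 29")
print(swapped)  # 29 May
```

So the pieces to keep (month, day, last two digits of the year) would need to be captured as groups in the pattern and then referenced by number or name in subStr.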