1

I'm trying to remove the date from every line except the first and third matches (the "Captured on" and "1St Cycle Time" lines), which should keep theirs. Here is my attempt, which is not working.

Input = '''
Captured on: 08-29-2023 09:43:26
Start Time: 08-28-2023 11:11:57
1St Cycle Time: 08-28-2023 11:12:06 

Channel-00 Cycles Completed: 4404 
Channel-00 Triggered Time: 08-29-2023 00:45:08 
Channel-01 Cycles Completed: 24 
Channel-01 Triggered Time: 08-28-2023 11:15:57'''

stra = re.sub("^(?!Captured.*$).*?(\d+-\d+-2023 )", "", Input)

Expected output:

Captured on: 08-29-2023 09:43:26
Start Time: 11:11:57
1St Cycle Time: 08-28-2023 11:12:06 

Channel-00 Cycles Completed: 4404 
Channel-00 Triggered Time: 00:45:08 
Channel-01 Cycles Completed: 24 
Channel-01 Triggered Time: 11:15:57
wjandrea
OO7
  • First of all, you need the `re.MULTILINE` flag so that `^` and `$` match the start/end of a line rather than the whole string. – Barmar Aug 30 '23 at 19:59
  • Why do you expect the prefix like `Start Time:` to be preserved? The regexp matches everything from the beginning of the line to the date, and then replaces all that with an empty string. – Barmar Aug 30 '23 at 20:00
  • What's the reason for putting the date pattern in a capture group? You never reference it in the replacement string. – Barmar Aug 30 '23 at 20:00
  • Don't forget to [use a raw string for regexp](https://stackoverflow.com/questions/12871066/what-exactly-is-a-raw-string-regex-and-how-can-you-use-it) – Barmar Aug 30 '23 at 20:04
  • Trying to do negative matches against multiple options (aside from "not character" `[^...]`) isn't going to work. Instead use positive matches. `((?:Start Time: )|(?:Triggered Time: ))(\d+-\d+-2023 )` (which, assuming you don't have other lines to match against, could be `(?:Time: )(\d+-\d+-2023 )`) – Ouroborus Aug 30 '23 at 20:06
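
To illustrate the first two comments, here is a minimal sketch using a shortened, hypothetical two-line version of the input (the names `text`, `no_flag` and `with_flag` are mine, not from the question). Like `Input` above, it starts with a newline. Without `re.MULTILINE` the pattern finds nothing to replace; with the flag it matches, but the line prefix is deleted along with the date because the whole match is replaced by an empty string.

```python
import re

# Shortened sample; starts with a newline, as the triple-quoted Input does.
text = "\nCaptured on: 08-29-2023 09:43:26\nStart Time: 08-28-2023 11:11:57"

# The pattern from the question, written as a raw string.
pattern = r"^(?!Captured.*$).*?(\d+-\d+-2023 )"

# Without re.MULTILINE, ^ only matches at the very start of the string,
# where the next character is a newline, so nothing is replaced.
no_flag = re.sub(pattern, "", text)

# With re.MULTILINE, ^ matches at each line start, but the prefix
# "Start Time: " is part of the match and is removed with the date.
with_flag = re.sub(pattern, "", text, flags=re.MULTILINE)

print(repr(no_flag))
print(repr(with_flag))
```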

3 Answers

1

Assuming you know how the lines you want to leave untouched begin, you can match using this regex (in Python, with the `re.MULTILINE` flag):

^((?:Captured|1St) .*)|\b\d\d-\d\d-2023\s*

and replace with:

\1


RegEx Details:

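  • ^: Start of a line (with the multiline flag)
  • ((?:Captured|1St) .*): Match a line starting with Captured or 1St followed by a space, then everything up to the end of the line, and capture it in group 1
  • |: OR
  • \b\d\d-\d\d-2023\s*: Match the date part and any whitespace after it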
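
A minimal Python sketch of this approach, assuming a shortened sample of the input without trailing spaces (the variable names are illustrative). It relies on `re.MULTILINE` so that `^` matches at each line start, and on the fact that in Python 3.5+ a backreference to a group that did not participate in the match is replaced with an empty string.

```python
import re

# Shortened sample of the input from the question.
text = (
    "Captured on: 08-29-2023 09:43:26\n"
    "Start Time: 08-28-2023 11:11:57\n"
    "1St Cycle Time: 08-28-2023 11:12:06\n"
    "Channel-00 Triggered Time: 08-29-2023 00:45:08"
)

# Either capture a whole line to keep (group 1), or match a bare date.
pattern = re.compile(r"^((?:Captured|1St) .*)|\b\d\d-\d\d-2023\s*", re.MULTILINE)

# Kept lines are replaced by themselves; a matched date is replaced by
# the empty group 1, which removes it.
result = pattern.sub(r"\1", text)
print(result)
```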
anubhava
1
print(re.sub(r'(Start Time:|Triggered Time:)\s+[0-9-]+(\s[0-9:]+)',r'\1\2', Input))

Captured on: 08-29-2023 09:43:26
Start Time: 11:11:57
1St Cycle Time: 08-28-2023 11:12:06 

Channel-00 Cycles Completed: 4404 
Channel-00 Triggered Time: 00:45:08 
Channel-01 Cycles Completed: 24 
Channel-01 Triggered Time: 11:15:57
LetzerWille
1

I have assumed the objective is to remove date strings having the format mm-dd-yyyy when that string is:

  • not in the first line of the string
  • followed by one or more spaces followed by a time in 24-hour format (e.g. 17:35:16), followed by zero or more spaces followed by a line terminator (\n or, for Windows, \r\n)
  • not followed by a blank line

Note that I have made no assumptions about the text that precedes the date strings to be removed, so the regular expression need not be changed if that text changes in the future.

Under the above assumptions the text matching the following regular expression can be replaced with the contents of capture group 1.

(?<!\A)^(.*) \d{2}-\d{2}-\d{4}(?= +\d{2}:\d{2}:\d{2} *\r?\n(?! *\r?\n))


The regular expression can be broken down as follows.

(?<!\A)             # negative lookbehind asserts current string position
                    # is not at the beginning of the string
^                   # match the beginning of a line
(.*)                # match zero or more characters other than line terminators,
                    # as many as possible, and save to capture group 1
[ ]                 # match a space
\d{2}-\d{2}-\d{4}   # match the string representing the date
(?=                 # begin a positive lookahead
  [ ]+              #   match one or more spaces
  \d{2}:\d{2}:\d{2} #   match the string representing the time
  [ ]*\r?\n         #   match zero or more spaces followed by a line terminator
  (?! *\r?\n)       #   negative lookahead asserts an empty line does not follow
)                   # end the positive lookahead

In the above I've enclosed most spaces in a character class ([ ]) merely to make them visible.
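
As a rough check, the expression can be tried in Python as sketched below. This assumes `re.MULTILINE` is needed so that `^` matches at line starts, and uses a shortened sample (variable names are illustrative) that does not begin with a newline and does end with one. The lookahead requires a line terminator after the time, so a final line without a trailing newline would be left unchanged.

```python
import re

# Shortened sample; note the blank line after the "1St" line and the
# trailing newline after the last line.
text = (
    "Captured on: 08-29-2023 09:43:26\n"
    "Start Time: 08-28-2023 11:11:57\n"
    "1St Cycle Time: 08-28-2023 11:12:06\n"
    "\n"
    "Channel-00 Triggered Time: 08-29-2023 00:45:08\n"
)

pattern = re.compile(
    r"(?<!\A)^(.*) \d{2}-\d{2}-\d{4}"
    r"(?= +\d{2}:\d{2}:\d{2} *\r?\n(?! *\r?\n))",
    re.MULTILINE,
)

# Replace "<prefix> <date>" with just the prefix held in capture group 1.
result = pattern.sub(r"\1", text)
print(result)
```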

Cary Swoveland