
I am trying to take the transcript text from a YouTube video and parse it into a document for editing. I have been able to remove most of the HTML markup in it. However, I would also like to remove the lines shown below, each of which is the timestamp and offset already parsed into a single string.

I've tried this but I am no good at regexes:

/^\d{2}\:\d{8}\"\>$/gm

In Regex101 tester (https://regex101.com/r/a9wi2j/3/), it works but in EditPad replace, it does not.

What regex in EditPad would remove all the lines below that end with ">?

03:17197850">
that so if you have production staff you
03:21201780">
can create logins like that for them and
03:24204299">
then they have access to all the
03:25205739">
information and everything they need but
03:27207359">
they can't go in and adjust pricing on
03:30210060">
MB34

1 Answer


You need to remove the backslash before > because earlier versions of EditPad interpreted \> as a word boundary, and recent versions don't support this token.

You also need to enable the m flag, as you did in the demo you provided:

(?m)^\d{2}\:\d{8}\">$
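For anyone who wants to check the pattern outside EditPad, here is a minimal Python sketch of the same idea. It assumes Python's `re` module rather than EditPad's regex engine, so it only illustrates the pattern, not EditPad's behaviour. The function name `strip_timestamps` is made up for this example, and the trailing `\n?` is an addition so that the whole line is removed instead of leaving a blank one.

```python
import re

# Same shape as the pattern above: two digits, a colon, eight digits, then ">
# alone on a line. re.MULTILINE plays the role of the (?m) flag.
# The optional \n is an addition here so the line break is removed as well.
TIMESTAMP_LINE = re.compile(r'^\d{2}:\d{8}">\n?', re.MULTILINE)


def strip_timestamps(text: str) -> str:
    """Return text with the timestamp/offset lines deleted."""
    return TIMESTAMP_LINE.sub("", text)


# Sample lines taken from the question.
sample = (
    '03:17197850">\n'
    'that so if you have production staff you\n'
    '03:21201780">\n'
    'can create logins like that for them and\n'
)

print(strip_timestamps(sample))
```

Only the two caption lines should remain. Neither `:` nor `>` needs escaping in this engine, which is why the backslashes are dropped in the sketch.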
revo
  • `(?m)^\d*\:\d*\">$` actually works better because the seconds + offset part may be more than 8 digits. I know the minutes are always 2 digits, but using the * helps if the minutes go over 99. I don't know how the times and offsets look for videos > 60 mins. – MB34 Mar 14 '18 at 15:30
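A rough Python sketch of the variable-length idea from this comment, again assuming Python's `re` module and not EditPad. It uses `+` instead of `*`, which is a swapped-in choice: `\d*` would also match a line with no digits at all, such as `:">`, while `\d+` requires at least one digit on each side of the colon. The name `strip_loose` and the second timestamp in the sample are invented for illustration; the source does not show what long-video timestamps actually look like.

```python
import re

# Variable-length variant: one or more digits, a colon, one or more digits,
# then "> alone on a line. The optional \n removes the line break as well.
LOOSE_TIMESTAMP_LINE = re.compile(r'^\d+:\d+">\n?', re.MULTILINE)


def strip_loose(text: str) -> str:
    """Delete timestamp/offset lines regardless of how many digits they have."""
    return LOOSE_TIMESTAMP_LINE.sub("", text)


sample = (
    '03:17197850">\n'        # real line from the question
    'first caption\n'
    '101:6061000123">\n'     # hypothetical long-video line, not from the source
    'second caption\n'
)

print(strip_loose(sample))
```

Both timestamp lines are removed here, while a caption that merely contains digits or a colon is left alone because the pattern is anchored to the whole line.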