I have a PDF file whose contents are formatted as follows:
00:12 There once lived a man...
00:18 who was thought to have...
The list goes on in the same pattern. I'm trying to write a program that uses regular expressions to read the file, remove all of the time stamps, and replace the line breaks with spaces. In other words, I want to make one big paragraph out of it.
This is what I came up with for the regular expression:
transcript.replace(transcript.matches("^[0-9:]+$"),"")
and that should get rid of any numbers and colons, meaning the time stamps. Now I'm not sure how to replace the line breaks. Would I do something like this?
transcript.replace(transcript.matches("^[\n]+$"), " ")
Any help would be appreciated. Thanks!
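
For reference, here is a minimal sketch of one way this could work in Java, assuming the PDF text has already been extracted into a String (the snippets above do not cover that step). Two things in the attempts above would stop them from working: String.matches returns a boolean and tests the whole string, and String.replace treats its arguments as literal text rather than as a regex. The class name TranscriptCleaner and the method name toParagraph are made up for illustration, and the time stamp pattern assumes every stamp sits at the start of a line in mm:ss or hh:mm:ss form.

```java
import java.util.regex.Pattern;

public class TranscriptCleaner {
    // Assumed format: a mm:ss or hh:mm:ss stamp at the start of a line,
    // followed by optional spaces or tabs. MULTILINE makes ^ match at the
    // start of every line, not only at the start of the whole string.
    private static final Pattern TIME_STAMP =
            Pattern.compile("^\\d{1,2}(?::\\d{2}){1,2}[ \\t]*", Pattern.MULTILINE);

    // \R (Java 8+) matches any line break sequence: \n, \r\n, \r, and so on.
    private static final Pattern LINE_BREAKS = Pattern.compile("\\R+");

    // Hypothetical helper: strips the stamps, then joins the lines with spaces.
    public static String toParagraph(String transcript) {
        String withoutStamps = TIME_STAMP.matcher(transcript).replaceAll("");
        return LINE_BREAKS.matcher(withoutStamps).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String transcript =
                "00:12 There once lived a man...\n00:18 who was thought to have...";
        System.out.println(toParagraph(transcript));
    }
}
```

Anchoring the pattern to the start of each line, instead of removing every digit and colon, leaves numbers that appear inside the sentences untouched. String.replaceAll with an inline (?m) flag would likely work as well; precompiled patterns are used here only to avoid recompiling the regex on every call.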