1

I have a log file with these contents:

Log Started
Created Date: YY/MM/DD Time: HH:MM:SS Start
Added resources at module on YY/MM/DD HH:MM
Module 2 execute
Resource depleted at HH:MM n pieces
Open YY/MM/DD HH:MM:SS Log to refer
Target end date of new resource YY/MM/DD approved
Log Ended. Result OK

Legend:

  • 'YY/MM/DD' & 'HH:MM:SS' - these are the unimportant timestamps
  • Everything else - this is the important data

Notes:

  • As you can see, the date and time can both be in the same line, located anywhere within the line, and the time can be HH:MM or HH:MM:SS.
  • Some lines have no date/time stamps at all.

I currently have the regex below, but it can only capture the date and time stamps on each line:

(\d{2}(\d{2})?\/\d{2}\/\d{2}(\d{2})?)|(\d{2}:\d{2}(:\d{2})?)

However, I need to capture the whole line and place each important and unimportant section of data in its own group.

slimjourney

2 Answers

1

I'm not sure how one would do this in C# (as I've never used it before), but here are the regexes I would use:

1. Get the whole line

(.*)\n

This simply matches everything up to the next newline character \n. Note that a final line with no trailing \n will not match.

2. Get the non-timestamps on every line

Copy the log contents into a temporary variable, remove all the timestamps from it, and then split the resulting string on the newline character \n.
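As a rough illustration of step 2, here is a minimal sketch in Python rather than C# (the language choice, the function name `strip_timestamps`, and the sample dates and times are assumptions made for the example; the pattern is the one from the question). It also collapses the double spaces left behind where a timestamp was removed, which is an extra step not described above.

```python
import re

# Timestamp pattern taken from the question:
# dates such as YY/MM/DD and times such as HH:MM or HH:MM:SS.
TIMESTAMP = re.compile(r"(\d{2}(\d{2})?/\d{2}/\d{2}(\d{2})?)|(\d{2}:\d{2}(:\d{2})?)")


def strip_timestamps(log_text):
    """Return the log's lines with every date/time stamp removed."""
    cleaned = TIMESTAMP.sub("", log_text)
    # splitlines() also handles a final line that has no trailing newline.
    # " ".join(line.split()) collapses the gaps left by removed stamps.
    return [" ".join(line.split()) for line in cleaned.splitlines()]


# Hypothetical sample lines modeled on the log in the question.
sample = (
    "Created Date: 17/05/31 Time: 04:04:00 Start\n"
    "Module 2 execute\n"
    "Resource depleted at 04:29 5 pieces\n"
)
print(strip_timestamps(sample))
```

In C#, the same idea should map onto Regex.Replace followed by String.Split.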

3. Get the unimportant timestamps

(\d{2}(\d{2})?\/\d{2}\/\d{2}(\d{2})?)|(\d{2}:\d{2}(:\d{2})?)

Your regex was quite good and worked like a charm :)
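If the goal is to keep every part of a line and label it as important or unimportant, one possible approach is to split each line on the timestamp pattern while keeping the timestamps themselves. The sketch below is in Python rather than C#, and it is only an assumption about what "place each section in a group" should look like: the function name `segment_line`, the labels "data" and "stamp", and the sample values are all made up for the example. The pattern is the question's regex with its inner groups made non-capturing and a single outer capturing group added.

```python
import re

# The question's pattern, wrapped in ONE capturing group so that
# re.split keeps each timestamp in its output.
TIMESTAMP = re.compile(
    r"(\d{2}(?:\d{2})?/\d{2}/\d{2}(?:\d{2})?|\d{2}:\d{2}(?::\d{2})?)"
)


def segment_line(line):
    """Split one log line into (kind, text) pairs in their original order."""
    parts = TIMESTAMP.split(line)
    segments = []
    # With a single capturing group, re.split alternates:
    # even indexes are ordinary text, odd indexes are timestamps.
    for index, text in enumerate(parts):
        if text == "":
            continue  # skip empty pieces at the line's edges
        kind = "stamp" if index % 2 == 1 else "data"
        segments.append((kind, text))
    return segments


for kind, text in segment_line("Open 17/05/31 04:04:00 Log to refer"):
    print(kind, repr(text))
```

A line without any timestamps simply comes back as a single "data" segment.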

Ethan
  • Also, do try regex101's code generator feature, it might be helpful for your case. – Ethan May 31 '17 at 04:04
  • Thanks a lot for the very concise suggestions, but how would I change the regex if the dashes are actual strings? I've only replaced the important data with dashes for illustration purposes, so in reality they are sentences, phrases, numbers, etc. – slimjourney May 31 '17 at 04:29
  • @slimjourney Oh, that would have been good to know, perhaps you can update your question with sample data and new requirements? Anyways, I'll see if I can find a better solution for that. – Ethan May 31 '17 at 04:31
  • Yeah, I'm sorry about that. I've updated the sample lines above accordingly. P.S. the sample data above are just placeholders since the log couldn't be shared, but hopefully it gives you the idea. – slimjourney May 31 '17 at 04:43
  • @slimjourney Ok, thanks for the sample data. I've updated my answer (just step 2), can you check again? P.S. you may need to update your question where you say: _'-' these are the important data_ – Ethan May 31 '17 at 05:03
  • Thanks a lot for the quick help, I'm now able to slowly incorporate your suggestions into my code. – slimjourney May 31 '17 at 05:30
1

First, capture each line using (.*)\n. Then apply the date/time regex to the captured string and replace the matches.

Check here. https://msdn.microsoft.com/en-us/library/e7f5w83z(v=vs.110).aspx

Senthil
  • Do you have an idea how I can replace the date/time stamp with a dummy string whose length depends on how long the captured group was? For example, if I capture HH:MM:SS I'd replace it with 8 "X's", while if I capture "HH:MM" I'd only replace it with 5 "X's". – slimjourney May 31 '17 at 05:14
  • @slimjourney This is not possible with pure regex, you'd need to do two different search and replaces, 1st replace HH:MM:SS with 8 "X's", 2nd replace HH:MM with 5 "X's". – Ethan May 31 '17 at 05:18
  • @David I see, I'll have to ask a different question about replacing a whole string with a single repeated character in C# then. Again, thanks a lot for the help. – slimjourney May 31 '17 at 05:26
  • OK @slimjourney no problem :) Did my answer help you? If so can you [accept it](https://stackoverflow.com/help/someone-answers)? – Ethan May 31 '17 at 05:28
  • my $s = "HH:MM"; $s =~s/(HH\:MM(\:SS)?)/'x' x length($1)/e; print $s."\n"; – Senthil Nov 12 '20 at 17:38
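The length-preserving replacement asked about in the comments can also be done in a single pass by computing the replacement in a callback, which is what the Perl one-liner does with its /e flag. Below is a minimal sketch of the same idea in Python rather than C# (the function name `mask_timestamps`, the `fill` parameter, and the sample values are assumptions made for the example). In C#, the Regex.Replace overload that takes a MatchEvaluator should allow the equivalent.

```python
import re

# The question's pattern with its groups made non-capturing.
TIMESTAMP = re.compile(
    r"\d{2}(?:\d{2})?/\d{2}/\d{2}(?:\d{2})?|\d{2}:\d{2}(?::\d{2})?"
)


def mask_timestamps(line, fill="X"):
    """Replace each timestamp with `fill` repeated to the same length."""
    # re.sub accepts a function; it is called once per match and its
    # return value becomes the replacement text for that match.
    return TIMESTAMP.sub(lambda match: fill * len(match.group(0)), line)


print(mask_timestamps("Resource depleted at 04:29 5 pieces"))
# Resource depleted at XXXXX 5 pieces
print(mask_timestamps("Open 17/05/31 04:04:00 Log to refer"))
# Open XXXXXXXX XXXXXXXX Log to refer
```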