
I would like to parse some syslog lines that look like this:

Oct 20 16:34:59 artguard TTN-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx

I would like to turn them into

TTN-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx

So I was wondering what the regular expression should look like to do this, since the first part changes every day: it is prepended by syslog.

EDIT: To avoid this being marked as a duplicate: I am trying to use the regex with Filebeat, where not all regex features are supported, as explained here

Cœur
ndarkness
  • if `TTN-` is always there just use that as your anchor in search. `TTN-.*$` – abc123 Oct 20 '16 at 15:07
  • Possible duplicate of [Learning Regular Expressions](http://stackoverflow.com/questions/4736/learning-regular-expressions) – Biffen Oct 20 '16 at 16:16
  • @abc123 then wouldn't the first part, the date plus time before the TTN-, affect the match? – ndarkness Oct 20 '16 at 18:53
  • @ndarkness you don't have to match a whole string with a regex. In the above I'm just searching for the first instance of `TTN-`, then any character any number of times (including 0) until the end of the line. – abc123 Oct 20 '16 at 18:56
  • @abc123 ok thanks! But then, the date will be included as well as part of the whole matched string, right? – ndarkness Oct 20 '16 at 19:00
  • @ndarkness, No since we are not searching for it in the regex, however you can also just gather specific things by using capture groups `()` for example `TTN-(.*$)` this will make it so that the capture group 1 contains everything after TTN-. It will not contain anything else. – abc123 Oct 20 '16 at 19:21
  • @abc123 thanks it does work perfectly, `TTN-.*$` – ndarkness Oct 21 '16 at 06:57
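
The point made in the comments above can be checked with a small sketch. It is written in JavaScript purely for illustration (not Filebeat configuration), and the variable names are made up for this example. It shows that an unanchored search starts the match at `TTN-`, so the date and host are not part of it, and that a capture group holds only what follows the prefix.

```javascript
// Sample line taken from the question.
const line = "Oct 20 16:34:59 artguard TTN-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx";

// Unanchored search: the match begins at "TTN-", so the timestamp
// and host name in front of it are skipped, not matched.
const whole = line.match(/TTN-.*$/);

// Same pattern with a capture group around the part after "TTN-".
const grouped = line.match(/TTN-(.*$)/);

const wholeMatch = whole[0];    // text from "TTN-" to the end of the line
const afterPrefix = grouped[1]; // group 1: only the text after "TTN-"

console.log(wholeMatch);
console.log(afterPrefix);
```

Whether group 1 or the whole match is the more useful value depends on whether the `TTN-` prefix should be kept in the output.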

2 Answers


The regular expression TTN-\S* is probably one way of doing what you're looking for. Here it is in a JavaScript example:

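var value = "Oct 20 16:34:59 artguard TTN-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
// "g" returns every match and "i" ignores case.
// The backslash is doubled because the pattern is written as a string literal.
var matches = value.match(new RegExp("TTN-\\S*", "gi"));
// Browser-only output; matches is an array of the matched strings.
document.writeln(matches);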

It works in two main parts:

  1. TTN- matches the literal text TTN-.
  2. \S* matches any character that is not white space, repeated as many times as possible (greedy).

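Currently the pattern always expects at least a '-' after TTN, but if you replace the '-' with '-{0,1}' (or '-?') in the regex, it will match TTN, optionally followed by a dash, followed by zero or more non-white-space characters. You could also replace \S* with \w* to match only letters, digits and underscores, or with .* to match every character except the end-of-line character \n. To end the match before a run of two spaces, a character class such as [^\s{2}] will not work, because quantifiers have no meaning inside brackets; something like TTN-\S*( \S+)* (non-space runs separated by single spaces) is one option. Hope this was helpful.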
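
To make the difference between these variants concrete, here is a minimal sketch that runs each of them against one line. The sample line is hypothetical (its payload contains a hyphen, a single space and a double space), and the `singleSpaces` pattern is an assumption of this sketch: one possible way to stop before a double space, not the only one. None of the patterns uses lookahead or backreferences.

```javascript
// Hypothetical line: the payload has a hyphen, one single space and one double space.
const sample = "Oct 20 16:34:59 artguard TTN-ab-c1 def  tail";

// Variants discussed above, keyed by illustrative names.
const patterns = {
  nonSpace: /TTN-\S*/,            // stops at the first white-space character
  wordChars: /TTN-\w*/,           // stops at the first char that is not a letter, digit or underscore
  restOfLine: /TTN-.*/,           // runs to the end of the line
  singleSpaces: /TTN-\S*( \S+)*/, // allows single spaces, stops before a double space
};

// Collect the full match of each pattern (null if it does not match).
const results = {};
for (const [name, re] of Object.entries(patterns)) {
  const m = sample.match(re);
  results[name] = m ? m[0] : null;
}

console.log(results);
```

Which variant fits depends on what can appear in the payload and on what marks its end in the real log lines.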

milo.farrell
  • thanks, the dash will always be present after the TTN string; however, the string xxxxxxxxxxxxxxxxxxxxxxxxx will certainly contain white space. Will that make the regex fail? – ndarkness Oct 20 '16 at 18:57
  • if my Filebeat regex doesn't support `\S`, then I cannot use your solution, right? – ndarkness Oct 20 '16 at 19:08
  • Yes. Is there something that signals the end of the xxxxxx, e.g. a double space, end of file, or end of line? For example, replace \S* with .* to match everything except the newline character, \n. – milo.farrell Oct 20 '16 at 19:15

Regex101

(TTN-.*$)


Explained

1st Capturing Group (TTN-.*$)
    TTN- matches the characters TTN- literally (case sensitive)
    .* matches any character (except for line terminators)
        * Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    $ asserts position at the end of a line
Global pattern flags
    g modifier: global. All matches (don't return after first match)
    m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
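
As a rough illustration of the two flags, the sketch below applies the pattern to a string holding several lines. It is JavaScript, the two log lines are hypothetical, and the variable names are made up for this example.

```javascript
// Two hypothetical syslog lines in one string, separated by "\n".
const log = [
  "Oct 20 16:34:59 artguard TTN-first payload",
  "Oct 21 08:00:01 artguard TTN-second payload",
].join("\n");

// With "m", $ matches at the end of every line; with "g", all matches are returned.
const perLine = log.match(/(TTN-.*$)/gm);

// Without "m", $ only matches at the very end of the string, and "." does not
// cross "\n", so only the last line can satisfy the pattern.
const lastOnly = log.match(/(TTN-.*$)/g);

console.log(perLine);
console.log(lastOnly);
```

When each log line is processed as its own string, the `m` flag makes no difference, since the end of the line is also the end of the string.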
abc123