5

I am new to regular expressions, but I think people here may be able to give me valuable input. I am using the Logstash grok filter, in which I can supply only regular expressions.

I have a string like this

/app/webpf04/sns882A/snsdomain/logs/access.log

I want to use a regular expression to get the sns882A part from the string, which is the substring after the third "/". How can I do that?

I am restricted to regex as grok only accepts regex. Is it possible to use regex for this?

baudsp
flyasfish

5 Answers

6

Yes, you can use a regular expression to get what you want via grok:

/[^/]+/[^/]+/(?<field1>[^/]+)/
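To sanity-check this pattern outside Logstash, here is a minimal Python sketch. It assumes Python's `re` module, which spells named groups `(?P<name>...)` rather than the `(?<name>...)` form shown above, so the pattern is adapted accordingly. `PATH` is the sample string from the question and `field1` is the group name from this answer.

```python
import re

# Sample path from the question.
PATH = "/app/webpf04/sns882A/snsdomain/logs/access.log"

# Same pattern as above, adapted to Python's (?P<name>...) spelling
# of a named group (an assumption of this sketch, not grok syntax).
PATTERN = re.compile(r"/[^/]+/[^/]+/(?P<field1>[^/]+)/")

match = PATTERN.search(PATH)
field1 = match.group("field1") if match else None
print(field1)  # sns882A
```

The named group is what lets grok store the captured text under a field name instead of discarding it.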
CWoods
  • I know this answer is way too late, but +1 anyway for being the first *correct* answer. That is, a standalone regex (no other code and no delimiters) that uses named capture for the parts it's supposed to extract. – Alan Moore Mar 22 '14 at 05:12
2

For your regex:

    /\w*\/\w*\/(\w*)\/

You can also test it with http://www.regextester.com/

Googling "regex tester" will turn up other tools with different UIs.

junky
  • http://www.regextester.com/ gives me no match. I tried http://gskinner.com/RegExr/ and got no result there either... – flyasfish Nov 23 '12 at 05:34
  • This solution relies on directory and file names always consisting of alphanumeric characters or underscores. In particular, there may be no spaces anywhere in the path. – Borodin Nov 23 '12 at 05:39
  • The match index is 0-based. You can also see 1: (sns882A), which means it's the first match. – junky Nov 23 '12 at 05:53
  • When using /\w*\/\w*\/(\w*)\/ for the grok filter, I got a grok parse failure error, maybe because no match was found. – flyasfish Nov 23 '12 at 06:19
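Borodin's point about `\w` can be illustrated with a short Python sketch. The hyphenated path is a hypothetical input, `third_segment` is a helper invented for this example, and the anchored `[^/]+` pattern is shown only for comparison. This exercises the regexes in Python's `re` module and does not reproduce the grok failure reported above.

```python
import re

# Pattern from the answer, written without the escaped slashes.
WORD_PATTERN = re.compile(r"/\w*/\w*/(\w*)/")
# Anchored alternative that accepts any character except "/".
SEGMENT_PATTERN = re.compile(r"^/[^/]+/[^/]+/([^/]+)/")


def third_segment(pattern, path):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = pattern.search(path)
    return match.group(1) if match else None


plain = "/app/webpf04/sns882A/snsdomain/logs/access.log"
hyphen = "/app/web-pf04/sns882A/snsdomain/logs/access.log"  # hypothetical input

print(third_segment(WORD_PATTERN, plain))      # sns882A
print(third_segment(WORD_PATTERN, hyphen))     # logs
print(third_segment(SEGMENT_PATTERN, hyphen))  # sns882A
```

With the hyphen in the second directory name, the unanchored `\w` pattern cannot match at the start of the string, so it slides forward and captures a later segment instead.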
0

This is how I would do it in Perl:

my ($name) = ($fullname =~ m{^(?:/.*?){2}/(.*?)/});

EDIT: If your framework does not support Perl-style non-capturing groups (?:xyz), this regex should work instead:

^/.*?/.*?/(.*?)/

If you are concerned about the performance of .*?, this works as well:

^/[^/]+/[^/]+/([^/]+)/

One more note: all of the regexes above will match the string /app/webpf04/sns882A/.

But the matched string is not the same thing as the first capture group, which is sns882A in all three cases.
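The difference between the whole match and the first capture group can be seen in a small Python sketch. It assumes Python's `re` module, which accepts all three patterns unchanged; `PATH` is the sample string from the question.

```python
import re

PATH = "/app/webpf04/sns882A/snsdomain/logs/access.log"

# The three patterns from this answer, in order.
PATTERNS = [
    r"^(?:/.*?){2}/(.*?)/",
    r"^/.*?/.*?/(.*?)/",
    r"^/[^/]+/[^/]+/([^/]+)/",
]

results = []
for pattern in PATTERNS:
    match = re.search(pattern, PATH)
    # group(0) is the whole matched text; group(1) is the first capture group.
    results.append((match.group(0), match.group(1)))

for whole, captured in results:
    print(whole, "->", captured)  # /app/webpf04/sns882A/ -> sns882A
```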

mvp
0

If you are indeed using Perl then you should use the File::Spec module like this

use strict;
use warnings;

use File::Spec;

my $path = '/app/webpf04/sns882A/snsdomain/logs/access.log';
my @path = File::Spec->splitdir($path);

print $path[3], "\n";

output

sns882A
Borodin
  • I cannot use any programming language; this is part of the logstash-grok configuration, in which I can only supply expressions. – flyasfish Nov 23 '12 at 05:44
0

Same answer, but with a small bug fix. If you don't specify ^ at the start, the regex can go on to a later match (try longer paths with more / in the input). To fix it, just add ^ at the start, like this. ^ means the start of the input line. Finally, group 1 is your answer.

^/[^/]+/[^/]+/([^/]+)/

If you are dealing with URI paths, use the one below (it will handle paths as well as URIs).

^.*?/[^/]+/[^/]+/([^/]+)/
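A rough Python sketch of how anchoring changes the result. The relative path and the URL are hypothetical inputs, and `capture` is a helper made up for this example. Note that for a full URL, the host ends up counted as one of the skipped segments, so the captured part shifts.

```python
import re

UNANCHORED = re.compile(r"/[^/]+/[^/]+/([^/]+)/")
ANCHORED = re.compile(r"^/[^/]+/[^/]+/([^/]+)/")
PREFIXED = re.compile(r"^.*?/[^/]+/[^/]+/([^/]+)/")


def capture(pattern, text):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


path = "/app/webpf04/sns882A/snsdomain/logs/access.log"
relative = "app/webpf04/sns882A/snsdomain/logs/access.log"  # hypothetical: no leading slash
url = "http://example.com/app/webpf04/sns882A/access.log"   # hypothetical URL

print(capture(UNANCHORED, path))      # sns882A
print(capture(UNANCHORED, relative))  # snsdomain (matched further along)
print(capture(ANCHORED, relative))    # None (fails instead of drifting)
print(capture(PREFIXED, path))        # sns882A
print(capture(PREFIXED, url))         # webpf04 (host counted as a segment)
```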