2

I am looking for a regex that collects everything between the last colon (:) and the first space before it. So if I have the sentence "HUHweg: 4ihfw:euigr yalalala:" I want to collect "yalalala".

I came up with " (?<content>)/[^:]*$" but the content group is empty. I expected that a space, followed by the content up to the last : before the end ($), would work, but it does not. How can it be made to work?

user2609980

2 Answers

3

You're not far off. Try this (here I use ~ as the regex delimiter):

~ (?<content>[^ :]*):\s*$~

The capturing group has to have, well, something to capture. Also you need to forbid spaces in the content, or you might catch some words before the part you want.

The \s, a shorthand for any whitespace character (space, tab, newline...), handles the case where there is trailing whitespace after the last colon.

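As a minimal sketch, here is the pattern above run with Python's `re` module. Python is an assumption (the question does not name a language); its named groups are written `(?P<content>...)` rather than `(?<content>...)`, and no delimiters are needed. The helper `last_word_before_colon` and the `LOOSE` variant, which allows spaces inside the group, are hypothetical and included only to show why the space has to be excluded.

```python
import re

# The answer's pattern, in Python's named-group syntax.
PATTERN = re.compile(r" (?P<content>[^ :]*):\s*$")

# Hypothetical variant that allows spaces inside the group.
LOOSE = re.compile(r" (?P<content>[^:]*):\s*$")


def last_word_before_colon(text, pattern=PATTERN):
    """Return the 'content' group, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group("content") if match else None


sentence = "HUHweg: 4ihfw:euigr yalalala:"
log_line = "2014-04-13 01:32:36.246 Request ThisIsAMethod: "

print(last_word_before_colon(sentence))         # yalalala
print(last_word_before_colon(log_line))         # ThisIsAMethod
print(last_word_before_colon(log_line, LOOSE))  # Request ThisIsAMethod
```

With the space allowed inside the group, the match starts at an earlier space and drags "Request" along, which is the "words before the part you want" problem described above.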

Robin
  • Still returns nothing. If I remove the $ it returns everything before the first : until the space. – user2609980 Apr 17 '14 at 12:37
  • To be more precise: I have: `"2014-04-13 01:32:36.246 Request ThisIsAMethod: "` and I want to collect `ThisIsAMethod`. – user2609980 Apr 17 '14 at 12:38
  • @user2609980 Yep, your issue is the trailing whitespace... `$` matches the end of the string, so the regex indeed can't match. Is this your whole string, or are there one or more newlines after it? – Robin Apr 17 '14 at 12:40
  • Great! It was the trailing whitespace indeed. If I do `(?<content>[^ :]*) :$` it works. :-) Thanksss! – user2609980 Apr 17 '14 at 12:44
  • @user2609980: I edited to add a more general answer. Glad I could help! – Robin Apr 17 '14 at 12:49
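The trailing-whitespace problem from these comments can be reproduced with a short sketch. It assumes Python's `re` module (hence `(?P<content>...)` for the named group); the variable names are illustrative only.

```python
import re

log_line = "2014-04-13 01:32:36.246 Request ThisIsAMethod: "

# Without \s*, `$` must come right after the colon, so the trailing space blocks the match.
strict = re.search(r" (?P<content>[^ :]*):$", log_line)

# With \s*, any trailing whitespace is consumed before `$`.
tolerant = re.search(r" (?P<content>[^ :]*):\s*$", log_line)

# Alternative: strip trailing whitespace first, then use the strict pattern.
stripped = re.search(r" (?P<content>[^ :]*):$", log_line.rstrip())

print(strict)                     # None
print(tolerant.group("content"))  # ThisIsAMethod
print(stripped.group("content"))  # ThisIsAMethod
```

Either approach works; `\s*` keeps everything in the pattern, while stripping the input keeps the pattern simpler.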
0

PHP code (I think it can be easily converted to other languages)

<?php
$patterns=array('/(.|\s)*(\s)((.[^:])*)(\:)(?=[^\:]*)/');
$replacements=array(' $3 ');
$string='HUHweg: 4ihfw:euigr yalalala: ';

echo preg_replace($patterns,$replacements,$string);
?>

Result: yalalala

First comes any number of characters of any kind, including whitespace. Then comes exactly one whitespace character. Then our string, which is captured as $3. Next comes a colon, and then a lookahead that matches any characters other than a colon :)

user3162968
  • No, this is completely wrong. `(.|\s)` is the worst possible way to match any-character-including-newline (explained [here](http://stackoverflow.com/a/2408599/20938)); `(?=[^\:]*)` doesn't mean anything ("followed by zero or more of anything" is always true); and there's nothing in there that requires a space to be matched. – Alan Moore Apr 17 '14 at 12:40
  • I added the space that I was writing about. ?=[^\:]* is 0 or more characters other than space. I do not know any better way to match any character, including spaces and/or newlines, than (.|\s)*, but maybe you can advise something? :)) In my case (.|\n) has never worked as expected – user3162968 Apr 17 '14 at 13:41
  • **1.** `(?=[^:]*)` means it is possible to match zero or more characters other than colon; it will succeed even if the very next character is a colon. If you want to say *there are no more colons after this*, you have to anchor it: `(?=[^:]*$)`. **2.** The best way to match any-character-including-newline in most flavors is to use `.` and set the single-line/DOTALL flag. In PHP and .NET you can do that by adding `(?s)` to the beginning of the regex. **3.** `\s` matches any whitespace character; to match a space you should use a space. – Alan Moore Apr 17 '14 at 15:37
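The three points in the last comment can be checked with a small sketch. It assumes Python's `re` module, which also accepts the inline `(?s)` flag at the start of a pattern; the `(\w+):` prefix and the variable names are illustrative and not taken from the thread.

```python
import re

sentence = "HUHweg: 4ihfw:euigr yalalala:"

# 1. An unanchored lookahead always succeeds, so the first colon is accepted.
unanchored = re.search(r"(\w+):(?=[^:]*)", sentence).group(1)
# Anchoring it with $ demands that no colon follows, so only the last colon fits.
anchored = re.search(r"(\w+):(?=[^:]*$)", sentence).group(1)

# 2. `.` skips newlines unless the DOTALL flag, written inline as (?s), is set.
dot_default = re.search(r"a.b", "a\nb")
dot_all = re.search(r"(?s)a.b", "a\nb")

# 3. `\s` accepts a tab; a literal space in the pattern does not.
tab_with_class = re.search(r"a\sb", "a\tb")
tab_with_space = re.search(r"a b", "a\tb")

print(unanchored, anchored)                           # HUHweg yalalala
print(dot_default is None, dot_all is not None)       # True True
print(tab_with_class is not None, tab_with_space is None)  # True True
```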