I'd like to extract hash values with sed from strings like this one:

[129.173.213.225:52196] some_text: another_text --> eb1d94daa7e0344597e756a1fb6e7054

The desired result is just the 32-character hash value, eb1d94daa7e0344597e756a1fb6e7054 in this example.

So I tried the following command (the idea is to remove everything except the matched pattern):

% sed 's/\([0-9a-f]{32}\)$/\1/g' < file

which results in an error:

sed: -e expression #1, char 20: invalid reference \1 on `s' command's RHS

Passing -r to sed didn't help; I get the same error.

I think the regexp itself is correct, because sed -r 's/[0-9a-f]{32}$/XXX/g' does replace the 32-character hash value with the string XXX. But I want to remove everything from the line except the hash value.

% cat /etc/issue
Ubuntu 16.04.6 LTS \n \l

%
% ls -la `which sed`
-rwxr-xr-x 1 root root 73424 Feb 11  2016 /bin/sed
%

What am I doing wrong?

Mark
  • even though the command wouldn't work, `sed 's/\([0-9a-f]{32}\)$/\1/g'` should not result in an error unless you have `sed` aliased to `sed -r` or `sed -E`... for default BRE, you need to use `\{\}` for quantifiers.. with ERE, both `()` and `{}` will work as metacharacters – Sundeep Oct 16 '19 at 14:58
  • for your problem, I'd rather use `grep -o '[0-9a-f]\{32\}$'` instead of sed – Sundeep Oct 16 '19 at 14:59
  • @Sundeep, thanks for feedback. Command `sed 's/\([0-9a-f]{32}\)$/\1/g'` didn't work for me, i.e. it doesn't fail but doesn't extract hash value, it prints the original line. – Mark Oct 16 '19 at 15:07
  • when you do `echo 'foo and bar' | sed 's/and/1/'`, only `and` is replaced and rest of the line remains as is.. so, if you want to remove all characters before the hex value, you need to tell sed to match them... `sed 's/.*\([0-9a-f]\{32\}\)$/\1/'` (also note the use of `\{\}` and that `g` is not needed as this can only match once) – Sundeep Oct 16 '19 at 15:10
  • @Sundeep, I see -- so we match on the whole input line, _but_ exclude part of the expression using "(" and ")" . – Mark Oct 16 '19 at 15:32
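
The points raised in the comments above can be put together as a minimal sketch. It assumes GNU sed and GNU grep, as shipped with Ubuntu 16.04; the sample line is the one from the question. The `-n` ... `p` variant is an addition that does not appear in the comments; it is included on the assumption that lines without a hash should be suppressed rather than printed unchanged.

```shell
line='[129.173.213.225:52196] some_text: another_text --> eb1d94daa7e0344597e756a1fb6e7054'

# BRE (sed default): groups and quantifiers need backslashes, \( \) and \{ \}.
# The leading .* makes the pattern cover the whole line, so everything
# except the captured group is replaced.
printf '%s\n' "$line" | sed 's/.*\([0-9a-f]\{32\}\)$/\1/'

# ERE (-E, or -r on GNU sed): ( ) and { } are metacharacters without backslashes.
printf '%s\n' "$line" | sed -E 's/.*([0-9a-f]{32})$/\1/'

# -n together with the p flag prints only lines where the substitution happened.
printf '%s\n' "$line" | sed -n 's/.*\([0-9a-f]\{32\}\)$/\1/p'

# grep -o prints only the matched part of each matching line.
printf '%s\n' "$line" | grep -o '[0-9a-f]\{32\}$'
```

Each of the four commands prints only the hash for the sample line. The original error is consistent with mixing the two syntaxes: under `-r`, `\(` and `\)` are literal parentheses, so no group exists for `\1` to refer to.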

2 Answers

You can try to substitute everything up to the --> part of the line with an empty string: i.e. substitute what matches .* -->.

So the command would be:

sed 's/.* -->//g' < file

Because .* is greedy, it matches as much of the line as it can. So even if some_text itself contains the arrow symbol, everything up to the last --> is removed.

The result of the example line follows:

«~» $ sed 's/.* -->//g' < testregex 
 eb1d94daa7e0344597e756a1fb6e7054
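
Note that the output above starts with a blank, because the space after the arrow is not part of the pattern. A small variation, sketched here under the assumption that the arrow is always followed by exactly one space, removes it as well. The input line with a second arrow is made up for illustration, to show the greedy match in action.

```shell
# Hypothetical input where some_text itself contains an arrow
line='[129.173.213.225:52196] a --> b: another_text --> eb1d94daa7e0344597e756a1fb6e7054'

# The greedy .* runs up to the last " --> "; including the trailing space
# in the pattern avoids the leading blank in the result. The g flag is
# dropped because the pattern can match only once per line.
printf '%s\n' "$line" | sed 's/.* --> //'
```

For this input the command prints just the hash, with no leading space.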
Naslausky

You can use awk for this as well

awk '/-->/ {print $NF}' file

It finds lines containing --> and prints the last whitespace-separated field of each.
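
If some lines contain an arrow but do not end in a hash, the last field can be checked before printing. This is only a sketch: `print_hashes` is a hypothetical helper name, and the second input line is invented for the example. The check uses `length` and a negated character class rather than a `{32}` interval, on the assumption that the installed awk might be one (such as older mawk) that lacks interval support.

```shell
# print_hashes is a hypothetical helper; it reads lines on stdin and prints
# the last field only when it is exactly 32 lowercase hex characters.
print_hashes() {
  awk '/-->/ && length($NF) == 32 && $NF !~ /[^0-9a-f]/ { print $NF }'
}

printf '%s\n' \
  '[129.173.213.225:52196] some_text: another_text --> eb1d94daa7e0344597e756a1fb6e7054' \
  '[129.173.213.225:52196] some_text --> not a hash' |
print_hashes
```

Only the hash from the first line is printed; the second line is skipped because its last field is not 32 hex characters long.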

Jotne