I have a text file that contains quotes, commas, and spaces.

"'x','a b c'"
"'x','a b c','1','2 3'"
"'x','a b c','22'"
"'x','a b z'"
"'x','s d 2'"

However, when I use grep to pull out an exact match, it prints nothing. This is the command I'm using:

grep -E "\"\'x\'\,\'a\s\+b\s\+c\'\"" test.txt

Expected output: "'x','a b c'"

Am I missing anything? Any help would be really appreciated.

Venkat

1 Answer

You were close! A couple of notes:

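  • Don't use \s. It is a GNU extension and is not available everywhere. It's better to use a POSIX character class such as [[:space:]], or simply match a literal space.
  • The \+ may be misleading: in -E mode it matches a literal +, while without -E (in GNU grep) \+ means one or more of the preceding item. Which form needs the backslash depends on the mode you are using.
  • You don't need to escape everything! Inside " double quotes, escape only the double quotes ("\""). Don't escape single quotes and commas: the shell leaves "\'\," untouched, so grep receives the backslashes too, and a backslash before an ordinary character has no portable meaning in a regex (GNU grep, for example, treats \' as an end-of-line anchor rather than a literal quote).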
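
A quick way to check what grep actually receives is to let printf stand in for it. Note that inside double quotes the shell itself keeps a backslash that sits in front of ' or , so any interpretation of \' or \, is left to grep's regex engine. This is only a sketch; the variable names are made up for illustration.

```shell
# printf stands in for grep: it prints the argument exactly as the shell passes it.
over_escaped=$(printf '%s' "\"\'x\'\,\'a b c\'\"")
minimal=$(printf '%s' "\"'x','a b c'\"")

printf '%s\n' "$over_escaped"   # the backslashes before ' and , are still there
printf '%s\n' "$minimal"        # only the double quotes needed escaping
```

The second form is the one that reaches grep as the plain text you want to match.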

If you only mean to match spaces, with grep -E:

grep -E "\"'x','a +b +c'\""
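
As a quick check, this sketch recreates the sample lines from the question in a temporary file (standing in for test.txt; the variable names are illustrative) and runs the command above against it:

```shell
# Temporary file standing in for the question's test.txt.
sample=$(mktemp)
cat > "$sample" <<'EOF'
"'x','a b c'"
"'x','a b c','1','2 3'"
"'x','a b c','22'"
"'x','a b z'"
"'x','s d 2'"
EOF

# Only the double quotes are escaped; ' and , are literal inside "...".
result=$(grep -E "\"'x','a +b +c'\"" "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Only the first line matches, because the pattern requires the closing '" directly after the c.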

The same pattern works without -E; just write \+ instead of +:

grep "\"'x','a \+b \+c'\""
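
The two modes treat a backslashed plus in opposite ways, which is easy to confirm on a two-line input. The sketch below also shows a portable alternative for the non -E side: POSIX basic regular expressions do not define \+ (GNU grep accepts it as an extension), but the interval \{1,\} means the same thing. The variable names are illustrative.

```shell
# In -E mode, \+ is a literal plus sign: this matches "a +b" but not "a  b".
literal_plus=$(printf '%s\n' 'a +b' 'a  b' | grep -E 'a \+b')

# In -E mode, an unescaped + means "one or more of the preceding item".
one_or_more=$(printf '%s\n' 'a +b' 'a  b' | grep -E 'a +b')

# Without -E, \{1,\} is the POSIX spelling of "one or more".
interval=$(printf '%s\n' 'a +b' 'a  b' | grep 'a \{1,\}b')

printf '%s\n' "$literal_plus" "$one_or_more" "$interval"
```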

I like to put whatever precedes the + inside brackets; it helps me read the pattern:

grep "\"'x','a[ ]\+b[ ]\+c'\""
grep -E "\"'x','a[ ]+b[ ]+c'\""

If you want to match both spaces and tabs between the letters, you can splice a literal tab character into [ ] with $'\t':

grep "\"'x','a[ "$'\t'"]\+b[ "$'\t'"]\+c'\""
grep -E "\"'x','a[ "$'\t'"]+b[ "$'\t'"]+c'\""
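
Not every sh supports the $'\t' syntax. As a sketch for such shells, a tab can be captured once with printf and interpolated into the bracket expression; the variable name tab and the three-line input are just for illustration.

```shell
# Capture a literal tab; command substitution strips only trailing newlines.
tab=$(printf '\t')

# Three input lines: tab-separated, space-separated, and not separated at all.
matches=$(printf 'a\tb\na b\nab\n' | grep -E "a[ ${tab}]+b")
printf '%s\n' "$matches"

# Count the matching lines: the first two match, "ab" does not.
count=$(printf 'a\tb\na b\nab\n' | grep -cE "a[ ${tab}]+b")
```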

With grep -P (Perl-compatible regexes, where your grep supports them), that simply becomes:

grep -P "\"'x','a[ \t]+b[ \t]+c'\""

But the best option is to forget about \s and use the character class [[:space:]], which also matches tabs:

grep "\"'x','a[[:space:]]\+b[[:space:]]\+c'\""
grep -E "\"'x','a[[:space:]]+b[[:space:]]+c'\""
KamilCuk
  • Why do you suggest not to use `\s` if OP might in fact only use the regex with a GNU grep? It is not clear if OP has a GNU or any other grep, but I came to the conclusion that it makes no sense trying to come up with "generic" solutions for *NIX tools as there is always at least one environment where the solution does not work. If `\s` works for OP (same as for me) we can use it, why not? Besides, `grep -E "\"'x','a +b +c'\""` is not valid syntax in my Bash (`GNU bash, version 4.3.48(1)-release (x86_64-pc-linux-gnu)`). Do you have an idea why? – Wiktor Stribiżew Oct 30 '19 at 09:28
  • That is a valid point. I just tested on Alpine with busybox, and `\s` matches a space there. As for `"\"'x','a +b +c'\""`: I just tested on my old machine with bash 4.2.24(1) and GNU grep 2.13, and `grep -E "\"'x','a +b +c'\"" <<<"\"'x','a b c'\""` works. No idea then. – KamilCuk Oct 30 '19 at 10:42