3

Given

str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";

I'd like to get the line number of the first occurrence of $str in $sourceStr, which should be 3.

I don't know how to do it. I have tried:

awk 'match($0, v) { print NR; exit }' v=$str <<<$sourceStr
grep -n $str <<< $sourceStr | grep -Eo '^[^:]+';
grep -n $str <<< $sourceStr | cut -f1 -d: | sort -ug
grep -n $str <<< $sourceStr | awk -F: '{ print $1 }' | sort -u

All output 1, not 3.

How can I get the line number of $str in $sourceStr?

Thanks!

anubhava
Martin
  • 2
    You have created a variable that contains one line that includes the literal string `\n` multiple times. Did you intend `sourceStr` to be a multiline string? – William Pursell Feb 09 '21 at 17:17
  • 1
    Do you want a string comparison or a regexp comparison? – Ed Morton Feb 09 '21 at 18:22
  • Hi @WilliamPursell, I meant a multiline string. I thought Shell would interpret it as a multiline string. Thanks for pointing it out and giving answers. I learned something new today! – Martin Feb 10 '21 at 18:27
  • Hi @EdMorton, It doesn't matter how I achieve the goal. I just wanted to get the line number of a matching line in the source string. – Martin Feb 10 '21 at 18:28
  • Right, my question is - what is the goal? Is it to do a string comparison or a regexp comparison? For example if `v` is `a.c` should that ONLY match `a.c` in the input (as it would with a string comparison) or should it also match `abc` (as it would with a regexp comparison)? See [how-do-i-find-the-text-that-matches-a-pattern](https://stackoverflow.com/questions/65621325/how-do-i-find-the-text-that-matches-a-pattern) for more information on the different types of comparison. – Ed Morton Feb 10 '21 at 18:32
  • @EdMorton, Oh. I got your question now. My goal was to do an exact match. So it should be a string comparison as you said. (I understood a string comparison is a special case of regex comparison. So a string comparison could be achieved by a regex match. That's why I said it doesn't matter.) Thanks! – Martin Feb 10 '21 at 19:08
  • Ah, I see. No, making a regexp comparison behave as if it were a string comparison is non-trivial (e.g. see [is-it-possible-to-escape-regex-metacharacters-reliably-with-sed](https://stackoverflow.com/questions/29613304/is-it-possible-to-escape-regex-metacharacters-reliably-with-sed)) and pointless when you can just use a tool that supports string comparisons instead. – Ed Morton Feb 10 '21 at 21:04
  • Ah. I may need to take back what I have said after reading your most recent reply and the page you pointed me to. So it should be a regex comparison? For string comparison, I need `\\n` to match the literal `\n`. If I meant `\n` in my `$sourceStr` for **linefeed**, I should say, I need a regex match? – Martin Feb 11 '21 at 16:08
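
To make the string versus regexp distinction from the comments concrete, here is a minimal sketch. The two-line input and the pattern `a.c` are made up for illustration; `grep -F` requests a fixed-string comparison, while plain `grep` treats the pattern as a regular expression.

```shell
# Hypothetical sample input: line 1 is "abc", line 2 is "a.c".
pattern='a.c'

# Regexp comparison: "." matches any character, so "abc" on line 1 already matches.
regex_line=$(printf 'abc\na.c\n' | grep -n -m 1 -- "$pattern" | cut -d: -f1)

# String comparison (-F): only the literal text "a.c" matches, which is on line 2.
string_line=$(printf 'abc\na.c\n' | grep -nF -m 1 -- "$pattern" | cut -d: -f1)

echo "regexp: first match on line $regex_line"   # prints: regexp: first match on line 1
echo "string: first match on line $string_line"  # prints: string: first match on line 2
```

The newlines in the input are a separate matter from the comparison type: either kind of comparison works once the source really contains line breaks.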

4 Answers

4

You may use this awk + printf in bash:

awk -v s="$str" '$0 == s {print NR; exit}' <(printf "%b\n" "$sourceStr")

3

Or even this awk without any bash support:

awk -v s="$str" -v source="$sourceStr" 'BEGIN {
split(source, a); for (i=1; i in a; ++i) if (a[i] == s) {print i; exit}}'

3

You may use this sed as well:

sed -n "/^$str$/{=;q;}" <(printf "%b\n" "$sourceStr")

3

Or this grep + cut:

printf "%b\n" "$sourceStr" | grep -nxF -m 1 "$str" | cut -d: -f1

3
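
If you need this in more than one place, the grep variant can be wrapped in a small helper. This is only a sketch: the function name `first_line_of` is hypothetical, and it assumes the source string uses backslash escapes such as `\n` that `printf '%b'` should expand.

```shell
# first_line_of STR SOURCE
# Prints the number of the first line in SOURCE that equals STR exactly.
# SOURCE may contain backslash escapes such as \n; printf %b expands them.
first_line_of() {
    # -n: prefix line number, -x: whole-line match, -F: fixed string, -m 1: stop at first hit
    printf '%b\n' "$2" | grep -nxF -m 1 -- "$1" | cut -d: -f1
}

str="hij"
sourceStr="abc\nefg\nhij\nlmn\nhij"
first_line_of "$str" "$sourceStr"    # prints 3
```

When nothing matches, the function prints nothing, so the caller can test for an empty result.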
anubhava
  • 1
    Hi @anubhava, Yes. All of them work in Terminal.app on Mac. When I tried them in Keyboard Maestro using the Execute a Shell Script action, only the last one worked. I got an error report: syntax error near unexpected token `('. But this is OK. I only need one that works! Thanks for offering so many solutions to choose from! – Martin Feb 10 '21 at 18:41
  • `printf "%b\n" "$sourceStr" | awk -v s="$str" '$0 == s {print NR; exit}'` should also work – anubhava Feb 10 '21 at 18:53
  • 1
    Hi @anubhava, Done! Thanks! – Martin Feb 10 '21 at 19:04
1

It's not clear if you've just made a cut-n-paste error, but your sourceStr is not a multiline string (as demonstrated below). Also, you really need to quote your herestring (also demonstrated below). Perhaps you just want:

$ sourceStr="abc\nefg\nhij\nlmn\nhij"
$ echo "$sourceStr"
abc\nefg\nhij\nlmn\nhij
$ sourceStr=$'abc\nefg\nhij\nlmn\nhij'
$ echo "$sourceStr"
abc
efg
hij
lmn
hij
$ cat <<< $sourceStr 
abc efg hij lmn hij
$ cat <<< "$sourceStr" 
abc
efg
hij
lmn  
hij
$ str=hij
$ awk "/${str}/ {print NR; exit}" <<< "$sourceStr"
3
William Pursell
  • 3
    `/hij/` will find `hij` in `@hijkl` as well, `==` is safer – anubhava Feb 09 '21 at 17:25
  • @anubhava Agreed, but the OP uses `match($0, v)` so this is just emulating that. But I believe the key issue here is the assignment of the string, and not the awk. – William Pursell Feb 09 '21 at 17:27
  • 2
    Using double quotes around any script or string opens a can of worms for robustness and security. – Ed Morton Feb 09 '21 at 18:20
  • Hi @WilliamPursell, thanks for showing me this. I did not know about this at all. So both of these will output the same result: `sourceStr=$"abc\nefg\nhij\nlmn\nhij"; echo $sourceStr` and `sourceStr="abc\nefg\nhij\nlmn\nhij"; echo "$sourceStr"`. However, this will output `1`: `str=hij; sourceStr=$"abc\nefg\nhij\nlmn\nhij"; awk "/${str}/ {print NR; exit}" <<< "$sourceStr"`. This outputs the desired `3`: `str=hij; sourceStr=$'abc\nefg\nhij\nlmn\nhij'; awk "/${str}/ {print NR; exit}" <<< "$sourceStr"`. I changed the double quotes to single quotes in the sourceStr. Why? – Martin Feb 10 '21 at 18:47
  • Sorry. The format for coding is not good. Hitting Enter cannot change lines. I had to use "Shift + Enter" to make new lines. However, they are not reflected in the comment. – Martin Feb 10 '21 at 18:50
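
On the `$"..."` versus `$'...'` question in the comment above: in bash, `$'...'` is ANSI-C quoting, which expands `\n` to a real newline. `$"..."` requests locale translation and, when no translation is available, behaves like ordinary double quotes, so `\n` stays as two literal characters. A minimal sketch, assuming bash and no message catalog for the string; the helper name `count_lines` is hypothetical:

```shell
#!/usr/bin/env bash
# Plain double quotes: \n stays as backslash + n, so this is ONE line.
plain="abc\nefg\nhij\nlmn\nhij"
# $"...": locale translation; with no translation it acts like plain double quotes.
translated=$"abc\nefg\nhij\nlmn\nhij"
# $'...': ANSI-C quoting; \n becomes a real newline, so this is FIVE lines.
ansi_c=$'abc\nefg\nhij\nlmn\nhij'

count_lines() {
    # In the END block, NR holds the total number of input lines.
    awk 'END { print NR }' <<< "$1"
}

count_lines "$plain"        # prints 1
count_lines "$translated"   # prints 1
count_lines "$ansi_c"       # prints 5
```

With only one line of input, any matching awk or grep command can only ever report line 1, which is what the original attempts showed.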
1

Just use sed!

printf 'abc\nefg\nhij\nlmn\nhij\n' \
| sed -n '/hij/ { =; q; }'

Explanation: if sed meets a line that contains "hij" (regex /hij/), it prints the line number (the = command) and exits (the q command). Else it doesn't print anything (the -n switch) and goes on with the next line.
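
To see what the `q` command contributes, here is a small sketch that runs the same script with and without it. The variable names `first` and `all` are made up for illustration.

```shell
# With q: sed quits right after printing the first matching line number.
first=$(printf 'abc\nefg\nhij\nlmn\nhij\n' | sed -n '/hij/{=;q;}')

# Without q: sed keeps reading and prints the number of every matching line.
all=$(printf 'abc\nefg\nhij\nlmn\nhij\n' | sed -n '/hij/=')

echo "$first"   # prints 3
echo "$all"     # prints 3 and 5 on separate lines
```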


[update] Hmmm, sorry, I just noticed your "All output 1, not 3".

The primary reason why your commands don't output 3 is that sourceStr="abc\nefg\nhij\nlmn\nhij" doesn't automagically change your \n into new lines, so it ends up being one single line and that's why your commands always display 1.

If you want a multiline string, here are two solutions with bash:

  • printf -v sourceStr "abc\nefg\nhij\nlmn\nhij"
  • sourceStr=$'abc\nefg\nhij\nlmn\nhij'

And now that your variable contains whitespace characters (newlines), as stated by William Pursell, you must enclose $sourceStr in double quotes to preserve them:

grep -n "$str" <<< "$sourceStr" | ...
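
Putting the pieces together, a sketch under the assumption that you are running bash: it builds the string both ways, checks that the results are the same, and then completes the grep pipeline with `head` and `cut`. The variable names `fromPrintf`, `fromAnsiC` and `line` are hypothetical, and the `head | cut` tail is just one possible way to finish the pipeline.

```shell
#!/usr/bin/env bash
str="hij"

# Two ways to build the same five-line string.
printf -v fromPrintf "abc\nefg\nhij\nlmn\nhij"
fromAnsiC=$'abc\nefg\nhij\nlmn\nhij'
[ "$fromPrintf" = "$fromAnsiC" ] && echo "identical"

# grep -n prefixes each matching line with "N:"; keep only the first match's number.
line=$(grep -n -- "$str" <<< "$fromPrintf" | head -n 1 | cut -d: -f1)
echo "$line"   # prints 3
```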
xhienne
  • Hi @xhienne, Thanks for the detailed explanation. You point out the same issue in my code that @WilliamPursell pointed out. I tried your command in Terminal.app, but it is not working. It gives an error: `sed: 1: "/hij/ { =; q }": extra characters at the end of q command`. I don't see any extra characters, so I don't know what is going on. – Martin Feb 10 '21 at 19:02
  • Hmmm, sorry, I forgot the command separator `;`. Answer updated. – xhienne Feb 10 '21 at 19:04
  • Thanks! It's working now! Sorry, Stackoverflow seems to allow only one checked solution and I have given it to the first reply. But yours is also very helpful. I can only check it as useful. – Martin Feb 10 '21 at 19:14
  • No problem @Martin, the one you accepted is very good (detailed and educational) and I also upvoted it – xhienne Feb 10 '21 at 19:16
0

There's always a hard way to do it:

str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";
echo -e $sourceStr | nl | grep $str | head -1 | gawk '{ print $1 }'

or, a bit more efficient:

str="hij";
sourceStr="abc\nefg\nhij\nlmn\nhij";
echo -e $sourceStr | gawk '/'$str/'{ print NR; exit }'
Luuk
  • 2
    `'/'$str/'...'` is exposing `$str` to the shell for globbing, word splitting, etc. so it's fragile and has security issues. To do that safely would be `'/'"$str"/'...'` but that still has other issues so just never let a shell variable expand to become part of the text of an awk script like that, see https://stackoverflow.com/questions/19075671/how-do-i-use-shell-variables-in-an-awk-script for details. `echo -e $sourceStr` and `grep $str` should be `echo -e "$sourceStr"` and `grep "$str"` too btw - see https://mywiki.wooledge.org/Quotes for when it's safe/appropriate to remove quotes in shell – Ed Morton Feb 10 '21 at 18:36
  • Hi @Luuk, Thanks for your reply. It's strange that your code does not require double quotes to interpret the `\n` sign. I know little about the safety issue. However, since @EdMorton mentions it, I think it might be better not to vote your answer up before you respond. Thanks again! – Martin Feb 10 '21 at 18:57
  • 1
    The `\n` is interpreted because of the `-e` after echo. (and, yes, Ed is right!) – Luuk Feb 10 '21 at 18:58
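
As a sketch of the safer pattern described in the comments above, the shell variable can be handed to awk as data instead of being spliced into the script text. The function name `first_line_containing` and the environment variable `needle` are hypothetical. This version passes the value through the environment and uses `index()` for a literal substring test, so regex metacharacters and backslashes in the search string are not interpreted.

```shell
# first_line_containing STR   (reads lines from standard input)
# Prints the number of the first line that contains STR as a literal substring.
first_line_containing() {
    # The awk script is a fixed single-quoted string; STR arrives via ENVIRON,
    # so it never becomes part of the awk program and is not treated as a regexp.
    needle="$1" awk 'index($0, ENVIRON["needle"]) { print NR; exit }'
}

printf 'abc\nefg\nhij\nlmn\nhij\n' | first_line_containing 'hij'    # prints 3
printf '%s\n' 'abc' 'a.c' | first_line_containing 'a.c'             # prints 2
```

For a whole-line comparison instead of a substring test, the condition could be written as `$0 == ENVIRON["needle"]`.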