I have multiple files on a remote server. I can grep/sed/awk the files, but I have not been able to obtain context lines (before and after the match) for each file and then append them all to one output file.

With sed I managed to get the lines from the pattern to the end of the file, but not the lines before the pattern. With awk I got the line number of the match for each file, but ran into errors with find and -exec. I am a beginner in Linux and regex. What am I doing wrong?

First attempt:

sshpass -p password ssh user@server "find /data/ -name "*.txt" -type f  -exec ksh -c "grep -n $KEY $1 | cut -d':' -f1  | xargs -n1 -I% awk 'NR<=%+5 && NR>=%-5' $1 " ksh {} \; -print" > output.txt

It seems to work fine until the xargs command. I got this error: find: 0652-018 An expression term lacks a required parameter.

Second attempt:

sshpass -p password ssh user@server "find /data/ -name "*.txt" -type f -exec grep -n $KEY {} \; | cut -d':' -f1  | xargs -n1 -I% -exec awk 'NR<=%+5 && NR>=%-5' {} \; -print" > output.txt

I got:

awk: 0602-533 Cannot find or open file {}. The source line number is 1. awk: 0602-533 Cannot find or open file {}. The source line number is 1. awk: 0602-533 Cannot find or open file {}. The source line number is 1.

Third attempt:

sshpass -p password ssh user@server "find /data/ -name "*.txt" -type f  -exec sed -n '/$KEY/,$ p' {} \;" > output.txt

sed seems to work fine with simple words, and I can obtain the lines from the pattern to the end of each file. But I can't get expressions like "word1.*word2" (both words on the same line) to work.

$KEY is my variable with the pattern to match.

Benedix
    Your quoting is bad and unnecessarily complex. You could simplify it by putting the commands in a script and copying it to the host. Without access to a system with the expected files, it's hard to post an answer with adequate testing. – tripleee Jan 14 '22 at 18:41

1 Answer

Your second attempt failed because you appear to misunderstand what runs where. The -exec ends at \; and that's where the remote command ends, too; the rest of the pipeline runs on your local computer.
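
To illustrate the principle (hypothetical host, pattern and path, not your actual command): only what is inside the quoted string is executed on the server; anything after the closing quote is parsed and run by your local shell.

ssh user@server 'grep -n pattern /data/file.txt' | cut -d: -f1   # cut runs locally
ssh user@server 'grep -n pattern /data/file.txt | cut -d: -f1'   # cut runs on the server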

So in a way, the first attempt was closer, but the quoting there was wrong, and using grep to find line numbers just so you can pass them back to Awk is weird and inefficient; perhaps see also Counting lines or enumerating line numbers so I can loop over them - why is this an anti-pattern?

Because you want to use both double and single quotes, perhaps the least annoying solution is to pass the script to ssh in a here document. (You still can't nest double quotes, though, and if you need variable interpolation in the here document, you will need to escape any dollar signs or backticks which should not be evaluated by your local shell.) See also What is the cleanest way to ssh and run multiple commands in Bash?

ssh user@server <<___EOF >output.txt
    find /data/ -name "*.txt" -type f -exec \
      awk -v key="$KEY" '
        \$0 ~ key { for (i = (q > 5 ? q - 4 : 1); i <= q; ++i) print lines[i % 6]; q = 0; p = 6 }
        !p { lines[++q % 6] = \$0 }
        p && p--' {} \;
___EOF

If you can't use a here document for some reason, the alternative is unfortunately rather depressing.

ssh user@server \
    find /data/ -name '"*.txt"' -type f -exec \
      awk -v key="'$KEY'" '"\
        \$0 ~ key { for (i = (q > 5 ? q - 4 : 1); i <= q; ++i) print lines[i % 6]; q = 0; p = 6 }
        !p { lines[++q % 6] = \$0 }
        p && p--"' {} '\;' >output.txt

The weird double quoting is because ssh eats one level of quotes. We use one (outer) layer of quotes (single quotes where it makes sense, otherwise double) to protect expressions from the local shell, and another level (generally double) to still have quotes in the remote shell ... and then you still need to backslash the dollar sign in $0.
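
A tiny demonstration of the extra layer (hypothetical host; echo is just a stand-in): ssh joins its arguments with spaces into one string and hands it to the remote shell, which parses it again, so one level of quoting is consumed before the command ever runs.

ssh user@server echo '$HOME'    # remote shell expands it: prints the remote home directory
ssh user@server echo \$HOME     # same effect, escaping instead of quoting
ssh user@server echo "$HOME"    # local shell expands it first: prints your local home directory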

The Awk script keeps a memory of recent lines it has seen in the lines array so that it can recall and print them when it finds a match on $KEY. This is probably a duplicate of an existing question (and the duplicate is probably better tested than this one; I'm not in a place where I can properly check the corner cases); see e.g. How to print 5 lines before and after the match regex with awk command
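
If you want to check the buffering logic before wrapping it in find and ssh, a quick local run on a single file (sample.txt and the pattern here are just placeholders) needs no backslash escapes, because no here document or remote shell is eating the dollar signs:

awk -v key='word1.*word2' '
    $0 ~ key { for (i = (q > 5 ? q - 4 : 1); i <= q; ++i) print lines[i % 6]; q = 0; p = 6 }
    !p { lines[++q % 6] = $0 }
    p && p--' sample.txt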

If your find supports it, replacing {} \; with {} + will improve efficiency by passing multiple files to Awk in one go.
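
For example, the here document above could end in {} + instead of {} \; (untested sketch; the extra FNR == 1 rule resets the counters so that context from one file does not leak into the next when Awk is handed several files at once):

ssh user@server <<___EOF >output.txt
    find /data/ -name "*.txt" -type f -exec \
      awk -v key="$KEY" '
        FNR == 1 { q = p = 0 }
        \$0 ~ key { for (i = (q > 5 ? q - 4 : 1); i <= q; ++i) print lines[i % 6]; q = 0; p = 6 }
        !p { lines[++q % 6] = \$0 }
        p && p--' {} +
___EOF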

Incidentally, none of your shell scripts contain any syntax specific to ksh so (unless the sh in AIX is severely broken) you might replace those with sh.

tripleee
  • Hi, tripleee! I really appreciate your time in replying. About my second attempt: the problem was that I couldn't get the awk placeholder to work. I don't know how to handle more than one placeholder with find through ssh. Right now I'm a little confused, but I will check the links that you have sent me and read more about it. Do you know of a good document or web page to better understand grep, awk and sed for AIX? AIX is what I work with. I will also check your code. I know a little about what a here document and a limit string are. But, at first glance, should I still use sshpass? – Benedix Jan 16 '22 at 08:14
  • What I need is a script that someone can take and use to look up information (as simple as "script whatIsearch") in multiple files on a remote server, and append all context lines for each file into one output file. Thank you very much! – Benedix Jan 16 '22 at 08:15
  • Sounds like you have two new questions which should be posted as separate questions instead, perhaps with a link to this one. – tripleee Jan 16 '22 at 08:24
  • The first one seems to be about passing the same file multiple times to `find`, is that correct? (But why would you need that? Anyway, something like `find blah blah -exec sh -c 'awk "stuff" "$1" "$1"' sh {} \;` would work if your `find` doesn't simply let you put `{}` twice) and the `sshpass` one is unclear for other reasons - if `sshpass` prevents you from using a here document, I already provided an alternative with a different mechanism (but it doesn't read the password from standard input, does it?) – tripleee Jan 16 '22 at 08:24
  • I'm talking about multiple files. `find` takes multiple files, that is, I need to look up the pattern in all of these files. I will try with `sh -c`. First I want to get it to work by passing the password with sshpass; later the user should type it. – Benedix Jan 16 '22 at 08:38
  • Modern `find` lets you pass multiple files to `-exec` by using `+` instead of `\;` but I avoided that because AIX is probably too old to support this. – tripleee Jan 16 '22 at 08:51
  • I'm old enough to have experienced AIX (and SunOS, and whatever DEC's was called, and - shudder - HP-UX) before Linux became acceptable to the enterprise and upped the userland standards significantly. These days, forcing someone to use userland utilities which last received any love from a maintainer in the early 1990s is just cruel. – tripleee Jan 16 '22 at 08:57
  • A workaround is to use `find | xargs awk` but this runs into trouble with irregular file names (again, GNU `find` and `xargs` have significant improvements in this area); perhaps see http://mywiki.wooledge.org/BashFAQ/020 for a fuller treatment. – tripleee Jan 16 '22 at 09:08
  • In fact https://www.ibm.com/docs/en/aix/7.2?topic=f-find-command suggests that `-exec cmd {} +` is actually supported, though the feature set is otherwise predictably barren. – tripleee Jan 16 '22 at 09:18