How do I keep the first n lines of a file/command, but grep the rest?

Question

Easiest to give an example.

bash-$ psql -c 'select relname, reltype from pg_catalog.pg_class limit 5;


        relname         | reltype
------------------------+---------
 bme_reltag_02          |       0
 bme_reltag_type1_type2 |       0
 bme_reltag_10          |       0
 bme_reltag_11          |       0
 bme_reltag_cvalue3     |       0   what I care about

But what I am really interested in is anything with cvalue in it. Rather than modifying each query by hand (yes, I know I could do it), I can egrep what I care about.

psql -c 'select relname, reltype from pg_catalog.pg_class limit 5;' | egrep 'cvalue'

but that strips out the first two lines with the column headers.

 bme_reltag_cvalue3     |       0

I know I can also do this:

psql -c 'select relname, reltype from pg_catalog.pg_class limit 5;' | head -2 && psql -c 'select relname, reltype from pg_catalog.pg_class limit 5;' | egrep 'cvalue'

        relname         | reltype
------------------------+---------
 bme_reltag_cvalue3     |       0

but what I really want to do is to keep the head (or tail) of some lines one way and then process the rest another.

My particular use case here is grepping the contents of arbitrary psql selects, but I'm curious as to what bash capabilities are in this domain.

I've done this before by writing to a temp file and then processing the temp file in multiple steps, but that's not what I am looking for.

Might be an ignorant question, but can you `grep` first and then pipe that to `psql`? — John Dibling, May 21 '20 at 23:20
I think whatever you are doing is sufficiently complex that you should move it out of a shell script yesterday. rewrite in python (https://google.github.io/styleguide/shellguide.html#s1.2-when-to-use-shell). — Z4-tier, May 21 '20 at 23:26
@Z4-tier this is a question about what bash allows doing, not about whether or not I should write Python. txs. actually, the funny thing is that I am using this to assist writing Python scripts by looking at some of the data in a database from the command line. — JL Peyret, May 21 '20 at 23:29
@JLPeyret which is why i posted as a comment and not an answer :) Maybe you can add this as a function call in the python script? — Z4-tier, May 21 '20 at 23:32
@Z4-tier fair enough, but really my question is limited to what bash can do here by itself. the only thing I am aware of that vaguely plays in this area is `tee`. I suspect that it is not possible to treat some rows one way and some another without an intermediate temp file, but I have been surprised by what bash can do before. — JL Peyret, May 21 '20 at 23:36
If you really want to do it this way, why is it a problem to make the database do the filtering? Just add `WHERE relname LIKE "%cvalue%"` to the query? — Z4-tier, May 21 '20 at 23:41

Jetchisel · Answer 1 · 2020-05-22T00:39:19.210

A while read loop and grep, if that is acceptable.

#!/usr/bin/env bash

while IFS= read -r lines; do
  [[ $lines == [12]* ]] && echo "${lines#*:}"
  [[ $lines == *cvalue[0-9]* ]] && echo "${lines#*:}"
done < <(psql -c 'select relname, reltype from pg_catalog.pg_class limit 5;' | grep -n .)

Without the grep an alternative is a counter to know the line number, which will be a pure bash solution.

#!/usr/bin/env bash

counter=1
while IFS= read -r lines; do
  [[ $counter == [12] ]] && echo "$lines"
  [[ $lines == *cvalue[0-9]* ]] && echo "$lines"
  ((counter++))
done < <(psql -c 'select relname, reltype from pg_catalog.pg_class limit 5;')

If bash4+ is available.

#!/usr/bin/env bash

mapfile -t files < <(psql -c 'select relname, reltype from pg_catalog.pg_class limit 5;')

printf '%s\n' "${files[0]}" "${files[1]}"

unset 'files[0]'  'files[1]'

for file in "${files[@]}"; do
  [[ $file == *cvalue[0-9]* ]] && echo "$file"
done

By default the builtin read strips the leading and trailing white spaces, so in this case we don't want that, so we use IFS=
grep -n . adds the line number with a :
[12] is a glob not regex which means either 1 or 2 and the glob * will match if it is the first character of the line.
*cvalue[0-9]* will match cvalue and any amount of int/digit next to it.
"${lines#*:}" is a parameter expansion that strips the leading :
<( ) is called process substitution.

score 3 · Answer 2 · answered May 22 '20 at 01:03

3

$ psql -c ... | awk 'NR<3 || /cvalue/' file

answered May 22 '20 at 01:03

karakfa

66,216
7
41
56

nice. should I read this as *keep (print) anything in the first 3 lines `or` matching *? – JL Peyret May 22 '20 at 01:21
print first two lines or matching pattern. First two lines will be printed regardless of the match. – karakfa May 22 '20 at 03:40

SiegeX · Accepted Answer · 2020-05-22T00:49:01.430

2

This can be done with sed using its range feature to only operate on lines 3 and beyond

sed '3,${/cvalue/!{d;};}'

Proof of Concept

$ cat ./psql
relname                | reltype
------------------------+---------
bme_reltag_02          |       0
bme_reltag_type1_type2 |       0
bme_reltag_10          |       0
bme_reltag_11          |       0
bme_reltag_cvalue3     |       0

$ sed '3,${/cvalue/!{d;};}' ./psql
relname                | reltype
------------------------+---------
bme_reltag_cvalue3     |       0

Explanation

3,${...;}: Start processing from line 3 until the end of file $
/cvalue/!{d;}: Delete d any line that does not match (!) the regex /cvalue/

edited May 22 '20 at 00:49

answered May 22 '20 at 00:28

SiegeX

135,741
24
144
154

this looks verrry close. but can it work in a pipe? when I `cat psql | sed '3,${/cvalue/!{d}}/'` I get *sed: 1: "3,${/cvalue/!{d}}": extra characters at the end of d command&*. In fact, I also get it using your exact syntax, might be a macox vs Linux thing (the reason I used `egrep` rather than `grep` in my example). – JL Peyret May 22 '20 at 00:42
Just a note that this not seem to work with BSD sed, at least on this side. – Jetchisel May 22 '20 at 00:44
looks promising. maybe finally a reason to look at sed, for which I generally follow Z4-tier and just used Python instead. – JL Peyret May 22 '20 at 00:45
@JLPeyret Does this work on OSX/BSD sed? `sed '3,${/cvalue/!{d;};}'` Note the extra semi-colons `;` – SiegeX May 22 '20 at 00:45
@Jetchisel yeah, but it's not the pipe issue after all, it's that sed OS difference most likely. – JL Peyret May 22 '20 at 00:46
@JLPeyret, for a quick text processing i'd go with `sed` or `awk` instead of writing a python or bash script. – Jetchisel May 22 '20 at 00:48
@SiegeX yup, updated command works. `sed` is out of the dog house and back into my toolkit. can you update your answer with a footer re. alternative bsd syntax? no need to do it on the main, just at the end is fine. – JL Peyret May 22 '20 at 00:48
@JLPeyret Awesome, good to know. Thanks to this [Stack Exchage](https://unix.stackexchange.com/questions/13711/differences-between-sed-on-mac-osx-and-other-standard-sed) post on the diff between GNU and BSD `sed` I noticed it said that GNU `sed` allows `;` to be omitted before a `}` whereas BSD `sed` does not. Will remember that going forward – SiegeX May 22 '20 at 00:51
@JLPeyret I updated the main, it works with GNU sed too, it's just extra syntax. Oddly enough adding `--posix` as an option to GNU `sed` does ***not*** catch this. It still gives me the same output. – SiegeX May 22 '20 at 00:53
@Jetchisel I've been using https://github.com/chmln/sd a bit. well, to tell you the truth, `awk` and `sed` are bit over my paygrade when it comes to unix/bash skills. at that time, it's tempting to just Python away. I was very Python for this type of tasks on Windows, but have slowly learned to rely more on pure bash on Linux/macos. this fits well in bash because I can nest it in a function. – JL Peyret May 22 '20 at 00:54
@JLPeyret sure use what is available for you , but in the end `sed` and `awk` are some of the default utilities on a Unix like environment and it's inevitable :-) – Jetchisel May 22 '20 at 00:58
@JLPeyret I would most definitely spend some brain cells to learn `awk`, that is my go-to. The only time I use `sed` is because it can be more terse than `awk` at times (like here.) Even so, your problem can be solved with this slightly longer `awk` command: `awk 'NR<3{print;next} /cvalue/{print}'` and you can actually remove the last `{print}` after `/cvalue/` because printing is `awk`'s default action (still, `sed` is slightly shorter.) – SiegeX May 22 '20 at 01:02
@SiegeX, I think `awk` solution is shorter. – karakfa May 22 '20 at 03:42

score 0 · Answer 4 · answered May 21 '20 at 23:26

0

You can use bash.. tail.and head commands

cat file.sql | head -n 15 > head.sql

Replace the 15 with the number of lines

Or replace head with tail... for the bottom of the file

answered May 21 '20 at 23:26

Danny Ebbers

919
5
11

+1. yes, agree. I can put the output in a temp file and then proceed in separate steps. I've edited the question to reflect that this is not being considered., – JL Peyret May 21 '20 at 23:32
Not sure if this helps.. but you can chain multiple pipes – Danny Ebbers May 21 '20 at 23:34
`echo foo | grep xxx | grep yyy`? yes, but that wouldn't work here. – JL Peyret May 21 '20 at 23:37
See this one https://stackoverflow.com/questions/8624669/unix-head-and-tail-of-file various ways of nesting/chaining without an extra file – Danny Ebbers May 21 '20 at 23:38
no need to chain pipes like this, a single `grep` can take multiple patterns: `grep -e [pattern 1] -e [pattern 2] ... [file]` – Z4-tier May 21 '20 at 23:43

How do I keep the first n lines of a file/command, but grep the rest?

4 Answers4

Proof of Concept

Explanation