GNU sed - delete lines between first X and last Y lines

Question

The goal is to shorten a large text:
delete everything between the first X lines and the last Y lines,
and maybe insert a line like "file truncated to XY lines..." in the middle.
I played around and achieved this with weird redirections (see "Pipe output to two different commands"), subshells, tee, and multiple sed invocations, and I wonder if

sed -e '10q'

and

sed -e :a -e '$q;N;11,$D;ba'

can be simplified by merging both into a single sed call.

Thanks in advance.

Mark Setchell · Answer 1 · 2014-02-21T12:17:35.120

2

Use head and tail:

(head -n "$X" infile; echo Truncated; tail -n "$Y" infile) > outfile

Or awk:

awk -v x="$x" -v y="$y" '{a[++i]=$0}END{for(j=1;j<=x;j++)print a[j];print "Truncated"; for(j=i-y+1;j<=i;j++)print a[j]}' yourfile
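The awk one-liner above holds every input line in memory. As a rough sketch of a streaming variant (the function name `truncate_middle_awk` is made up for illustration, and it assumes Y is at least 1), the first X lines can be printed immediately while only the most recent Y lines are kept in a circular buffer:

```shell
# Hypothetical helper: keep the first $1 and the last $2 lines of stdin.
# Assumes $2 >= 1 (the buffer index is computed modulo y).
truncate_middle_awk() {
    awk -v x="$1" -v y="$2" '
        NR <= x { print; next }            # head: print the first x lines right away
        { buf[NR % y] = $0 }               # tail: remember only the most recent y lines
        END {
            if (NR > x + y) print "Truncated"
            start = NR - y + 1             # line number where the tail begins
            if (start <= x) start = x + 1  # never repeat a line already printed
            for (j = start; j <= NR; j++) print buf[j % y]
        }
    '
}
```

For example, `seq 30 | truncate_middle_awk 5 8` prints lines 1 to 5, the marker, then lines 23 to 30. Input with no more than X+Y lines passes through unchanged and without the marker.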

Or you can use tee like this with process substitution if, as you say, input is coming from a pipe:

yourcommand | tee >(head -n "$x" > p1) | tail -n "$y" > p2 ; cat p[12]
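If the ordering of the two branches is a concern, one conservative option is to spool the piped input to a temporary file first and then run head and tail on that file in sequence. This is only a sketch: the function name `truncate_via_tmpfile` is invented for illustration, and it assumes `mktemp` is available.

```shell
# Hypothetical helper: spool stdin to a temp file, then print the first $1
# lines, a marker, and the last $2 lines, strictly in that order.
truncate_via_tmpfile() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"            # consume all of stdin so it can be read twice
    head -n "$1" "$tmp"
    echo "Truncated"
    tail -n "$2" "$tmp"
    rm -f "$tmp"
}
```

Usage would look like `yourcommand | truncate_via_tmpfile 5 8`. As with the plain head/tail version, lines are repeated if the input has fewer than X+Y lines.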

edited Feb 21 '14 at 12:17

answered Feb 21 '14 at 09:06

Mark Setchell


Thanks for your answer. Unfortunately, "head" in the first method consumes the stream when reading from stdin (I need to pipe), and the sed and awk methods show lines 1 to x and lines y to $, not the last y lines as mentioned in the question. – Nico Rittner Feb 21 '14 at 09:46
Apologies, I hadn't appreciated from the description that it was receiving input from a pipe. – Mark Setchell Feb 21 '14 at 09:50
And the tee part too. – Mark Setchell Feb 21 '14 at 12:17

glenn jackman · Answer 2 · 2014-02-21T15:58:15.987

1

You can do it through a magical incantation of tee, process substitutions, and stdio redirections:

x=5 y=8
seq 20 | { 
    tee >(tail -n $y >&2) \
        >({ head -n $x; echo "..."; } >&2) >/dev/null 
} 2>&1

This version is more sequential and the output should be consistent:

x=5 y=8
seq 20 | {
    { 
        # read and print the first X lines to stderr
        while ((x-- > 0)); do 
            IFS= read -r line || break   # stop early if the input has fewer than X lines
            printf '%s\n' "$line"
        done >&2
        echo "..." >&2  
        # send the rest of the stream on stdout
        cat - 
    } |
    # print the last Y lines to stderr, other lines will be discarded
    tail -n $y >&2
} 2>&1

edited Feb 21 '14 at 15:58

answered Feb 21 '14 at 13:44

glenn jackman


Glenn: I was heading in that direction with my 3rd answer, but was unsure how to guarantee the ordering of outputs from the "head" and "tail". You appear to have merged stdout and stderr to achieve that, but what subtlety that I am missing guarantees the order? – Mark Setchell Feb 21 '14 at 14:00
I don't know if you can guarantee it. Maybe `>(sleep 1; tail -n $y)` would be long enough that the "head" branch would be sure to finish first. The safest way would be to write to a file. Or maybe a fifo would be better... – glenn jackman Feb 21 '14 at 15:08
Cool - I knew there was a reason I went with files :-) – Mark Setchell Feb 21 '14 at 15:41

score 1 · Answer 3 · answered May 19 '15 at 01:15


You can also use sed -u 5q (with GNU sed) as an unbuffered alternative to head -n5:

$ seq 99|(sed -u 5q;echo ...;tail -n5)
1
2
3
4
5
...
95
96
97
98
99
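The same idea with X and Y as parameters might look like the sketch below. The function name `truncate_stream` is hypothetical, and it relies on GNU sed's `-u` option so that sed does not read ahead past the lines it prints.

```shell
# Hypothetical wrapper around the one-liner above; needs GNU sed for -u.
# Assumes $1 >= 1.
truncate_stream() {
    sed -u "${1}q"     # print the first $1 lines, then quit without reading ahead
    echo "..."
    tail -n "$2"       # print the last $2 lines of whatever input is left
}
```

Because all three commands read from the same stream, tail only sees what sed left behind. With short input, such as `seq 7 | truncate_stream 5 5`, no line is printed twice: the output is 1 to 5, the marker, then 6 and 7.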

answered May 19 '15 at 01:15

nisetama


Thor · Answer 4 · 2014-03-04T15:13:14.263

Here is a sed alternative that does not require knowledge of file length.

You can insert a modified "head" expression into the sliding loop of your "tail" expression. E.g.:

sed ':a; 10s/$/\n...File truncated.../p; $q; N; 11,$D; ba'

Note that if the ranges overlap there will be duplicate lines in the output.

Example:

seq 30 | sed ':a; 10s/$/\n...File truncated.../p; $q; N; 11,$D; ba'

Output:

1
2
3
4
5
6
7
8
9
10
...File truncated...
20
21
22
23
24
25
26
27
28
29
30

Here is a commented multi-line version to explain what is going on:

:a                                   # loop label
10s/$/\n...File truncated.../p       # on line 10, append the marker to the pattern space and print it
$q                                   # quit here when on the last line
N                                    # read next line into pattern space
11,$D                                # from line 11 to end, delete the first line of pattern space
ba                                   # goto :a
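To see the overlap mentioned above, the one-liner can be run on an input shorter than 21 lines. The wrapper name `sed_truncate` below is invented for illustration, and the script assumes GNU sed (it uses `\n` in the replacement).

```shell
# Thor's one-liner wrapped in a hypothetical function; assumes GNU sed.
sed_truncate() {
    sed ':a; 10s/$/\n...File truncated.../p; $q; N; 11,$D; ba'
}

# Twelve input lines: the head (1-10) and the sliding tail window overlap.
seq 12 | sed_truncate
```

With twelve input lines, lines 1 to 10 and the marker are printed first, and the final window then prints lines 3 to 10, the marker again, and lines 11 and 12. Input of ten lines or fewer passes through unchanged.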

Thanks a lot, works like a charm. The overlapping is only a cosmetic issue. I will investigate and try to understand this sed syntax; I have never used ":a", "N;" etc. with sed before. – Nico Rittner, Feb 21 '14 at 11:04
@layer23: you're welcome, I have added a commented version by way of explanation. — Thor, Feb 21 '14 at 11:22

score 0 · Answer 5 · answered Feb 21 '14 at 09:44


If you know the length of the file:

FileLen=$(wc -l < YourFile)   # one way to get the length, if not already known
EndStart=$(( FileLen - Y + 1 ))
sed -n "1,${X} p
${X} a\\
 --- Truncated part ---
${EndStart},$ p" YourFile

answered Feb 21 '14 at 09:44

NeronLeVelu


score 0 · Answer 6 · answered Feb 21 '14 at 19:52


This might work for you (GNU sed):

sed '1,5b;:a;N;s/\n/&/8;Ta;$!D;s/[^\n]*\n//;i\*** truncated file ***' file

Here X=5 and Y=8.

N.B. This leaves short files unadulterated.
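As I read the script: `1,5b` prints the first five lines untouched; `N`, `s/\n/&/8` and `Ta` keep appending lines until the pattern space holds nine of them (eight newlines); `$!D` then drops the oldest line on every line but the last, so the window slides; on the last line, `s/[^\n]*\n//` drops one more line so that eight remain and `i\` emits the marker before they are printed. A small demonstration, with the one-liner wrapped in a function whose name is invented for illustration:

```shell
# Hypothetical wrapper around the one-liner above (X=5, Y=8); needs GNU sed.
truncate_keep_5_8() {
    sed '1,5b;:a;N;s/\n/&/8;Ta;$!D;s/[^\n]*\n//;i\*** truncated file ***'
}

seq 30 | truncate_keep_5_8   # long input: lines 1-5, the marker, then lines 23-30
seq 10 | truncate_keep_5_8   # short input: all ten lines, no marker
```

On short input, `N` runs out of lines before the window fills, and GNU sed then prints what it has collected and exits, which is why no marker appears.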

answered Feb 21 '14 at 19:52

potong


Thanks a lot for that, an enhanced version of Thor's answer. One small request: what would the modification be if the counting of X started after the first blank line? This would be handy for cutting plain-text RFC 822-like email files while keeping their headers intact. Thanks in advance! – Nico Rittner Feb 25 '14 at 09:17
