The goal is to shorten a large text file:
delete everything between the first X lines and the last Y lines,
and optionally insert a line like "file truncated to XY lines..." in the middle.
I played around and achieved this with awkward redirections (as in "Pipe output to two different commands"), subshells, tee and multiple sed invocations, and I wonder if

sed -e '10q'

and

sed -e :a -e '$q;N;11,$D;ba'

can be simplified by merging both into a single sed call.

Thanks in advance.

Nico Rittner

6 Answers

Use head and tail:

(head -n "$X" infile; echo Truncated; tail -n "$Y" infile) > outfile

Or awk:

awk -v x="$x" -v y="$y" '{a[++i]=$0} END{for(j=1;j<=x;j++)print a[j]; print "Truncated"; for(j=i-y+1;j<=i;j++)print a[j]}' yourfile
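
The awk one-liner above keeps every input line in an array. For large inputs, a possible variation is to print the first X lines as they arrive and hold only the last Y of the remaining lines in a circular buffer. The following is a minimal sketch under that assumption; the function name `truncate_middle` and the marker text are illustrative and not part of the original answer. It reads standard input, so it also works at the end of a pipe, and it leaves inputs of at most X+Y lines unchanged.

```shell
# truncate_middle X Y  (hypothetical helper; reads stdin, writes stdout)
truncate_middle() {
    awk -v x="$1" -v y="$2" '
        NR <= x { print; next }            # first X lines pass straight through
        y > 0   { ring[NR % y] = $0 }      # remember only the most recent Y later lines
        END {
            if (NR > x + y) print "... file truncated ..."
            start = NR - y + 1             # first line number of the tail
            if (start <= x) start = x + 1  # never repeat a line already printed
            for (i = start; i <= NR; i++) print ring[i % y]
        }
    '
}

# Example call with X=5 and Y=8
seq 20 | truncate_middle 5 8
```

The marker is printed only when lines were actually dropped, so overlapping head and tail ranges do not produce duplicates.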

Or you can use tee like this with process substitution if, as you say, input is coming from a pipe:

yourcommand | tee >(head -n "$x" > p1) | tail -n "$y" > p2 ; cat p[12]
Mark Setchell

You can do it through a magical incantation of tee, process substitutions, and stdio redirections:

x=5 y=8
seq 20 | { 
    tee >(tail -n $y >&2) \
        >({ head -n $x; echo "..."; } >&2) >/dev/null 
} 2>&1
1
2
3
4
5
...
13
14
15
16
17
18
19
20

This version is more sequential and the output should be consistent:

x=5 y=8
seq 20 | {
    { 
        # read and print the first X lines to stderr
        while ((x-- > 0)); do 
            IFS= read -r line 
            echo "$line" 
        done >&2
        echo "..." >&2  
        # send the rest of the stream on stdout
        cat - 
    } |
    # print the last Y lines to stderr, other lines will be discarded
    tail -n $y >&2
} 2>&1
glenn jackman
  • Glenn: I was heading in that direction with my 3rd answer, but was unsure how to guarantee the ordering of outputs from the "head" and "tail". You appear to have merged stdout and stderr to achieve that, but what subtlety that I am missing guarantees the order? – Mark Setchell Feb 21 '14 at 14:00
  • I don't know if you can guarantee it. Maybe `>(sleep 1; tail -n $y)` would be long enough that the "head" branch would be sure to finish first. The safest way would be to write to a file. Or maybe a fifo would be better... – glenn jackman Feb 21 '14 at 15:08
  • Cool - I knew there was a reason I went with files :-) – Mark Setchell Feb 21 '14 at 15:41
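
The comments above suggest that writing to a file is the safest way to get a deterministic order. A minimal sketch of that idea, assuming `mktemp` is available: spool standard input to a temporary file, then run head and tail on it one after the other. The function name `truncate_stream` is hypothetical.

```shell
# truncate_stream X Y  (hypothetical helper; reads stdin, writes stdout)
truncate_stream() {
    tmp=$(mktemp) || return 1   # temporary spool file
    cat > "$tmp"                # copy all of stdin so it can be read twice
    head -n "$1" "$tmp"         # first X lines
    echo "..."
    tail -n "$2" "$tmp"         # last Y lines, always printed after the head part
    rm -f "$tmp"
}

# Example call with X=5 and Y=8
seq 20 | truncate_stream 5 8
```

Because the three commands run sequentially, the output order does not depend on process scheduling. As with the plain head/tail approach, inputs shorter than X+Y lines produce repeated lines.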

You can also use sed -u 5q (with GNU sed) as an unbuffered alternative to head -n5:

$ seq 99|(sed -u 5q;echo ...;tail -n5)
1
2
3
4
5
...
95
96
97
98
99
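
The point of `-u` is that sed reads only as much input as it needs, so the lines after the first five are still in the pipe when tail starts. A parameterized sketch of the same command, assuming GNU sed and X of at least 1; the function name `head_tail_stream` is made up for illustration.

```shell
# head_tail_stream X Y  (hypothetical helper; requires GNU sed for -u)
head_tail_stream() {
    sed -u "${1}q"      # print the first X lines, then quit without reading ahead
    echo "..."
    tail -n "$2"        # last Y lines of whatever is left on stdin
}

# Example call with X=5 and Y=5
seq 99 | head_tail_stream 5 5
```

Since the stream is consumed only once, a short input cannot yield duplicate lines; the tail part simply receives fewer lines.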
nisetama

Here is a sed alternative that does not require knowledge of file length.

You can insert a modified "head" expression into the sliding loop of your "tail" expression. E.g.:

sed ':a; 10s/$/\n...File truncated.../p; $q; N; 11,$D; ba'

Note that if the ranges overlap there will be duplicate lines in the output.

Example:

seq 30 | sed ':a; 10s/$/\n...File truncated.../p; $q; N; 11,$D; ba'

Output:

1
2
3
4
5
6
7
8
9
10
...File truncated...
20
21
22
23
24
25
26
27
28
29
30

Here is a commented multi-line version to explain what is going on:

:a                                   # loop label
10s/$/\n...File truncated.../p       # on line 10, append the marker to the pattern space and print it
$q                                   # quit here when on the last line
N                                    # read next line into pattern space
11,$D                                # from line 11 to end, delete the first line of pattern space
ba                                   # goto :a
Thor
  • Thanks a lot, works like a charm. The overlap is only a cosmetic issue. I will investigate and try to understand this sed syntax; I have never used ":a", "N;" etc. with sed before. – Nico Rittner Feb 21 '14 at 11:04
  • @layer23: you're welcome, I have added a commented version by way of explanation. – Thor Feb 21 '14 at 11:22

If you know the length of the file:

FileLen=$(wc -l < YourFile)
EndStart=$(( FileLen - Y + 1 ))
sed -n "1,${X} p
${X} a\\
 --- Truncated part ---
${EndStart},$ p" YourFile
NeronLeVelu

This might work for you (GNU sed):

sed '1,5b;:a;N;s/\n/&/8;Ta;$!D;s/[^\n]*\n//;i\*** truncated file ***' file

Here X=5 and Y=8.

N.B. This leaves short files unadulterated.

potong
  • Thanks a lot for that, an enhanced version of Thor's answer. One small request: what would the modification be if the counting of X started after the first blank line? This would be handy for cutting plain-text RFC 822-like email files while keeping their headers intact. Thanks in advance! – Nico Rittner Feb 25 '14 at 09:17
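
One possible way to approach the request in the comment above, sketched in awk rather than as a modification of the sed one-liner: pass everything up to and including the first blank line through untouched, then count X and Y over the remaining lines only. The function name `truncate_body` is hypothetical, and the sketch assumes the headers end at the first completely empty line.

```shell
# truncate_body X Y  (hypothetical helper; reads stdin, writes stdout)
truncate_body() {
    awk -v x="$1" -v y="$2" '
        !body  { print; if ($0 == "") body = 1; next }  # headers plus the first blank line
               { n++ }                                  # n counts body lines only
        n <= x { print; next }                          # first X body lines
        y > 0  { ring[n % y] = $0 }                     # keep the most recent Y body lines
        END {
            if (n > x + y) print "*** truncated file ***"
            start = n - y + 1
            if (start <= x) start = x + 1               # avoid repeating printed lines
            for (i = start; i <= n; i++) print ring[i % y]
        }
    '
}

# Example call: two header lines, a blank line, then a ten-line body; X=2, Y=3
{ printf 'From: a\nTo: b\n\n'; seq 10; } | truncate_body 2 3
```

As in the sed answer, a message whose body has at most X+Y lines is left unchanged.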