sed how to delete first 17 lines and last 8 lines in a file

Question

I have a big 150GB CSV file and I would like to remove the first 17 lines and the last 8 lines. I have tried the following, but it doesn't seem to work correctly:

sed -i -n -e :a -e '1,8!{P;N;D;};N;ba'

and

sed -i '1,17d'

I wonder if someone can help with sed or awk; a one-liner would be great.

I noticed the size is 150GB, how much free space do you still have on your disk? greater than 150GB? Is file in-place change necessary? — Kent, Feb 07 '13 at 13:43
I tried sed -i -n -e :a -e '1,8!{P;N;D;};N;ba' and sed -i '1,17d' but it doesn't seem that it's working right. — Deano, Feb 07 '13 at 13:45
@user1007727 then all intermediate temp-file solutions won't work for you. — Kent, Feb 07 '13 at 13:48
If you have less memory available than the size of your file then you need to do this in chunks that are smaller than the memory available, removing sections of your original as you write them to your new file. Even in-place editors like "ed" need to buffer the contents of your file to operate on it. — Ed Morton, Feb 07 '13 at 13:51
@user1007727 I suggest stating this requirement in your question: the target file is 150GB but you don't have 150GB of free space, so how can that file be edited? — Kent, Feb 07 '13 at 13:55
Any chance of not putting the first 17 lines and the last eight on the file in the first place? What happens to the file afterwards? Can the data be ignored whilst carrying out some other task on the file? — Bill Woodger, Feb 07 '13 at 23:18
possible duplicate of [How to delete first two lines and last four lines from a text file with bash?](http://stackoverflow.com/questions/10460919/how-to-delete-first-two-lines-and-last-four-lines-from-a-text-file-with-bash) The other is tool agnostic, and top answers here are not sed. — Ciro Santilli OurBigBook.com, Oct 16 '14 at 09:24

score 18 · Answer 1 · answered Feb 07 '13 at 13:36

head and tail are better for the job than sed or awk.

tail -n+18 file | head -n-8 > newfile
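
For what it's worth, a quick way to sanity-check this pipeline before running it on the real file is to try it on a small numbered sample. The sketch below assumes GNU coreutils (the negative count in head -n -8 is a GNU extension) and uses made-up file names (sample.txt, trimmed.txt):

```shell
# Build a 40-line sample file in a temporary directory (names are hypothetical).
workdir=$(mktemp -d)
seq 1 40 > "$workdir/sample.txt"

# Skip the first 17 lines, then drop the last 8 of what remains.
# Note: head -n -8 (negative count) is a GNU coreutils extension.
tail -n +18 "$workdir/sample.txt" | head -n -8 > "$workdir/trimmed.txt"

# Inspect the boundaries of the result.
first=$(head -n 1 "$workdir/trimmed.txt")
last=$(tail -n 1 "$workdir/trimmed.txt")
count=$(wc -l < "$workdir/trimmed.txt")
echo "first=$first last=$last count=$((count))"
```

With 40 input lines, the kept range should be lines 18 through 32.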

choroba

score 10 · Accepted Answer · answered Feb 07 '13 at 13:46

awk -v nr="$(wc -l < file)" 'NR>17 && NR<=(nr-8)' file

Ed Morton

score 2 · Answer 3 · answered Mar 08 '13 at 01:31

All awk:

awk 'NR>y+x{print A[NR%y]} {A[NR%y]=$0}' x=17 y=8 file
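
As far as I can tell, this works as a ring buffer of y lines: when line NR arrives, slot NR%y still holds line NR-y, which is only safe to print once more than x+y lines have been read. A commented sketch of the same idea, run on a 40-line sample (the array name buf and the seq input are just for illustration):

```shell
# x = lines to drop from the start, y = lines to drop from the end.
result=$(seq 1 40 | awk -v x=17 -v y=8 '
  # Slot NR % y still holds line NR - y here. Print it only if that
  # line is past the first x lines, i.e. NR - y > x.
  NR > x + y { print buf[NR % y] }

  # Overwrite the slot with the current line. At end of input the
  # last y lines are still sitting in buf and are never printed.
  { buf[NR % y] = $0 }
')
echo "$result"
```

Only y lines are held in memory at any time, and the file is read once, so there is no need for a separate wc -l pass.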

Scrutinizer

score 1 · Answer 4 · answered Feb 07 '13 at 13:37

Try this:

sed '{[/]<n>|<string>|<regex>[/]}d' <fileName>
sed '{[/]<adr1>[,<adr2>][/]}d' <fileName>

where

/.../ = delimiters
n = line number
string = string found in the line
regex = regular expression corresponding to the searched pattern
adr1, adr2 = addresses of lines (number or pattern)
d = delete

Refer this link

score 0 · Answer 5 · edited Feb 07 '13 at 13:55

LENGTH=$(wc -l < file)
head -n $((LENGTH-8)) file | tail -n $((LENGTH-17)) > newfile

Edit: As mtk posted in a comment, this won't work. If you want to use wc and track the file length, you should use:

LENGTH=$(wc -l < file)
head -n $((LENGTH-8)) file | tail -n $((LENGTH-8-17)) > newfile

or:

LENGTH=$(wc -l < file)
head -n $((LENGTH-8)) file > tmpfile
LENGTH=$(wc -l < tmpfile)
tail -n $((LENGTH-17)) tmpfile > newfile

Which makes this solution less elegant than the one posted by choroba :)
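
One caveat: redirecting to the same file you are reading, as in `cmd file > file`, makes the shell truncate the file before the command gets to read it. Here is a minimal sketch of the wc-based approach that writes to a temporary file and then replaces the original. The paths are hypothetical, and it assumes the file has more than 25 lines and that there is enough free disk space for the trimmed copy:

```shell
# Hypothetical sample data standing in for the real file.
workdir=$(mktemp -d)
file="$workdir/data.csv"
seq 1 40 > "$file"

length=$(wc -l < "$file")

# Keep everything except the last 8 lines, then keep only the last
# (length - 8 - 17) lines of that, which drops the first 17.
# Assumes length > 25; the original is replaced only on success.
head -n "$((length - 8))" "$file" |
  tail -n "$((length - 8 - 17))" > "$file.tmp" &&
  mv "$file.tmp" "$file"
```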

edited Feb 07 '13 at 13:55

chepner

answered Feb 07 '13 at 13:37

Adam Sznajder

This seems to be erroneous, as `tail` will operate on the output of `head`, so the row offset is counted wrongly. – mtk Feb 07 '13 at 13:39

score 0 · Answer 6 · answered Feb 07 '13 at 14:17

I learnt this today for the shell.

{
  ghead -17  > /dev/null
  sed -n -e :a -e '1,8!{P;N;D;};N;ba'
} < my-bigfile > subset-of

One has to use a non-consuming head, that is, one that does not read past the lines it outputs, hence the use of ghead from the GNU coreutils.
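
If ghead is not available, a possible alternative (my assumption, not tested on a 150GB file) is to skip the header lines with the shell's read builtin, which does not consume input beyond the line it returns. The sed part here is a variant with an explicit $d, so that it does not depend on how N behaves at end of input. File names are made up:

```shell
# Hypothetical 40-line sample standing in for my-bigfile.
workdir=$(mktemp -d)
seq 1 40 > "$workdir/big.txt"

{
  # Consume exactly 17 lines from the shared input, one at a time.
  i=0
  while [ "$i" -lt 17 ]; do
    IFS= read -r skipped
    i=$((i + 1))
  done

  # sed continues from line 18 of the file (its own line 1) and
  # drops the last 8 lines: it accumulates lines, and once the last
  # input line has been read, $d discards what is still buffered.
  sed -e :a -e '$d' -e N -e '2,8ba' -e P -e D
} < "$workdir/big.txt" > "$workdir/subset.txt"
```

The read loop is slow compared to head, but it only runs for the 17 skipped lines; the bulk of the file goes through sed.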

sotapme

score 0 · Answer 7 · answered Apr 09 '17 at 23:47

Similar to Thor's answer, but a bit shorter:

sed -i '' -e $'1,17d;:a\nN;19,25ba\nP;D' file.txt

The -i '' tells sed to edit the file in place. This is the BSD/macOS syntax; GNU sed takes -i without the separate empty argument. Check the man page on your system.

If you want to delete front lines from the start and tail lines from the end of the file, you'd have to use the following numbers:

1,{front}d;:a\nN;{front+2},{front+tail}ba\nP;D

(I put them in curly braces here, but that's just pseudocode. You'll have to replace them with the actual numbers. Also, it should work with {front+1}, but it doesn't on my machine (macOS 10.12.4). I think that's a bug.)

I'll try to explain how the command works. Here's a human-readable version:

1,17d     # delete lines 1 ... 17, goto start
:a        # define label a
N         # add next line from file to buffer, quit if at end of file
19,25ba   # if line number is 19 ... 25, goto start (label a)
P         # print first line in buffer
D         # delete first line from buffer, go back to start

First we skip 17 lines. That's easy. The rest is tricky, but basically we keep a buffer of eight lines. We only start printing lines when the buffer is full, but we stop printing when we reach the end of the file, so at the end, there are still eight lines left in the buffer that we didn't print - in other words, we deleted them.
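
Note that this relies on N quitting without printing when it hits the end of the file. GNU sed, unlike the macOS one, prints the buffer in that case (unless POSIXLY_CORRECT is set), so there the last eight lines would come out after all. Here is a sketch of a variant that deletes the buffer explicitly with $d once the last line has been read, shown on a 40-line sample from seq rather than in place:

```shell
# Same structure as the command above, plus an explicit $d:
#   1,17d    delete lines 1 ... 17
#   :a       define label a
#   $d       if the last line has been read, discard the whole buffer
#   N        append the next line to the buffer
#   19,25ba  while filling the buffer (lines 19 ... 25), goto a
#   P        print the first line in the buffer
#   D        delete the first line from the buffer, restart the cycle
result=$(seq 1 40 |
  sed -e '1,17d' -e :a -e '$d' -e N -e '19,25ba' -e P -e D)
echo "$result"
```

Because N is never reached on the last line, the result should not depend on which sed is installed.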
