0

How do I take a text file, skip the first 14 records, skip the last 2 records, put everything else into a new file?

I could write simple Java or Python code to do this, but I am looking for an even simpler Bash script or command line.

Can anyone help me with this?

user9524761

3 Answers

2

You can do this with awk. To skip the first 14 and the last 2 lines:

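#!/bin/bash
# N is the total line count; print lines 15 through N-2.
awk -v N="$(wc -l < inputfile.txt)" -v first_lines=14 -v last_lines=2 '{
    # Strictly greater than first_lines, so line 14 itself is skipped.
    if (NR > first_lines && NR <= N - last_lines)
        print $0
}' inputfile.txt >> outfile.txt 2>&1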

Hope this helps.

ntshetty
  • Appending to the output file, not writing it from the beginning? – Charles Duffy Apr 26 '18 at 03:12
  • @CharlesDuffy It appends everything from the 15th line up to, but not including, the last 2 lines. Say inputfile.txt contains lines 1 2 3 4 5 6 7 8; if the first 2 and last 2 are filtered, outfile.txt contains lines 3 4 5 6. – ntshetty Apr 26 '18 at 03:21
  • This doesn't work; it outputs line x if 14 <= x <= 2 (which is never true). You need to know how many lines there are in total before you can skip the last *k* lines. – chepner Apr 26 '18 at 03:47
  • @chepner Thanks, Good Catch, Updated the solution. – ntshetty Apr 26 '18 at 06:12
0

The simplest solution would be a combination of head and tail:

# hf_filter - remove a header and footer of fixed length from the input
$ hf_filter () { tail -n +$(($1 + 1)) | head -n -$2; }
$ hf_filter 14 2 < old.txt > new.txt

However, this requires GNU head, as the standard version requires a positive integer as the argument for the -n option.
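
As a quick sanity check, the function above can be run on generated input. This is only a sketch: it assumes GNU head and the seq utility are available, and the 20-line input is purely illustrative.

```shell
# Assumes GNU coreutils: head -n with a negative count is a GNU extension.
hf_filter () { tail -n +"$(($1 + 1))" | head -n -"$2"; }

# 20 numbered lines in; drop the first 14 and the last 2.
seq 20 | hf_filter 14 2
# prints 15, 16, 17 and 18, one per line
```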

A solution to this problem requires buffering $last lines of output or prior knowledge of the length of the input. (GNU head does the buffering for you.) A standard awk solution might look like

awk -v h=14 -v t=2 'NR > h {buf[NR]=$0; s=NR-t} s in buf {print buf[s]; delete buf[s]}' old.txt > new.txt

The delete buf[s] isn't strictly necessary, but I think it should keep the memory usage constant (although I don't really know how awk manages memory allocations internally).
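
For reuse, the buffering one-liner can be wrapped in a small function that reads standard input, so it also works in a pipeline. The name trim_lines and its two positional arguments are hypothetical, chosen here only for illustration.

```shell
# trim_lines HEAD TAIL: hypothetical helper; reads stdin, writes stdout.
trim_lines () {
    awk -v h="$1" -v t="$2" '
        # Past the header: remember the line, and note which one is t lines back.
        NR > h   { buf[NR] = $0; s = NR - t }
        # Once that older line exists in the buffer, it cannot be in the footer.
        s in buf { print buf[s]; delete buf[s] }
    '
}

seq 20 | trim_lines 14 2
# prints 15, 16, 17 and 18, one per line
```

At most t + 1 lines sit in the buffer at any time, so the input is read once and never has to fit in memory as a whole.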

If you don't mind reading the input twice, you can compute the input length first if you don't already know it.

# Quotes are necessary; wc outputs leading spaces that break the assignment otherwise
awk -v n="$(wc -l < old.txt)" -v h=14 -v t=2 'NR > h && NR <= n - t' old.txt > new.txt
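
Once the length is known, the same two-pass idea also works with head and tail alone, without the GNU-only negative count. The sketch below is one possible way to do it; the function name trim_file is hypothetical, and it assumes the file ends with a newline so that wc -l matches the number of lines.

```shell
# trim_file FILE HEAD TAIL: hypothetical helper; reads FILE twice.
trim_file () {
    total=$(wc -l < "$1")          # first pass: count the lines
    keep=$((total - $2 - $3))      # lines left after trimming both ends
    [ "$keep" -gt 0 ] || return 0  # nothing survives the trim; print nothing
    # Second pass: start after the header, stop before the footer.
    tail -n +"$(($2 + 1))" "$1" | head -n "$keep"
}
```

It would be called as trim_file old.txt 14 2 > new.txt.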
chepner
-1

You can use sed:

lines=$(wc -l < inputfile)
# Print line 15 through line (total - 2), dropping the first 14 and the last 2.
sed -n "15,$((lines-2))p" inputfile > outputfile
builder-7000