
I have seen "How can I extract a predetermined range of lines from a text file on Unix?", but my use case is slightly different: I want to specify a starting line number and a count of lines to extract from a text file.

So I generated a text file and composed an awk command to extract 10 lines starting from line number 100, but it prints nothing:

$ seq 1 500 > test_file.txt
$ awk 'BEGIN{s=100;e=$s+10;} NR>=$s&&NR<=$e' test_file.txt
$

So, what would be an easy approach to extract lines from a text file using a starting line number, and count of lines, in bash? (I'm ok with awk, sed, or any such tool, for instance in coreutils)

Cyrus
sdbbs

6 Answers


This gives you text that is inclusive of both end points (eleven output lines, here).

$ START=100
$ sed -n "${START},$((START + 10))p" < test_file.txt

The -n says "no print by default".

The p then says "print this line" for lines within the example range 100,110.
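Since the question asks for a start line plus a count, the same sed idea can be wrapped so that the end line is computed as start + count - 1. This is only a minimal sketch: the function name extract_lines and its argument order are made up for illustration, and the extra q command is added here so that sed stops reading once the last wanted line has been printed.

```shell
# Hypothetical helper (not from the answer above):
# print COUNT lines of FILE, starting at line START.
extract_lines() {
    local start=$1 count=$2 file=$3
    local end=$((start + count - 1))   # last line to print, inclusive
    # Print the range, then quit at the last line so the rest is not read.
    sed -n "${start},${end}p;${end}q" "$file"
}

seq 1 500 > test_file.txt
extract_lines 100 10 test_file.txt   # prints 100 through 109
```

With START=100 and a count of 10 this gives ten lines, 100 to 109, rather than the eleven lines of the inclusive 100,110 range.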

J_H

If you want to use awk, use something like

seq 1 500 | awk 'NR>=100 && NR<=110'

The advantage of awk is its flexibility when the requirements change.
If you want a variable start and want to skip the endpoints, it becomes

start=100
seq 1 500 | awk -v start="${start}" 'NR > start && NR < start + 10'
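For the start-plus-count case from the question, the same pattern can take a second variable. A small sketch, assuming a variable named count (not used in the answer above) that holds the number of lines wanted:

```shell
start=100
count=10
# NR >= start selects the first wanted line;
# NR < start + count stops the match after exactly count lines.
seq 1 500 | awk -v start="$start" -v count="$count" 'NR >= start && NR < start + count'
```

This prints lines 100 through 109, start line included.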
Walter A

Another alternative with tail and head:

tail -n +$START test_file.txt | head -n $NUMBER

If test_file.txt is very large and $START and $NUMBER are small, the following variant should be the fastest (the -1 keeps the output at exactly $NUMBER lines):

head -n $((START+NUMBER-1)) test_file.txt | tail -n +$START

Anyway, I prefer the sed solution mentioned above for short input files, here with the end line adjusted so that exactly $NUMBER lines are printed:

sed -n "$START,$((START+NUMBER-1)) p" test_file.txt
Wiimm
sed -n "$Start,$End p" file

is likely a better way to get those lines.

TRCDev
$ seq 1 500 > test_file.txt
$ awk 'BEGIN{s=100;e=$s+10;} NR>=$s&&NR<=$e' test_file.txt
$

$s in GNU AWK means the value of the s-th field, and $e means the value of the e-th field. There are no fields yet in the BEGIN clause, so $s is unset for any s; because you use it in an arithmetic context it is treated as 0, and therefore e is set to 10. The output of seq is a single number per line, so there is no 10th field, and GNU AWK treats $e as zero when asked to compare it with a number. Since NR is always strictly greater than 0, the condition NR<=$e never holds, so the output is empty.
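The difference between a variable and a field reference can be checked with a short sketch. The sample input 'a b c' is made up for illustration, and the second command is one possible correction of the original attempt (using s + 9 so that exactly 10 lines are printed), not the only one:

```shell
# s is a plain variable holding 2; $s is field number 2 of the current line.
echo 'a b c' | awk '{ s = 2; print s, $s }'   # prints: 2 b

seq 1 500 > test_file.txt
# Without the $ prefix, s and e are compared with NR as ordinary numbers.
awk 'BEGIN { s = 100; e = s + 9 } NR >= s && NR <= e' test_file.txt
```

The second command prints lines 100 through 109.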

Use a range pattern if you can write one condition that holds only for the starting line and another that holds only for the ending line; in this case

awk 'BEGIN{s=100}NR==s,NR==s+10' test_file.txt

gives output

100
101
102
103
104
105
106
107
108
109
110

Keep in mind that this processes the whole file. If the file is huge and the area of interest is relatively near the beginning, you can reduce the run time by ending processing at the end of the area of interest, as follows

awk 'BEGIN{s=100}NR>=s{print}NR==s+10{exit}' test_file.txt

(tested in GNU Awk 5.0.1)

Daweo

This command extracts 30 lines starting from line 100:

sed -n '100,$p' test_file.txt | head -30
Francesco Gasparetto