
I have a text file with many lines. I also have a selection of line numbers I want to print out, in a certain order; say, for example, "5, 3, 10, 6", in that order.

Is there some easy and "canonical" way of doing this (with "standard" Linux tools and bash)?

When I tried the answers from this question,

Bash tool to get nth line from a file

they always print the lines in the order they appear in the file.
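
For example (a sketch of the problem, assuming a file whose nth line reads "Line n"): giving sed several line addresses still prints the matches in file order, not in the order the addresses are written:

sed -n '5p;3p;10p;6p' file
Line 3
Line 5
Line 6
Line 10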

Karel Bílek

6 Answers


A one-liner using sed:

for i in 5 3 10 6; do sed -n "${i}p" < file; done
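
If the file is large, a variant worth knowing (a sketch; q is the standard sed command that quits after the current line) stops reading the file as soon as each wanted line has been printed:

for i in 5 3 10 6; do sed -n "${i}{p;q;}" < file; done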
John C

Here is one way using awk:

awk -v s='5,3,10,6' 'BEGIN{split(s, a, ","); for (i=1; i<=length(a); i++) b[a[i]]=i}
        b[NR]{data[NR]=$0} END{for (i=1; i<=length(a); i++) print data[a[i]]}' file

Testing:

cat file
Line 1
Line 2
Line 3
Line 4
Line 5
Line 6
Line 7
Line 8
Line 9
Line 10
Line 11
Line 12

awk -v s='5,3,10,6' 'BEGIN{split(s, a, ","); for (i=1; i<=length(a); i++) b[a[i]]=i}
        b[NR]{data[NR]=$0} END{for (i=1; i<=length(a); i++) print data[a[i]]}' file
Line 5
Line 3
Line 10
Line 6
anubhava
  • Wow, this is all so complex. I assumed there would be something simple, given that the unix tools are "made" for text processing – Karel Bílek Jun 11 '14 at 14:19
  • It might look complicated, but this is the only way I know to get this done using a **single command**. The reason for the complexity is that all the tools process input line by line, so to get output in a pre-defined order one needs to first process the file and then print in the designated order. – anubhava Jun 11 '14 at 14:23
  • Also, I suggest running tests with all suggested solutions on a very big file. I have added an extra bit of code here to make sure I only cache the given line numbers in memory rather than caching the whole file. – anubhava Jun 11 '14 at 14:36
  • +1, but this is somewhat simpler: `awk -v s="5,3,10,6" '{line[NR]=$0} END {n = split(s, l, ","); for (i=1; i<=n; i++) print line[l[i]]}' file` -- it does need to store the whole file in memory, however. – glenn jackman Jun 11 '14 at 15:02
  • Yes, I wrote a simpler awk command earlier but then put more logic into it to avoid storing the whole file in memory. – anubhava Jun 11 '14 at 15:27

A rather efficient method, if your file is not too large, is to read it all into memory, into an array, one line per field, using mapfile (a Bash ≥4 builtin):

mapfile -t array < file.txt

Then you can echo all the lines you want in any order, e.g.,

printf '%s\n' "${array[4]}" "${array[2]}" "${array[9]}" "${array[5]}"

to print lines 5, 3, 10 and 6. Now it's a bit awkward that the array indices start at 0, so you have to offset your numbers. This can easily be cured with the -O option of mapfile:

mapfile -t -O 1 array < file.txt

this will start assigning to array at index 1, so that you can print your lines 5, 3, 10 and 6 as:

printf '%s\n' "${array[5]}" "${array[3]}" "${array[10]}" "${array[6]}"

Finally, you want to make a wrapper function for this:

printlines() {
    local i
    for i; do printf '%s\n' "${array[i]}"; done
}

so that you can just state:

printlines 5 3 10 6

And it's all pure Bash, no external tools!
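
Putting the two steps together (the same commands as above, just in sequence):

mapfile -t -O 1 array < file.txt
printlines 5 3 10 6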


As @glennjackmann suggests in the comments, you can make the helper function also take care of reading the file (passed as an argument):

printlinesof() {
    # $1 is filename
    # $2,... are the lines to print
    local i array
    mapfile -t -O 1 array < "$1" || return 1
    shift
    for i; do printf '%s\n' "${array[i]}"; done
}

Then you can use it as:

printlinesof file.txt 5 3 10 6

And if you also want to handle stdin:

printlinesof() {
    # $1 is filename or - for stdin
    # $2,... are the lines to print
    local i array file=$1
    [[ $file = - ]] && file=/dev/stdin
    mapfile -t -O 1 array < "$file" || return 1
    shift
    for i; do printf '%s\n' "${array[i]}"; done
}

so that

printf '%s\n' {a..z} | printlinesof - 5 3 10 6

will also work.
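
With the a–z input, that prints the 5th, 3rd, 10th and 6th letters:

e
c
j
f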

gniourf_gniourf
  • +1 very nice. Requires bash v4 for `mapfile`. I'd enhance that by also passing the filename and performing the mapfile in the function: `printlines() { local i array; mapfile -t -O 1 array < "$1"; shift; for i; do printf '%s\n' "${array[i]}"; done; }; printlines file.txt 5 3 10 6` – glenn jackman Jun 11 '14 at 15:16
  • I like this answer the most, even when it doesn't "scale" if the file is too big. – Karel Bílek Jun 13 '14 at 00:21

First, generate a sed expression that prints each wanted line with a sequence number prepended; that number can later be used to sort the output back into the requested order:

#!/bin/bash
lines=(5 3 10 6)
sed=''
i=0
for line in "${lines[@]}" ; do
    sed+="${line}s/^/$((i++)) /p;"
done

for i in {a..z} ; do echo $i ; done \
    | sed -n "$sed" \
    | sort -n \
    | cut -d' ' -f2-

I'd probably use Perl, though:

for c in {a..z} ; do echo $c ; done \
| perl -e 'undef @lines{@ARGV};
           while (<STDIN>) {
               $lines{$.} = $_ if exists $lines{$.};
           }
           print @lines{@ARGV};
          ' 5 3 10 6

You can also use Perl instead of hacking with sed in the first solution:

for c in {a..z} ; do echo $c ; done \
| perl -e ' %lines = map { $ARGV[$_], ++$i } 0 .. $#ARGV;
            while (<STDIN>) {
                print "$lines{$.} $_" if exists $lines{$.};
            }
          ' 5 3 10 6 | sort -n | cut -d' ' -f2-
choroba
l=(5 3 10 6)
printf "%s\n" {a..z} |
# emit each wanted line preceded by its line number (= prints the number)
sed -n "$(printf "%d{=;p};" "${l[@]}")" |
# merge each number/line pair onto one tab-separated line
paste - - | {
    # cache the lines, indexed by line number
    while IFS=$'\t' read -r nr text; do
        line[nr]=$text
    done
    # then print them in the requested order
    for n in "${l[@]}"; do
        echo "${line[n]}"
    done
}
glenn jackman

You can use the nl trick: number the lines in the input and join the output with the list of actual line numbers. Additional sorts are needed to make the join possible, as join needs sorted input (so the nl trick is used once more to number the expected lines):

#! /bin/bash

LINES=(5 3 10 6)

lines=$( IFS=$'\n' ; echo "${LINES[*]}" | nl )

for c in {a..z} ; do
    echo $c
done | nl \
    | grep -E '^\s*('"$( IFS='|' ; echo "${LINES[*]}")"')\s' \
    | join -12 -21 <(echo "$lines" | sort -k2n) - \
    | sort -k2n \
    | cut -d' ' -f3-
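
For reference, a sketch of the intermediate numbering: nl prefixes each wanted line number with its position in the requested order, which is what later gets joined against the numbered input:

printf '%s\n' 5 3 10 6 | nl
     1	5
     2	3
     3	10
     4	6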
choroba