4

A file contains ASCII characters representing hex values (each line ends in ", ").

cat temp.txt
    0x6A, 
    0xF2, 
    0x44,
    .....
    0xF8, 
    0x1A,

I am trying to combine every 16 words/lines into one line, like this:

cat hex_result.txt 
0x6A, 0xF2, 0x44, 0xF8, 0x45, 0x41, 0x88, 0xD1,0x4E, 0x8B, 0xA3, 0xB1, 0x8C, 0xE0, 0x37, 0x2D, 
.... 
0xE2, 0x1C, 0x06, 0x8A, 0x75, 0x2B, 0xBC, 0x3C, 0xC5, 0x08, 0xB7, 0x4E, 0xB0, 0xE4, 0xF8, 0x1A,

Are there any bash commands to accomplish this?

orionlin
  • 2
    `awk 'ORS=NR%16?FS:RS'`, if you are familiar with `ORS`, `FS`, `NR` and `RS` – P.... Nov 17 '17 at 10:14
  • possible duplicate, to merge every n lines: https://stackoverflow.com/questions/9605232/how-to-merge-every-two-lines-into-one-from-the-command-line and https://stackoverflow.com/questions/25973140/joining-every-group-of-n-lines-into-one-with-bash – thanasisp Nov 17 '17 at 10:32
  • @thanasisp None of your proposed duplicates works with a *number of lines* – F. Hauri - Give Up GitHub Nov 17 '17 at 12:08

1 Answer

17

Benchmarking six different methods for merging a specific number of lines.

Basically, there are many commands:

pr - convert text files for printing

pr -at16 <file

Try:

pr -a -t -16 < <(seq 1 42)
1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32
33  34  35  36  37  38  39  40  41  42
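
A caveat worth checking, sketched here under the assumption of GNU pr (coreutils): by default pr fits the 16 columns into a 72-character page, so entries wider than a few characters may be cut off. According to the GNU man page, giving an explicit separator with -s turns that truncation off. The function name pr_join is hypothetical.

```bash
# pr_join N: merge every N input lines into one, separated by a single space.
# Assumes GNU pr, where -s' ' sets the column separator and disables the
# default truncation of entries to the column width.
pr_join() {
    pr -a -t -"$1" -s' '
}

# Six-digit values, 16 per output line:
seq 100001 100032 | pr_join 16
```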

xargs - build and execute command lines from standard input

... and executes the command (default is /bin/echo) ...

xargs -n 16 <file

Try:

xargs -n 16 < <(seq 1 42)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42
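
Because xargs splits its input on blanks and newlines before handing the words to echo, the leading and trailing spaces of the question's input disappear on their own. A minimal sketch: sample_hex is a hypothetical helper that only generates test data in the question's format. Note that xargs also treats quotes and backslashes specially, which is harmless for hex values but not for arbitrary text.

```bash
# Generate 20 sample lines shaped like the question's file ("    0xNN, ").
sample_hex() {
    i=0
    while [ "$i" -lt 20 ]; do
        printf '    0x%02X, \n' "$i"
        i=$((i + 1))
    done
}

# Group 16 words per output line; xargs runs echo once per group.
sample_hex | xargs -n 16
```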

paste - merge lines of files

printf -v pasteargs %*s 16
paste -d\  ${pasteargs// /- } < <(seq 1 42)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42
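
Each - argument tells paste to read one line from standard input, so sixteen of them consume sixteen consecutive lines per output row. The sketch below builds the argument list with a portable loop instead of the printf -v trick; paste_join is a hypothetical helper name. When the input length is not a multiple of N, paste still emits the delimiters for the missing columns, so the last line ends with trailing spaces.

```bash
# paste_join N: merge every N input lines into one space-separated line.
paste_join() {
    dashes=""
    i=0
    while [ "$i" -lt "$1" ]; do
        dashes="$dashes -"
        i=$((i + 1))
    done
    # $dashes is left unquoted on purpose so it splits into N separate "-".
    paste -d ' ' $dashes
}

seq 1 32 | paste_join 16
```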

sed - stream editor for filtering and transforming text

printf -v sedstr 'N;s/\\n/ /;%.0s' {2..16};
sed -e "$sedstr" < <(seq 1 42)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42
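
To see what the generated script looks like, here is a sketch that builds it with a plain loop (sed_join_script is a hypothetical name). For N lines it repeats `N;s/\n/ /;` N-1 times: each N appends the next input line to the pattern space, and the substitution turns the embedded newline into a space. The short last group relies on GNU sed printing the pattern space when N runs out of input, which POSIX does not guarantee.

```bash
# sed_join_script N: print a sed script that joins N lines into one.
sed_join_script() {
    i=1
    while [ "$i" -lt "$1" ]; do
        printf '%s' 'N;s/\n/ /;'
        i=$((i + 1))
    done
}

sed_join_script 4; echo      # prints: N;s/\n/ /;N;s/\n/ /;N;s/\n/ /;
seq 1 32 | sed -e "$(sed_join_script 16)"
```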

awk - pattern scanning and processing language

awk 'NR%16{printf "%s ",$0;next;}1;END{if(NR%16)print ""}'  < <(seq 1 42)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42
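
The one-liner can be unrolled into a commented sketch that takes the group size as a variable and avoids a trailing space on a short last line. awk_join and the awk variable n are hypothetical names, not part of the original answer.

```bash
# awk_join N: merge every N input lines into one space-separated line.
awk_join() {
    awk -v n="$1" '
        # Before every line except the first, print the separator:
        # a newline when a new group starts, a space otherwise.
        NR > 1 { printf "%s", ((NR - 1) % n ? " " : "\n") }
        # Then print the line itself without a newline.
        { printf "%s", $0 }
        # Terminate the last output line if there was any input.
        END { if (NR) print "" }
    '
}

seq 1 42 | awk_join 16
```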

But you could use pure bash:

group=()
while read -r line;do
    group+=("$line")
    (( ${#group[@]} > 15 ))&&{
        echo "${group[*]}"
        group=()
    }
  done < <(seq 1 42) ; echo "${group[*]}"
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42

or as a function:

lgrp () { 
    local group=() line
    while read -r line; do
        group+=("$line")
        ((${#group[@]}>=$1)) && { 
            echo "${group[*]}"
            group=()
        }
    done
    [ "$group" ] && echo "${group[*]}"
}

or

lgrp () { local g=() l;while read -r l;do g+=("$l");((${#g[@]}>=$1))&&{
          echo "${g[*]}";g=();};done;[ "$g" ] && echo "${g[*]}";}

then

lgrp 16 < <(seq 1 42)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42

(Note: all these tests were arbitrarily done over 42 values, don't ask me why! ;-)

Other languages

Of course, you could do the same in any language:

perl -ne 'chomp;$r.=$_." ";( 15 < ++$cnt) && do {
    printf "%s\n", $1 if $r =~ /^(.*) $/;$r="";$cnt=0;
  };END{print $r."\n"}' < <(seq 1 42)

Like python, ruby, lisp, C, ...

Comparison of execution time.

OK, there are more than 3 simple ways, so let's do a little benchmark. Here is how I do it:

lgrp () { local g=() l;while read -r l;do g+=("$l");((${#g[@]}>=
 $1))&&{ echo "${g[*]}";g=();};done;[ "$g" ] && echo "${g[*]}";}
export -f lgrp
printf -v sedcmd '%*s' 15
sedcmd=${sedcmd// /N;s/\\n/ /;}
export sedcmd
{ 
    printf "%-12s\n" Method
    printf %7s\\n count real user system count real user system

    for cmd in 'paste -d " " -{,,,}{,,,}' 'pr -at16' \
        'sed -e "$sedcmd"' \
        $'awk \47NR%16{printf "%s ",$0;next;}1;END{print ""}\47'\
        $'perl -ne \47chomp;$r.=$_." ";( 15 < ++$cnt) && do {
           printf "%s\n", $1 if $r =~ /^(.*) $/;$r="";$cnt=0;
           };END{print $r."\n"}\47' 'lgrp 16' 'xargs -n 16'
    do
        printf %-12s\\n ${cmd%% *}
        for length in 5042 50042; do
            printf %7s\\n $(bash -c "TIMEFORMAT=$'%R %U %S';
                time $cmd < <(seq 1 $length) | wc -l" 2>&1)
        done
    done
} | paste -d $'\t' -{,,,,,,,,}

(This can be copied and pasted into a terminal.) On my computer, it produces:

Method        count    real    user  system   count    real    user  system
paste           316   0.002   0.002   0.000    3128   0.003   0.003   0.000
pr              316   0.003   0.000   0.003    3128   0.008   0.005   0.002
sed             316   0.005   0.001   0.003    3128   0.018   0.019   0.000
awk             316   0.003   0.001   0.003    3128   0.017   0.017   0.002
perl            316   0.008   0.002   0.004    3128   0.017   0.014   0.004
lgrp            316   0.058   0.042   0.021    3128   0.733   0.568   0.307
xargs           316   0.232   0.178   0.058    3128   2.249   1.730   0.555

Here is the same benchmark on my Raspberry Pi:

Method        count    real    user  system   count    real    user  system
paste           316   0.149   0.032   0.012    3128   0.204   0.014   0.054
pr              316   0.163   0.017   0.038    3128   0.418   0.069   0.096
sed             316   0.275   0.088   0.031    3128   1.586   0.697   0.045
awk             316   0.440   0.146   0.049    3128   2.809   1.305   0.050
perl            316   0.421   0.122   0.040    3128   2.042   0.902   0.067
lgrp            316   7.261   3.159   0.446    3128  71.733  31.223   3.558
xargs           316   9.464   3.038   1.066    3128  93.607  32.035   9.177

Fortunately, all line counts are the same. paste is clearly the quickest, followed by pr. The pure bash function is not slower than xargs (I'm surprised by the poor performance of xargs!).
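
Matching line counts do not prove that the outputs are identical, so a slightly stronger check compares checksums. A rough sketch, assuming cksum is available and ignoring trailing blanks (which paste adds on a short last line); digest is a hypothetical helper, not part of the benchmark above.

```bash
# digest CMD LENGTH: checksum of what CMD prints for the input seq 1..LENGTH,
# after stripping trailing blanks from every line.
digest() {
    seq 1 "$2" | sh -c "$1" | sed 's/[[:space:]]*$//' | cksum
}

a=$(digest 'xargs -n 16' 5042)
b=$(digest 'paste -d " " - - - - - - - - - - - - - - - -' 5042)
if [ "$a" = "$b" ]; then echo "same output"; else echo "outputs differ"; fi
```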

F. Hauri - Give Up GitHub
  • 1
    xargs -n 16 – orionlin Nov 21 '17 at 07:22
  • 1
    `pr -at` approach didn't work for me. It produced garbled output. – codeforester Nov 02 '18 at 18:19
  • @codeforester Take care not to put a space between `-at` and the column count, or specify the number of columns with a dash: `pr -at12 < <(seq 1 20)` or `pr -a -t -12 < <(seq 1 20)` – F. Hauri - Give Up GitHub Nov 04 '18 at 21:47
  • 1
    You made a mistake in sed proposed solution. printf writes to sedstr and you later read from sedcmd. It should be: `printf -v sedcmd 'N;s/\\n/ /;%.0s' {2..16}; sed -e "$sedcmd" < <(seq 1 42)` – robertm.tum Oct 28 '20 at 12:03
  • @robertm.tum Correct, thanks! Typo corrected. – F. Hauri - Give Up GitHub Oct 28 '20 at 13:32
  • Explanation of the AWK command, and why the 1: https://stackoverflow.com/questions/24643240/what-does-a-number-do-after-curly-braces 1 is another pattern that always evaluates to true, so awk runs the default action (print the line) because no action is specified. It is only reached for every 16th line, since the preceding `next` skips the rest of the script for all other lines. – Nicolas Thery Sep 26 '22 at 13:56
  • Interestingly, I tried testing GNU parallel using your script (parallel -n 16 echo) and it was *much* slower than any of the others. It's very flexible and simple, though, and I'm curious if there's a more clever way to invoke it for efficiency: `perl 316 0.012 0.007 0.006 3128 0.022 0.027 0.006 parallel 316 2.175 1.156 1.962 3128 19.726 10.559 17.965 xargs 316 0.919 0.150 0.304 3128 8.989 1.444 2.902` – Joshua Goldberg Jan 20 '23 at 16:00
  • @JoshuaGoldberg Using *parallel* for this is a false good idea! You start more processes than your host can run at once, so you may get strange results! Try this: `for i in {1..30};do parallel -n 16 echo < <(seq 1 10000) | sha1sum ;done` This will take some time and must output `a56d957604366f5c6349f3b39ec277d53576e10e` 30 times, but on my host there are some different results!!! – F. Hauri - Give Up GitHub Jan 20 '23 at 17:03
  • @JoshuaGoldberg I do a lot of this in parallel, but I have never used `parallel`. I just installed it now for testing... Samples: [parallelizing `rsync` to speed it up](https://stackoverflow.com/a/72480364/1765658), or [piping output to many processes](https://stackoverflow.com/a/19125525/1765658), or even [complex mandelbrot using many parallelized `bc` in background](https://stackoverflow.com/a/67498861/1765658) – F. Hauri - Give Up GitHub Jan 20 '23 at 17:07