
I have a file with an unknown number of lines (but always an even number of lines). I want to print them side by side based on the total number of lines in that file. For example, I have a file with 16 lines like below:

asdljsdbfajhsdbflakjsdff235
asjhbasdjbfajskdfasdbajsdx3
asjhbasdjbfajs23kdfb235ajds
asjhbasdjbfajskdfbaj456fd3v
asjhbasdjb6589fajskdfbaj235
asjhbasdjbfajs54kdfbaj2f879
asjhbasdjbfajskdfbajxdfgsdh
asjhbasdf3709ddjbfajskdfbaj
100
100
150
125
trh77rnv9vnd9dfnmdcnksosdmn
220
225
sdkjNSDfasd89asdg12asdf6asdf

So now I want to print them side by side. As there are 16 lines in total, I am trying to get the result split 8:8, like below:

asdljsdbfajhsdbflakjsdff235 100
asjhbasdjbfajskdfasdbajsdx3 100
asjhbasdjbfajs23kdfb235ajds 150
asjhbasdjbfajskdfbaj456fd3v 125
asjhbasdjb6589fajskdfbaj235 trh77rnv9vnd9dfnmdcnksosdmn
asjhbasdjbfajs54kdfbaj2f879 220
asjhbasdjbfajskdfbajxdfgsdh 225
asjhbasdf3709ddjbfajskdfbaj sdkjNSDfasd89asdg12asdf6asdf

The paste command did not work for me exactly (paste - - - - - - - - < file1), nor did the awk command that I used: awk '{printf "%s" (NR%2==0?RS:FS),$1}'. Note: the number of lines in the file is dynamic. The only known thing in my scenario is that it is an even number all the time.

Paul

6 Answers


Extract the first half of the file and the last half of the file and merge the lines:

paste <(head -n $(($(wc -l <file.txt)/2)) file.txt) <(tail -n $(($(wc -l <file.txt)/2)) file.txt)
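
If you would rather run wc only once, a minimal variation (assuming bash for the arithmetic and the process substitutions) is:

half=$(( $(wc -l < file.txt) / 2 ))
paste <(head -n "$half" file.txt) <(tail -n "$half" file.txt)

Since the line count is guaranteed to be even, the integer division splits the file exactly in half.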

You can use the columns utility from autogen:

columns -c2 --by-columns file.txt

You can use column, but the number of output columns is calculated in a strange way from the width (in characters) of your terminal. So, assuming your lines have 28 characters, you can also do:

column -c $((28*2+8)) file.txt
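
If the line length is not known in advance, you could derive it from the longest line; a sketch assuming GNU wc, which supports -L for the maximum line length:

width=$(wc -L < file.txt)   # length of the longest line (GNU wc only)
column -c $(( width * 2 + 8 )) file.txt
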
KamilCuk

You can also do it with awk simply by storing the first half of the lines in an array and then printing each stored line followed by the corresponding line from the second half, e.g.

awk -v nlines=$(wc -l < file) -v j=0 'FNR<=nlines/2{a[++i]=$0; next} j<i{print a[++j],$1}' file

Example Use/Output

With your data in file, then

$ awk -v nlines=$(wc -l < file) -v j=0 'FNR<=nlines/2{a[++i]=$0; next} j<i{print a[++j],$1}' file
asdljsdbfajhsdbflakjsdff235 100
asjhbasdjbfajskdfasdbajsdx3 100
asjhbasdjbfajs23kdfb235ajds 150
asjhbasdjbfajskdfbaj456fd3v 125
asjhbasdjb6589fajskdfbaj235 trh77rnv9vnd9dfnmdcnksosdmn
asjhbasdjbfajs54kdfbaj2f879 220
asjhbasdjbfajskdfbajxdfgsdh 225
asjhbasdf3709ddjbfajskdfbaj sdkjNSDfasd89asdg12asdf6asdf
David C. Rankin
  • I just wanted to add that in FreeBSD's awk, if you don't quote the variable assignment used with the -v option, you will get an error `awk: can't open file -v source line number 1` (the quoted form is sketched below these comments) – Jetchisel Feb 26 '20 at 02:24
  • Good answer, just a couple of tweaks: awk fields, strings, and arrays all start at 1, not 0, so don't create user-defined arrays that start at 0 so that doesn't trip you up one day: `{a[++i]=$0; next} j – Ed Morton Feb 26 '20 at 14:59
  • Thank you @EdMorton, always good to avoid wrapping myself around the axle later. Though my mind is torn between C and zero-based indexes and awk and one-based indexes. In one-liners it generally doesn't give me grief, but I do see the wisdom in your words of always providing standard indexing -- it eliminates the potential for any surprises later. – David C. Rankin Feb 26 '20 at 18:23
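
For the FreeBSD awk case mentioned in the first comment, the fix is to quote the whole -v assignment; a sketch of what that looks like:

awk -v "nlines=$(wc -l < file)" -v j=0 'FNR<=nlines/2{a[++i]=$0; next} j<i{print a[++j],$1}' file
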

If you have the memory to hash the whole file ("max" below):

$ awk '{
    a[NR]=$0                 # hash all the records
}
END {                        # after hashing
    mid=int(NR/2)            # compute the midpoint, int in case NR is uneven
    for(i=1;i<=mid;i++)      # iterate from start to midpoint
        print a[i],a[mid+i]  # output
}' file

If you have the memory to hash half of the file ("mid"):

$ awk '
NR==FNR {                           # on 1st pass hash second half of records
    if(FNR>1) {                     # we don't need the 1st record ever
        a[FNR]=$0                   # hash record
        if(FNR%2)                   # if odd record
            delete a[int(FNR/2)+1]  # remove one from the past
    }
    next
}
FNR==1 {                            # on the start of 2nd pass
    if(NR%2==0)                     # if record count is uneven
        exit                        # exit, as there is always an even count of them
    offset=int((NR-1)/2)            # compute offset to the beginning of hash
}
FNR<=offset {                       # only process the 1st half of records
    print $0,a[offset+FNR]          # output one from file, one from hash
    next
}
{                                   # once 1st half of 2nd pass is finished
    exit                            # just exit
}' file file                        # notice filename twice

And finally, if you have awk compiled into a worm's brain (i.e. not so much memory, "min"):

$ awk '
NR==FNR {                                       # just get the NR of 1st pass
    next
}
FNR==1 {                                       
    mid=(NR-1)/2                                # get the midpoint
    file=FILENAME                               # filename for getline
    while(++i<=mid && (getline line < file)>0); # jump getline to mid
}
{
    if((getline line < file)>0)                 # getline read from mid+FNR
        print $0,line                           # output
}' file file                                    # notice filename twice

Standard disclaimer on getline applies, and no real error control is implemented.
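
For reference, a minimal sketch of what such error control could look like in the last block, checking getline's return value (1 = got a line, 0 = end of file, -1 = read error):

{
    if((ret=(getline line < file))>0)   # read the matching line from the second half
        print $0,line                   # output
    else if(ret<0) {                    # -1 means the read failed
        print "error reading " file > "/dev/stderr"
        exit 1
    }
}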

Performance:

I ran seq 1 100000000 > file and tested how the above solutions performed. Output went to /dev/null, but writing it to a file took around 2 s longer. The max performance is so-so, as its memory footprint was 88 % of my 16 GB, so it might have swapped. Well, I killed all the browsers and shaved 7 seconds off the real time of max.

+------------------+-----------+-----------+
| which            |           |           |
|              min |       mid |       max |
+------------------+-----------+-----------+
| time             |           |           |
| real    1m7.027s | 1m30.146s | 0m48.405s |
| user    1m6.387s | 1m27.314s | 0m43.801s |
| sys     0m0.641s | 0m2.820s  | 0m4.505s  |
+------------------+-----------+-----------+
| mem              |           |           |
|             3 MB |    6.8 GB |   13.5 GB |
+------------------+-----------+-----------+

Update:

I tested @DavidC.Rankin's and @EdMorton's solutions and they ran, respectively:

real    0m41.455s
user    0m39.086s
sys     0m2.369s

and

real    0m39.577s
user    0m37.037s
sys     0m2.541s

The memory footprint was about the same as my mid solution had. It pays to use wc, it seems.

James Brown
$ pr -2t file

asdljsdbfajhsdbflakjsdff235         100
asjhbasdjbfajskdfasdbajsdx3         100
asjhbasdjbfajs23kdfb235ajds         150
asjhbasdjbfajskdfbaj456fd3v         125
asjhbasdjb6589fajskdfbaj235         trh77rnv9vnd9dfnmdcnksosdmn
asjhbasdjbfajs54kdfbaj2f879         220
asjhbasdjbfajskdfbajxdfgsdh         225
asjhbasdf3709ddjbfajskdfbaj         sdkjNSDfasd89asdg12asdf6asdf

If you want just one space between the columns, change it to:

$ pr -2ts' ' file
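
Note that pr lays the columns out per page (66 lines by default), so for files longer than that you would likely also want to set the page length to half the line count; a hedged sketch:

$ pr -2t -l $(( $(wc -l < file) / 2 )) file
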
karakfa

I do not want to solve this for you completely, but if I were you:

wc -l file.txt 

gives the number of lines

echo $(($(wc -l < file.txt)/2))

gives half of that

head -n $(($(wc -l < file.txt)/2)) file.txt  > first.txt
tail -n $(($(wc -l < file.txt)/2)) file.txt  >  last.txt

create files with the first half and the last half of the original file. Now you can merge those files together side by side, as described here.
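
The merge step itself could then be done with paste, as in the answers above; a sketch:

paste -d ' ' first.txt last.txt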

schweik
  • `echo $(($(wc -l file.txt)/2))` will error; `wc -l file.txt` outputs two columns, the count and ` file.txt`. Redirect the input: `wc -l < file.txt` – KamilCuk Feb 25 '20 at 23:03
  • Sorry, wc outputs the name of the file, which we can prevent by _anonymizing_ the input with a redirection – schweik Feb 25 '20 at 23:14

Here is my take on it using the bash shell, wc(1) and ed(1).

#!/usr/bin/env bash

array=()
file=$1 
total=$(wc -l < "$file")
half=$(( total / 2 ))
plus1=$(( half + 1 ))

for ((m=1;m<=half;m++)); do
  array+=("${plus1}m$m" "${m}"'s/$/ /' "${m}"',+1j')
done
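
To see what the loop builds, you can print the array inside the script; for the 10-line example further below, the first few generated ed commands would be (move a line from the second half up, append a space, join the pair):

$ printf '%s\n' "${array[@]}" | head -n 6
6m1
1s/$/ /
1,+1j
6m2
2s/$/ /
2,+1j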

After all of that, if you just want to print the output to stdout, add the line below to the script.

printf '%s\n' "${array[@]}" ,p Q | ed -s "$file"

If you want to write the changes directly to the file itself, use this line instead at the end of the script.

printf '%s\n' "${array[@]}" w | ed -s "$file"

Here is an example.

printf '%s\n' {1..10} > file.txt

Now running the script against that file.

./myscript file.txt

Output

1 6
2 7
3 8
4 9
5 10

Or using the bash 4+ feature mapfile, aka readarray.

Save the file in an array named array.

mapfile -t array < file.txt

Split the array into two halves.

left=("${array[@]::((${#array[@]} / 2))}")
right=("${array[@]:((${#array[@]} / 2 ))}")

Loop and print side by side:

for i in "${!left[@]}"; do
  printf '%s %s\n' "${left[i]}" "${right[i]}"
done

Given what you said, "The only known thing in my scenario is, they are even number all the time", that solution should work.

Jetchisel