120

How can two unsorted text files of different lengths be displayed side by side (in columns) in a shell?

Given one.txt and two.txt:

$ cat one.txt
apple
pear
longer line than the last two
last line

$ cat two.txt
The quick brown fox..
foo
bar 
linux

skipped a line

Display:

apple                               The quick brown fox..
pear                                foo
longer line than the last two       bar 
last line                           linux

                                    skipped a line

paste one.txt two.txt almost does the trick, but it doesn't align the columns nicely because it just prints one tab between columns 1 and 2. I know how to do this with emacs and vim, but I want the output sent to stdout for piping, etc.

The solution I came up with uses sdiff and then pipes to sed to remove the markers that sdiff adds.

sdiff one.txt two.txt | sed -r 's/[<>|]//;s/(\t){3}//'

I could create a function and stick it in my .bashrc, but surely a command for this already exists (or potentially a cleaner solution)?

Chris Seymour

9 Answers

208

You can use pr to do this, using the -m flag to merge the files, one per column, and -t to omit headers, e.g.:

pr -m -t one.txt two.txt

outputs:

apple                               The quick brown fox..
pear                                foo
longer line than the last two       bar
last line                           linux

                                    skipped a line


Jay Taylor
Hasturkun
  • 19
    Perfect! Knew something would exist, never heard of `pr` before. I tried with 3 files and the output was truncated but the `-w` option solved that. Nice answer. – Chris Seymour Nov 12 '12 at 12:54
  • 5
    @sudo_o: Happy to help, coreutils is full of gems – Hasturkun Nov 12 '12 at 13:05
  • 1
    Is there a way for pr to auto-detect screen width? – Matt Apr 11 '14 at 18:59
  • 2
    @Matt: You could use `$COLUMNS`, which should be provided by the shell. – Hasturkun Aug 11 '14 at 15:38
  • 1
    When used to print two files side by side, `pr` cuts the end of long lines. Is there a way to make it wrap the lines? – molnarg Apr 08 '15 at 13:23
  • @molnarg: One method is to first fold each input file to half the page width, less any column-separator width. EG, in `bash`: `pr -t -m -w $PAGE_WIDTH <(fold -w $HALF_PAGE_WIDTH one.txt) <(fold -w $HALF_PAGE_WIDTH two.txt)`. This could be solved elegantly in user-created script. – crw Jun 23 '17 at 12:21
  • Is there a way to show 1 line from the first file and 10 lines from the second file? It's machine translation output that I want to view side by side. – Ashutosh Baheti Jan 30 '18 at 17:30
  • `pr -mtJ ...` to use all available width. – Lichtgestalt Jul 17 '20 at 09:20
  • can I apply this for output of two processes instead of a file? – alper Nov 06 '20 at 16:12
  • Thanks for the hints here. I've used it in a bash script: `pr -m -t -w$((2 * $(cat $1 $2|wc -L))) $1 $2` – Graham Toal Apr 02 '22 at 19:08
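The folding idea from crw's comment above can be sketched as a small shell function. The name side_by_side_wrapped, the default width of 80, and the two-character margin are assumptions made for illustration; none of them come from pr or from the answer itself. Temporary files are used in place of process substitution so the sketch should also run in plain sh.

```shell
# side_by_side_wrapped FILE1 FILE2 [WIDTH]
# Hypothetical helper: wraps long lines with fold before merging them with
# pr, so that pr has nothing left to truncate.
side_by_side_wrapped() (
    width=${3:-80}                # total page width (assumed default: 80)
    half=$(( width / 2 - 2 ))     # per-column width, with a small margin
                                  # left for pr's column separator
    tmp=$(mktemp -d) || exit 1
    fold -w "$half" "$1" > "$tmp/left"
    fold -w "$half" "$2" > "$tmp/right"
    status=0
    pr -m -t -w "$width" "$tmp/left" "$tmp/right" || status=$?
    rm -rf "$tmp"
    exit "$status"
)

# Usage: side_by_side_wrapped one.txt two.txt "$COLUMNS"
```

Because fold breaks at a fixed column rather than at word boundaries, words may be split; fold -s breaks at blanks instead, if that reads better for your input.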
44

To expand a bit on @Hasturkun's answer: by default pr uses only 72 columns for its output, but it's relatively easy to make it use all available columns of your terminal window:

pr -w $COLUMNS -m -t one.txt two.txt

Most shells will store (and update) your terminal's screenwidth in the $COLUMNS shell variable, so we're just passing that value on to pr to use for its output's width setting.

This also answers @Matt's question:

Is there a way for pr to auto-detect screen width?

So, no: pr itself can't detect the screenwidth, but we're helping it out a bit by passing in the terminal's width via its -w option.

Note that $COLUMNS is a shell variable, not an environment variable, so it isn't exported to child processes. The above approach will therefore likely not work in scripts, only in interactive TTYs. See "LINES and COLUMNS environmental variables lost in a script" for alternative approaches.
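One possible way to cope with a missing $COLUMNS inside a script is to fall back to tput cols, and then to a fixed width. This is only a sketch: the function name term_width and the fallback of 80 columns are assumptions, not something pr or the shell provides.

```shell
# term_width: hypothetical helper that prints a usable column count.
term_width() {
    if [ -n "${COLUMNS:-}" ]; then
        # The shell (or the caller) already knows the width
        echo "$COLUMNS"
    elif cols=$(tput cols 2>/dev/null) && [ -n "$cols" ]; then
        # Ask the terminal database / terminal driver
        echo "$cols"
    else
        # Assumed fallback when no terminal information is available
        echo 80
    fi
}

# Usage: pr -w "$(term_width)" -m -t one.txt two.txt
```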

pvandenberk
9

If you know the input files have no tabs, then using expand simplifies @oyss's answer:

paste one.txt two.txt | expand --tabs=50

If there could be tabs in the input files, you can always expand first:

paste <(expand one.txt) <(expand two.txt) | expand --tabs=50
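If a fixed tab stop of 50 is too wide or too narrow, the tab stop can be derived from the widest line of the first file. The following is a minimal sketch under that assumption; paste_aligned and its default gap of 4 spaces are hypothetical names and values chosen for illustration, and a temporary file replaces process substitution so it should also run in plain sh.

```shell
# paste_aligned FILE1 FILE2 [GAP]
# Hypothetical helper: like "paste | expand", but the tab stop is computed
# from the widest line of FILE1 instead of being fixed at 50.
paste_aligned() (
    gap=${3:-4}     # spaces after the widest line (assumed default; must be >= 1)
    tmp=$(mktemp) || exit 1
    expand "$2" > "$tmp"    # pre-expand any tabs in the second file
    # Widest line of the first file, measured after tab expansion
    width=$(expand "$1" |
        awk '{ if (length($0) > w) w = length($0) } END { print w + 0 }')
    status=0
    expand "$1" | paste - "$tmp" | expand -t "$(( width + gap ))" || status=$?
    rm -f "$tmp"
    exit "$status"
)

# Usage: paste_aligned one.txt two.txt
```

The gap has to be at least 1: with a gap of 0 the widest line would end exactly on the tab stop, and its tab would jump to the next stop instead.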
Bob
6
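paste one.txt two.txt | awk -F'\t' '
    {   # remember the widest first-column entry and buffer both columns
        if (length($1) > max1) max1 = length($1)
        col1[NR] = $1; col2[NR] = $2
    }
    END {   # pad column 1 to that width, then print column 2
        for (i = 1; i <= NR; i++)
            printf("%-*s     %s\n", max1, col1[i], col2[i])
    }'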

Using * in a format specification allows you to supply the field length dynamically.
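The same `*` width specifier is also accepted by bash's printf builtin, which makes the idea easy to try outside awk. The function name print_padded and the `|` separator below are made up for this illustration.

```shell
# print_padded WIDTH LEFT RIGHT
# Hypothetical helper: pads LEFT to WIDTH columns (chosen at run time via
# the * in the format), then prints a separator and RIGHT.
print_padded() {
    printf '%-*s|%s\n' "$1" "$2" "$3"
}

print_padded 10 apple foo     # prints "apple     |foo"
print_padded 6 apple foo      # prints "apple |foo"
```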

Barmar
4

If you want to know the actual difference between two files side by side, use diff -y:

diff -y file1.cf file2.cf

You can also set an output width using the -W, --width=NUM option:

diff -y -W 150 file1.cf file2.cf

and to make diff's column output fit your current terminal window:

diff -y -W $COLUMNS file1.cf file2.cf
Alex C
Shrey
3

There is a sed way:

f1width=$(wc -L <one.txt)
f1blank="$(printf "%${f1width}s" "")"
paste one.txt two.txt |
    sed "
        s/^\(.*\)\t/\1$f1blank\t/;
        s/^\(.\{$f1width\}\) *\t/\1 /;
    "

Under bash, you could use printf -v:

f1width=$(wc -L <one.txt)
printf -v f1blank "%${f1width}s"
paste one.txt two.txt |
    sed "s/^\(.*\)\t/\1$f1blank\t/;
         s/^\(.\{$f1width\}\) *\t/\1 /;"

(Of course, @Hasturkun's pr solution is the most accurate!)

Advantage of sed over pr

You can finely control the separation width and/or the separators:

f1width=$(wc -L <one.txt)
(( f1width += 4 ))         # Adding 4 spaces
printf -v f1blank "%${f1width}s"
paste one.txt two.txt |
    sed "s/^\(.*\)\t/\1$f1blank\t/;
         s/^\(.\{$f1width\}\) *\t/\1 /;"

Or, for example, to mark lines containing the word "line":

f1width=$(wc -L <one.txt)
printf -v f1blank "%${f1width}s"
paste one.txt two.txt |
    sed "s/^\(.*\)\t/\1$f1blank\t/;
         /line/{s/^\(.\{$f1width\}\) *\t/\1 |ln| /;ba};
         s/^\(.\{$f1width\}\) *\t/\1 |  | /;:a"

will render:

apple                         |  | The quick brown fox..
pear                          |  | foo
longer line than the last two |ln| bar 
last line                     |ln| linux
                              |  | 
                              |ln| skipped a line
F. Hauri - Give Up GitHub
2

Removing the dynamic field-length calculation from Barmar's answer makes the command much shorter, but you still need at least one small script to do the work, whichever method you choose.

paste one.txt two.txt |awk -F'\t' '{printf("%-50s %s\n",$1,$2)}'
oyss
2

Below is a Python-based solution (Python 3).

import sys
from itertools import zip_longest

# Number of spaces between the columns
S = 4

# Read both files, dropping the line terminators
with open(sys.argv[1]) as f:
    l0 = f.read().splitlines()
with open(sys.argv[2]) as f:
    l1 = f.read().splitlines()

# Length of the longest line of the first file (0 if the file is empty)
n = max(map(len, l0), default=0)

# Print the lines; the shorter file is padded with empty strings
for left, right in zip_longest(l0, l1, fillvalue=''):
    if right:
        # Pad the left column to the common width, then append the right one
        print(left.ljust(n + S) + right)
    else:
        # Nothing on the right: print the left line without trailing padding
        print(left)

Example

apple                            The quick brown fox..
pear                             foo
longer line than the last two    bar 
last line                        linux

                                 skipped a line
funk
0
diff -y <file1> <file2>


[root /]# cat /one.txt
apple
pear
longer line than the last two
last line
[root /]# cat /two.txt
The quick brown fox..
foo
bar
linux
[root@RHEL6-64 /]# diff -y one.txt two.txt
apple                                                         | The quick brown fox..
pear                                                          | foo
longer line than the last two                                 | bar
last line                                                     | linux
Nizam
Vikas Jain