I have four files in my directory, say a.txt, b.txt, c.txt and d.txt. I would like to join every file with all the other files based on two common columns (i.e. join a.txt with b.txt, c.txt and d.txt; join b.txt with a.txt, c.txt and d.txt; join c.txt with a.txt, b.txt and d.txt; and so on). For two of the files I can do:

join -j 2 <(sort -k2 a.txt) <(sort -k2 b.txt) > a_b.txt

How do I write this as a loop over all files in the directory? I've tried the code below, but it does not work.

for i j in *; do join -j 2 <(sort -k2 $i) <(sort -k2 $j) > ${i_j}.txt

Any help/direction would be helpful! Thank you.

aram

1 Answer

This might be a way to do it:

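#!/bin/bash

# Snapshot the input files before the loop creates any output files.
files=( *.txt )

for i in "${files[@]}"; do
    for j in "${files[@]}"; do
        # Skip joining a file with itself.
        if [[ "$i" != "$j" ]]; then
            # Sort both inputs on field 2 only, then join on that field.
            join -j 2 <(sort -k2,2 "$i") <(sort -k2,2 "$j") > "${i%.*}_$j"
        fi
    done
done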
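
The loop relies on two bash features that are easy to get wrong: an array that captures the glob once, and suffix removal applied to `$i` only. Here is a minimal sketch of both, assuming a scratch directory with hypothetical files a.txt and b.txt (none of these names come from the question beyond the a/b naming):

```shell
#!/bin/bash
# Sketch of the two bash features the loop depends on.
# The scratch directory and file names are hypothetical.

cd "$(mktemp -d)" || exit 1
touch a.txt b.txt

files=( *.txt )              # the glob is expanded once, here
touch a_b.txt                # a file created later is not in the array

echo "${#files[@]}"          # number of elements captured: 2
echo "${files}"              # no index means element 0 only: a.txt

i=a.txt j=b.txt
echo "${i%.*}_$j"            # strip the shortest trailing .* from $i only: a_b.txt
echo "${i%.*_$j}"            # pattern .*_b.txt does not match the end of a.txt, so: a.txt
```

In the last line the whole of `.*_$j` is treated as the suffix pattern to remove, which is why moving the closing brace changes the result.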
MauricioRobayo
  • This works very well, thank you so much. Could you clarify 1) why you declare the files as an array and not just `for i in "${files}";do`, and 2) why the output file is named `"${i%.*}_$j"` and not `"${i%.*_$j}"`? I tried both changes, but only your method worked. – aram Jul 02 '17 at 04:40
  • 1) The array fixes the list of files to loop through in advance. Because we create files inside the loop, the second loop would pick up the newly created files if we just used `for i in *.txt` or `files=*.txt`. If you use `"${files}"` you only access the first element of the array; `"${files[@]}"` expands to all of them. 2) `"${i%.*}_$j"` uses bash substring removal to strip the `.txt` from `$i` only. Here are some examples: https://stackoverflow.com/q/16623835/2002514 – MauricioRobayo Jul 02 '17 at 10:30
  • You can optimize this a lot by sorting the files beforehand and looping through the already sorted files. As written, the script does a lot of extra work because it sorts the same files again and again as it loops through them. – MauricioRobayo Jul 02 '17 at 10:37
  • Thank you for your explanation, very clear. I will sort all files before looping through them. – aram Jul 03 '17 at 17:12
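
The pre-sorting idea from the comments could look roughly like the sketch below. It is not the answerer's code: the scratch directory, the sample file contents, the `sorted` directory and the `LC_ALL=C` setting are all assumptions made so the example runs on its own. It assumes whitespace-separated fields with the join key in field 2.

```shell
#!/bin/bash
# Sketch: sort every input once, then join the pre-sorted copies pairwise.
# Assumes whitespace-separated fields with the join key in field 2.
export LC_ALL=C   # assumption: force sort and join to use the same collation

# Demo data in a scratch directory (hypothetical contents).
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'a1 k2' 'a2 k1' > a.txt
printf '%s\n' 'b1 k1' 'b2 k2' > b.txt
printf '%s\n' 'c1 k2' 'c2 k3' > c.txt

files=( *.txt )            # snapshot the inputs before any output exists
sorted=$(mktemp -d)        # sorted copies live outside the working directory

# Pass 1: sort each file exactly once.
for f in "${files[@]}"; do
    sort -k2,2 "$f" > "$sorted/$f"
done

# Pass 2: join every ordered pair of distinct files.
for i in "${files[@]}"; do
    for j in "${files[@]}"; do
        [[ "$i" == "$j" ]] && continue
        join -j 2 "$sorted/$i" "$sorted/$j" > "${i%.*}_$j"
    done
done

rm -r "$sorted"
cat a_b.txt
```

With n files this runs `sort` n times instead of 2·n·(n−1) times. Keeping the sorted copies in a separate directory also means they never match `*.txt` in the working directory, so rerunning the script does not treat them as inputs.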