As an extension of this question, I would now like to have not only the filename but also the directories up to k positions back. Here's the problem:
I have directories named RUN1, RUN2, and RUN3. Each directory has some files. Directory RUN1 has files mod1_1.csv, mod1_2.csv, and mod1_3.csv. Directory RUN2 has files mod2_1.csv, mod2_2.csv, mod2_3.csv, etc.
The contents of mod1_1.csv look like this:
5.71 6.66 5.52 6.90
5.78 6.69 5.55 6.98
5.77 6.63 5.73 6.91
And mod1_2.csv looks like this:
5.73 6.43 5.76 6.57
5.79 6.20 5.10 7.01
5.71 6.21 5.34 6.81
In RUN2, mod2_1.csv looks like this:
5.72 6.29 5.39 5.59
5.71 6.10 5.10 7.34
5.70 6.23 5.23 6.45
And mod2_2.csv looks like this:
5.72 6.29 5.39 5.69
5.71 6.10 5.10 7.32
5.70 6.23 5.23 6.21
My goal is to find, for each RUN* directory, the line with the smallest value in column 4, and to write that line, together with the model file that produced it and part of its path, to a new .csv file. Right now, I have this code:
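#!/bin/bash
resultfile="best_results_mlp_onelayer.txt"
# Loop over every RUN* directory below the current directory.
for d in $(find . -type d -name 'RUN*' | sort); do
    # Append the filename to each line, sort on column 4 only, keep the smallest.
    find "$d" -type f -name 'mod*' -exec awk '{print $0, FILENAME}' {} \; | sort -k4,4g | head -n 1 >> "$resultfile"
done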
This gives me:
5.73 6.43 5.76 6.57 ./RUN_1/mod1_2.csv
5.72 6.29 5.39 5.59 ./RUN_2/mod2_1.csv
But I would like a .csv file with these contents:
5.73 6.43 5.76 6.57 ./DIR1/DIR2/DIR3/RUN_1/mod1_2.csv
5.72 6.29 5.39 5.59 ./DIR1/DIR2/DIR3/RUN_2/mod2_1.csv
where my pwd is /DIRk/DIRm/DIRl/DIR1/DIR2/DIR3
EDIT:
Based on a reply, what I mean by 'k positions back' is:
Right now, my code gives me ./RUN_1/mod1_2.csv as the last column value in the first row. To me, that is a path 'one position back', because it shows the directory where the file mod1_2.csv is located. I would like the path '4 positions back'. That is, I would like ./DIR1/DIR2/DIR3/RUN_1/mod1_2.csv. I said 'k' because that's a common placeholder, and I was hoping I could just substitute a number in there.
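To make 'k positions back' concrete, here is a minimal sketch of the behavior I am after. It is not a finished solution: the function name tail_path and its argument order are made up for illustration, and it assumes it is given an absolute path with no newlines in it. It keeps the last k directories plus the filename and prefixes the result with "./".

```shell
#!/bin/sh
# tail_path ABSOLUTE_PATH K   (hypothetical helper, for illustration only)
# Prints "./" followed by the last K directories and the filename.
tail_path() {
    printf '%s\n' "$1" | awk -F/ -v k="$2" '{
        # For an absolute path, field 1 is empty, so real components start at 2.
        start = NF - k
        if (start < 2) start = 2
        out = "."
        for (i = start; i <= NF; i++) out = out "/" $i
        print out
    }'
}

tail_path /DIRk/DIRm/DIRl/DIR1/DIR2/DIR3/RUN_1/mod1_2.csv 4
# prints ./DIR1/DIR2/DIR3/RUN_1/mod1_2.csv
```

With k set to 1 the same call gives ./RUN_1/mod1_2.csv, which is what my current code already produces. In my script, the absolute path could presumably be built from "$PWD" and the relative name that find reports, but I have not worked out that part.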