How to get the nth recent file in the nth last modified subdirectory using pipes

Question

I'm doing an exercise for an OS exam. It requires getting the 3rd recent file of the 2nd last modified subdirectory inside the current directory. Then I have to print its lines in reverse order. I cannot use the tac command. The text suggests using (other than awk and sed): head, tail, wc.

I've succeeded in getting the filename of the requested file (but in a too complex way, I think). Now I have to print it in reverse. I think I can use this awk solution: https://stackoverflow.com/a/744093/11614625.

This is how I'm getting the filename:

ls -t | head | awk '{system("test -d \"" $0 "\" && echo \"" $0 "\"")}' | awk 'NR==2 {system("ls \"" $0 "\" | head")}' | awk 'NR==1'

How can I do better? And what if the 3rd directory or 2nd file doesn't exist?

Also, using `system()` inside of awk is a shell scripting anti-pattern. Figure out something like `.... | awk 'NR==2' | while read dir ; do if [ -d "$dir" ] ; then echo "found dir=$dir"; fi ; done | ....` . Good luck. — shellter, Jun 07 '19 at 15:47
It can be done much more succinctly in sed, using the hold space. Is that enough of a hint? — Beta, Jun 07 '19 at 22:34

Ed Morton · Answer 1 · 2019-06-08T00:02:53.110

See https://mywiki.wooledge.org/ParsingLs for why you shouldn't parse the output of ls. Also, awk '{system("test -d \"" $0 "\" && echo \"" $0 "\"")}' has the shell call awk, which calls system(), which calls a shell, which calls test. That is clearly worse than just having the shell call test in the first place, if you were going to do that. Finally, any solution that reads the whole file into memory (as any sed or naive awk solution would) will fail for large files, as they'll exceed available memory.
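
As a rough illustration of having the shell call test directly, here is a minimal sketch. The function name list_subdirs is made up for this example, and it only shows the directory check, not the ordering by modification time:

```shell
# list_subdirs [DIR] (hypothetical helper name, for illustration only).
# Prints each immediate subdirectory of DIR (default: .), one per line.
list_subdirs() {
    for entry in "${1:-.}"/*; do
        # The shell runs test ([ ... ]) itself: no awk, no system().
        [ -d "$entry" ] && printf '%s\n' "$entry"
    done
    return 0
}
```

A glob like this skips hidden directories and follows symlinks to directories, so treat it as an illustration of the test call rather than a complete solution.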

Unfortunately this is how to do what you want robustly:

dir="$(find . -mindepth 1 -maxdepth 1 -type d -printf '%T+\t%p\0' |
       sort -rz |
       awk -v RS='\0' 'NR==2{sub(/[^\t]+\t/,""); print; exit}')" &&
file="$(find "$dir" -mindepth 1 -maxdepth 1 -type f -printf '%T+\t%p\0' |
       sort -rz |
       awk -v RS='\0' 'NR==3{sub(/[^\t]+\t/,""); print; exit}')" &&
cat -n "$file" | sort -rn | cut -f2-

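If any of the commands in any of the pipes fails, the error message from that command is printed. Note that && only sees the exit status of the last command in each pipe (the awk here), so the later commands are skipped, and the overall exit status is a failure, only when that last command fails. If you need to stop when the 2nd directory or 3rd file doesn't exist, also test that "$dir" and "$file" are non-empty.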
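
For the "what if it doesn't exist" part of the question, one option is to check for an empty result explicitly. The following is only a sketch under some assumptions: nth_newest is a made-up helper name, GNU find, sort and sed are assumed, and "recent" is taken to mean newest first for both directories and files.

```shell
# nth_newest TYPE N DIR (hypothetical helper; GNU find/sort/sed assumed).
# Prints the Nth most recently modified entry of TYPE (d or f) directly
# inside DIR. Returns non-zero with a message if there are fewer than N.
# Note: "result" is a plain global variable here, to stay within POSIX sh syntax.
nth_newest() {
    result=$(find "$3" -mindepth 1 -maxdepth 1 -type "$1" -printf '%T+\t%p\0' |
             sort -rz |
             sed -zn "$2"'{s/^[^\t]*\t//;p}' |
             tr -d '\0')
    if [ -z "$result" ]; then
        printf 'nth_newest: no entry %s of type %s in %s\n' "$2" "$1" "$3" >&2
        return 1
    fi
    printf '%s\n' "$result"
}

# Possible usage; the chain stops at the first missing directory or file:
# dir=$(nth_newest d 2 .) && file=$(nth_newest f 3 "$dir") &&
#     cat -n "$file" | sort -rn | cut -f2-
```

Here sed -z stands in for the awk step so that the NUL-separated records do not depend on which awk is installed; that swap is a choice made for this sketch, not something the approach requires.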

I used cat | sort | cut rather than awk or sed to print the file in reverse because awk (unless you write demand paging in it) or sed would have to read the whole file into memory at once, and so would fail for very large files. sort, on the other hand, is designed to handle large files: it pages to tmp files as necessary and keeps only parts of the file in memory at a time, so it's limited only by how much free disk space you have on your device.
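
To make the three steps of that pipeline easier to see, here it is wrapped in a small function. The name reverse_lines is made up for this sketch, and cat -n is assumed to be available (it is in GNU and BSD cat, but is not required by POSIX):

```shell
# reverse_lines FILE (hypothetical name): print FILE's lines last to first.
reverse_lines() {
    cat -n "$1" |   # decorate: prefix every line with "<line number><TAB>"
    sort -rn |      # sort numerically on that prefix, largest first
    cut -f2-        # undecorate: drop the number, keep the rest of the line
}
```

Because cut -f2- keeps everything after the first tab, lines that themselves contain tabs come through unchanged, and empty lines are preserved since cat -n numbers them too.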

The above requires GNU tools to provide and handle NUL-terminated records. If you don't have those, then change \0 to \n in the find commands, remove the z from the sort options, and remove -v RS='\0' from the awk commands, and be aware that the result will only work if your directory and file names don't contain newlines.
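
Applied to the directory step only, those substitutions would look roughly like this. The function name second_newest_dir is made up for the sketch, and note that -printf is itself a GNU find feature, so this still assumes GNU find; the file step changes in the same way.

```shell
# second_newest_dir [DIR] (hypothetical name): newline-delimited form of the
# directory step. Only safe when no directory name contains a newline.
second_newest_dir() {
    find "${1:-.}" -mindepth 1 -maxdepth 1 -type d -printf '%T+\t%p\n' |
    sort -r |
    awk 'NR==2{sub(/[^\t]+\t/,""); print; exit}'
}
```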

I wondered how you were going to pull that rabbit out the hat. Why can't you just `tac` file at the end instead of `cat -n .. | sort -rn | cut -f2-`? — David C. Rankin, Jun 08 '19 at 00:27
@DavidC.Rankin because the OP said in her question `I can not use tac command` — Ed Morton, Jun 08 '19 at 00:28
