
Team, I need to print only the paths that include a subdirectory and skip the lines that contain just the parent directory path.

Example output:

src/services/sam-agent #< should not be printed
src/services/sam-agent/auth
src/services/sam-agent/certs
src/services/sam-agent/server
src/services/jam-controller #< should not be printed
src/services/jam-controller/api
src/services/jam-controller/client
src/services/wam-controller/api #< should not be printed
src/services/wam-controller/api/client
src/services/wam-controller/api/server


Expected output:

src/services/sam-agent/auth
src/services/sam-agent/certs
src/services/sam-agent/server
src/services/jam-controller/api
src/services/jam-controller/client

I got the example output with the expression below, and I am not sure what modifications I can make to get the expected output.

find src/services/ -type f -name 'BUILD.bazel' | sed -r 's|/[^/]+$||' |sort |uniq|grep -e '*-controller'

Wiktor Stribiżew
AhmFM
  • Does this answer your question? [Use GNU find to show only the leaf directories](https://stackoverflow.com/questions/4269798/use-gnu-find-to-show-only-the-leaf-directories) – yut23 Apr 18 '23 at 00:24
  • Ah, I just noticed you're searching for BUILD.bazel files and getting their directories, so that probably won't work. – yut23 Apr 18 '23 at 00:29
  • This looks to me more like a general programming question, i.e. to find an algorithm. Here a starting point: Read the input line by line. Make sure that in your loop you always have the current line **and** the previous line in a variable. If the previous line is a prefix of the current line, go to the next iteration. Otherwise print the previous line. – user1934428 Apr 18 '23 at 05:59
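
The previous-line/current-line approach described in the last comment could be sketched roughly as follows. This is only an illustration: `drop_parents` is a name invented here, and it assumes the input is a sorted list of directory paths in which each parent is immediately followed by its children.

```shell
# Hypothetical helper (not from the thread): print a line only when the
# following line is not one of its subdirectories.
drop_parents() {
    awk '
        NR > 1 && index($0, prev "/") != 1 { print prev }  # prev is not a prefix of the current line
        { prev = $0 }                                      # keep the current line for the next round
        END { if (NR) print prev }                         # the last line can never be a parent
    '
}

printf '%s\n' \
    src/services/sam-agent \
    src/services/sam-agent/auth \
    src/services/sam-agent/certs |
drop_parents
```

Because only adjacent lines are compared, a sibling that sorts between a parent and its children (for example `a-b` between `a` and `a/c`) would defeat this sketch.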

4 Answers

2

find has a flexible syntax which allows you to articulate many constraints on what you want to have printed.

Your requirements are unclear, but reading from your expected results, I would guess something like

find src/services/ -mindepth 2 -type f -path '*-controller/*' -name 'BUILD.bazel' -exec dirname {} \;

If you have GNU find (as you often will on e.g. Linux) you can replace the last -exec part with

... -printf '%h\n'

which uses an internal function of GNU find instead of running an external process on each found path, which is generally a useful improvement.

(You could reduce the number of external processes with a more complex -exec but I'm guessing you don't really need to optimize this.)
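
One possible shape for such a "more complex -exec", shown as a sketch rather than a recommendation: `-exec ... {} +` hands many paths to a single `sh`, and the `${f%/*}` expansion strips the file name inside the shell. The temporary tree and the `demo` and `dirs` variables are made up for the demonstration.

```shell
# Hypothetical demo tree, created only so the command below has something to find.
demo=$(mktemp -d)
mkdir -p "$demo/jam-controller/api" "$demo/jam-controller/client"
touch "$demo/jam-controller/api/BUILD.bazel" "$demo/jam-controller/client/BUILD.bazel"

# One sh process handles a whole batch of paths; ${f%/*} removes the last
# path component in the shell itself, so no dirname process is started.
dirs=$(find "$demo" -type f -name 'BUILD.bazel' \
    -exec sh -c 'for f; do printf "%s\n" "${f%/*}"; done' sh {} + | sort)
printf '%s\n' "$dirs"
```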

tripleee
1

Like this:

num=3
<INPUT> | awk -v num="$num" -F/ 'NF>num'

-v allows you to pass variables to awk as key=value

LESS='+/^ +-v' man awk

-v var=val
--assign var=val
Assign the value val to the variable var, before execution of the program begins. Such variable values are available to the BEGIN rule of an AWK program.
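
To see why `NF>num` filters by depth, it may help to print the field count awk sees when `/` is the separator. The two sample paths are taken from the question; the `counts` variable is only there for the demonstration.

```shell
# Show how many "/"-separated fields each path has.
counts=$(printf '%s\n' \
    src/services/sam-agent \
    src/services/sam-agent/auth |
    awk -F/ '{ print NF, $0 }')
printf '%s\n' "$counts"
# 3 src/services/sam-agent
# 4 src/services/sam-agent/auth
```

With num=3, only the second path satisfies `NF>num`, so the parent-only line is dropped.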

Gilles Quénot
  • thanks super quick. well, if parent has 4 sub dirs and not 3 always? hope i am not confusing. – AhmFM Apr 18 '23 at 00:22
  • I answered exactly to your needs. If you have another question, feel free to create another thread. You can modify 3 with the level you need. Usually, the way to thanks here is by accepting/voting up answer(s). – Gilles Quénot Apr 18 '23 at 00:32
  • indeed your answer works but will wait if someone can provide a dynamic answer that takes care of all cases. so will upvote for now. – AhmFM Apr 18 '23 at 00:34
  • am evaluating .. will update, unable to find on google what `vmax` is? or if you can explain your params would be helpful. – AhmFM Apr 18 '23 at 00:41
  • i am unable to vote your answer. i accepted below solution as it seems dynamic to me. let me know how can i give you 1 vote? – AhmFM Apr 19 '23 at 00:11
  • no when i do upvote. it shows me -1 and then back to 0 – AhmFM Apr 19 '23 at 00:19
  • i mentioned above right? when i upvote, it shows me -1 then back to 0. – AhmFM Apr 19 '23 at 00:41
  • btw can you answer this? https://stackoverflow.com/questions/76049937/how-to-exclude-printing-in-bash-a-line-that-has-only-parent-path-and-exclude-all but it has to be similar to used command and just reverse of it. – AhmFM Apr 19 '23 at 00:42
1

This might be what you want:

$ cat file | sort | tac | awk '{sub("/[^/]+$","")} index(prev,$0"/") != 1{print} {prev=$0}'
src/services/wam-controller/api/server
src/services/wam-controller/api/client
src/services/sam-agent/server
src/services/sam-agent/certs
src/services/sam-agent/auth
src/services/jam-controller/client
src/services/jam-controller/api

In your case, replace cat file with find src/services/ -type f -name 'BUILD.bazel', which produces input like this:

$ cat file
src/services/sam-agent/BUILD.bazel
src/services/sam-agent/auth/BUILD.bazel
src/services/sam-agent/certs/BUILD.bazel
src/services/sam-agent/server/BUILD.bazel
src/services/jam-controller/BUILD.bazel
src/services/jam-controller/api/BUILD.bazel
src/services/jam-controller/client/BUILD.bazel
src/services/wam-controller/api/BUILD.bazel
src/services/wam-controller/api/client/BUILD.bazel
src/services/wam-controller/api/server/BUILD.bazel

The above assumes that none of your directory/file names contain newlines.
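
For readers who want the pipeline spelled out, here is the same idea with one comment per stage. This is a sketch: the function name `leaf_build_dirs` is invented, and `LC_ALL=C` is an addition made here so that the ordering is byte-wise and repeatable. It still relies on each parent landing directly after one of its children once the list is reversed.

```shell
# Hypothetical wrapper around the pipeline above; reads file paths on stdin.
leaf_build_dirs() {
    LC_ALL=C sort |   # keep the files of each directory tree next to each other
    tac |             # reverse, so deeper paths come before their parents
    awk '
        { sub("/[^/]+$", "") }              # strip the file name, leaving the directory
        index(prev, $0 "/") != 1 { print }  # print unless the previous directory lies under this one
        { prev = $0 }                       # remember this directory for the next line
    '
}

printf '%s\n' \
    src/services/sam-agent/BUILD.bazel \
    src/services/sam-agent/auth/BUILD.bazel \
    src/services/sam-agent/certs/BUILD.bazel |
leaf_build_dirs
```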

Ed Morton
  • what is an example of newline? i did not get it. – AhmFM Apr 18 '23 at 22:57
  • `> $'foo\nbar'` will create a file whose name contains a newline. – Ed Morton Apr 18 '23 at 23:00
  • can you please explain all your params? – AhmFM Apr 19 '23 at 00:16
  • also, can you please help me get the reverse of this output? like parent dir should be printed instead? – AhmFM Apr 19 '23 at 00:27
  • asked here https://stackoverflow.com/questions/76049937/how-to-exclude-printing-in-bash-a-line-that-has-only-parent-path-and-exclude-all – AhmFM Apr 19 '23 at 00:37
  • Regarding [can you please explain all your params?](https://stackoverflow.com/questions/76040169/how-to-exclude-printing-in-bash-a-line-that-has-no-sub-directory-path-compared-t/76044394?noredirect=1#comment134124703_76044394) - I think the code is pretty self-explanatory if you execute it one step at a time from left to right to see what each command does. Is there any particular part of it that isn't clear? – Ed Morton Apr 19 '23 at 11:38
1

Here's how you can extract the directory paths that are not part of a longer one from a list of file paths with awk:

find some/path -type f |
awk -F / '
    {
        nf = (type == "d" ? NF : NF-1)
        dir = sep = ""
        for (i = 1; i <= nf; i++) {
            dir = dir sep $i
            sep = FS
            ++arr[dir]
        }
        --arr[dir]
    }
    END {
        for (dir in arr)
            if (arr[dir] == 0)
                print dir
    }
'

Note: by default it processes file paths, but you can make it accept directory paths as input by adding the parameter -v type=d to awk.
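
As an illustration of the `-v type=d` mode, the awk program above can be wrapped in a shell function and fed directory paths directly. The wrapper name `leaf_dirs` and its single argument are assumptions made for this sketch; the awk body is unchanged apart from the comments.

```shell
# "leaf_dirs" is a name invented for this sketch; pass "d" for directory input,
# or nothing for file paths.
leaf_dirs() {
    awk -F / -v type="$1" '
        {
            nf = (type == "d" ? NF : NF-1)   # with type=d the last field is a directory too
            dir = sep = ""
            for (i = 1; i <= nf; i++) {
                dir = dir sep $i
                sep = FS
                ++arr[dir]                   # count every prefix of the path
            }
            --arr[dir]                       # the deepest directory does not count itself
        }
        END {
            for (dir in arr)
                if (arr[dir] == 0)           # zero means every path through it ended here
                    print dir
        }
    '
}

printf '%s\n' \
    src/services/sam-agent \
    src/services/sam-agent/auth \
    src/services/sam-agent/certs |
leaf_dirs d | sort
```

The trailing `sort` is there because the order of `for (dir in arr)` is unspecified in awk.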

The solution is 100% generic so it should yield the expected result when processing your find command:

find src/services/ -type f -name 'BUILD.bazel' | awk ... | sort
src/services/jam-controller/api
src/services/jam-controller/client
src/services/sam-agent/auth
src/services/sam-agent/certs
src/services/sam-agent/server
src/services/wam-controller/api/client
src/services/wam-controller/api/server

Now, as @tripleee suggested, if you want to exclude the wam-controller directory from the output then you can use other predicates in the find command:

find src/services/ ! -path '*/wam-controller/*' -type f -name 'BUILD.bazel' | ...
Fravadona