I previously asked the reverse of this question, and it was answered here: "how to exclude printing in bash a line that has no sub directory path compared to other similar paths?"

In this question I want the reverse: the opposite of what the command below does.

cat file | sort | tac |awk '{sub("/[^/]+$","")} index(prev,$0"/") != 1{print} {prev=$0}'

where file actually represents the output of a find command:

find src/services/ -type f -name 'BUILD.bazel'

and so contains:

$ cat file
src/services/sam-agent/BUILD.bazel
src/services/sam-agent/auth/BUILD.bazel
src/services/sam-agent/certs/BUILD.bazel
src/services/sam-agent/server/BUILD.bazel
src/services/jam-controller/BUILD.bazel
src/services/jam-controller/api/BUILD.bazel
src/services/jam-controller/client/BUILD.bazel
src/services/wam-controller/api/BUILD.bazel
src/services/wam-controller/api/client/BUILD.bazel
src/services/wam-controller/api/server/BUILD.bazel

The desired command when run on the above input should output

src/services/sam-agent
src/services/jam-controller
src/services/wam-controller

i.e. the expected output contains only the parent directories and no subdirectories:

src/services/sam-agent #< should be printed
src/services/sam-agent/auth
src/services/sam-agent/certs
src/services/sam-agent/server
src/services/jam-controller #< should be printed
src/services/jam-controller/api
src/services/jam-controller/client
src/services/wam-controller/api #< should be printed
src/services/wam-controller/api/client
src/services/wam-controller/api/server

I tried answer 1 and got the output below. It still shows a subdirectory instead of just src/services/rams:

cat /tmp/t.log | sort | tac | awk '{if ($0 in seen) print; else {sub("/[^/]+$", ""); seen[$0]}}'

src/services/rams/store/postgres
src/services/rams

Sample t.log:

src/services/rams
src/services/rams/integration
src/services/rams/keys
src/services/rams/migrations
src/services/rams/mocks
src/services/rams/server
src/services/rams/smoke
src/services/rams/store
src/services/rams/store/postgres
src/services/rams/store/postgres/mocks
src/services/rams/tools/keyrotate
src/services/rams/tools/secretmanager
src/services/rams/vault
Ed Morton
AhmFM
  • Your previous question and this one were both missing the actual input you have (the output of your `find` command) so I edited this question to add it. Feel free to edit it again if you think it could be explained better. – Ed Morton Apr 19 '23 at 11:52
  • Shouldn't `src/services/wam-controller/` in your expected output be `src/services/wam-controller/api`? – Ed Morton Apr 19 '23 at 11:55
  • Why would the output for `t.log` only be `src/services/rams` instead of `src/services/rams/store`, `src/services/rams`, and `src/services`? – Ed Morton Apr 19 '23 at 11:56

4 Answers

Would you please try the following:

cat file | sort | awk 'index($0, prev"/") != 1 {print; prev = $0}'

Replace cat file with find src/services/ -type f -name 'BUILD.bazel'.

  • The input is sorted so that each parent path appears before its children.
  • index($0, prev"/") returns 1 when $0 begins with prev"/", meaning $0 is a child of prev. Otherwise, $0 is printed as a parent.
  • The trailing / in prev"/" anchors the match at a path boundary, so a sibling that merely shares a name prefix with prev is not mistaken for a child.
  • The variable prev is updated only when $0 is a parent.
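
As a quick check of the logic above, here is a minimal sketch that runs the same awk filter on a shortened version of the t.log sample from the question. The function name top_level_dirs and the use of LC_ALL=C (to make the sort order predictable) are assumptions of this sketch, not part of the original answer.

```shell
#!/bin/bash
# Hypothetical helper: reads directory paths on stdin and prints only those
# that are not nested under the most recently printed path.
top_level_dirs() {
    # LC_ALL=C is an assumption: byte-order sorting puts a parent
    # before its children for these sample paths.
    LC_ALL=C sort | awk 'index($0, prev "/") != 1 { print; prev = $0 }'
}

# Shortened sample taken from the t.log listing in the question.
printf '%s\n' \
    src/services/rams/store/postgres \
    src/services/rams/store \
    src/services/rams \
    src/services/rams/vault |
    top_level_dirs
```

For this input the filter should print only src/services/rams, because every other line begins with src/services/rams/.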
tshiono
  • Thanks, I tried it but got a subdirectory again. I am posting the result of your solution in my question. – AhmFM Apr 19 '23 at 02:33
  • Thank you for the feedback. I may not have properly considered the case of subdirectories at multiple depths. Would you please try the fixed code? BR. – tshiono Apr 19 '23 at 04:32

Just change != to == in your existing command:

$ cat file | sort | tac | awk '{sub("/[^/]+$","")} index(prev,$0"/") == 1{print} {prev=$0}'
src/services/wam-controller/api
src/services/sam-agent
src/services/jam-controller

$ cat t.log | sort | tac | awk '{sub("/[^/]+$","")} index(prev,$0"/") == 1{print} {prev=$0}'
src/services/rams/store
src/services/rams
src/services
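
To make the behaviour of the == variant easier to follow, here is a minimal annotated sketch of the same pipeline. The function name parents_with_children, the use of sort -r in place of sort | tac, and the LC_ALL=C setting are assumptions of this sketch rather than part of the answer above.

```shell
#!/bin/bash
# Hypothetical helper: reads file paths on stdin and prints each directory
# whose line directly follows (in reverse-sorted order) a path inside it.
parents_with_children() {
    # LC_ALL=C is an assumption: with byte-wise ordering the uppercase name
    # BUILD.bazel sorts before lowercase subdirectory names, so after
    # reversing, a directory's own file comes after its children's files.
    LC_ALL=C sort -r |
        awk '{ sub("/[^/]+$", "") }              # strip the file name
             index(prev, $0 "/") == 1 { print }  # previous dir lies below this one
             { prev = $0 }'
}

# The find output shown in the question.
printf '%s\n' \
    src/services/sam-agent/BUILD.bazel \
    src/services/sam-agent/auth/BUILD.bazel \
    src/services/sam-agent/certs/BUILD.bazel \
    src/services/sam-agent/server/BUILD.bazel \
    src/services/jam-controller/BUILD.bazel \
    src/services/jam-controller/api/BUILD.bazel \
    src/services/jam-controller/client/BUILD.bazel \
    src/services/wam-controller/api/BUILD.bazel \
    src/services/wam-controller/api/client/BUILD.bazel \
    src/services/wam-controller/api/server/BUILD.bazel |
    parents_with_children
```

With this input the sketch should print the same three directories as the first run above. Note that the result depends on the sort order, which is why the locale is pinned here.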
Ed Morton

Given this directory tree:

tree ../src
../src
└── services
    ├── jam-controller
    │   ├── BUILD.bazel
    │   ├── api
    │   │   └── BUILD.bazel
    │   └── client
    │       └── BUILD.bazel
    ├── sam-agent
    │   ├── BUILD.bazel
    │   ├── auth
    │   │   └── BUILD.bazel
    │   ├── certs
    │   │   └── BUILD.bazel
    │   └── server
    │       └── BUILD.bazel
    └── wam-controller
        └── api
            ├── BUILD.bazel
            ├── client
            │   └── BUILD.bazel
            └── server
                └── BUILD.bazel

13 directories, 10 files

You can list the directories that contain a BUILD.bazel file and also have at least one subdirectory directly in Bash (no need for find):

#!/bin/bash

shopt -s globstar

for fn in src/**/BUILD.bazel; do
    parent="$(dirname "$fn")"
    ls "$parent"/*/ &>/dev/null && echo "$parent"
done   

Prints:

src/services/jam-controller
src/services/sam-agent
src/services/wam-controller/api
dawg

If you use GNU find you can try:

$ find src/services/* -type f -name 'BUILD.bazel' -printf '%H\n' | sort | uniq
src/services/jam-controller
src/services/sam-agent
src/services/wam-controller

Add the -mindepth 1 option if there can be a src/services/BUILD.bazel file and you want to skip it:

find src/services/* -mindepth 1 -type f -name 'BUILD.bazel' -printf '%H\n' | sort | uniq
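
The commands above rely on the %H directive of GNU find's -printf, which prints the starting point under which a file was found. Because the shell expands src/services/* into one starting point per service directory, every match reports its service directory. The following is a minimal sketch on a throwaway tree, assuming GNU find is available; the function name service_roots and the temporary directory layout are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prints each entry directly under "$1" that contains
# a BUILD.bazel file at any depth. Requires GNU find for -printf.
service_roots() {
    # "$1"/* expands to one starting point per entry; %H prints that
    # starting point for every match, and sort -u removes the repeats.
    find "$1"/* -type f -name 'BUILD.bazel' -printf '%H\n' | LC_ALL=C sort -u
}

# Throwaway tree mirroring part of the layout in the question.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/services/sam-agent/auth" \
         "$tmp/src/services/wam-controller/api/client"
touch "$tmp/src/services/sam-agent/BUILD.bazel" \
      "$tmp/src/services/sam-agent/auth/BUILD.bazel" \
      "$tmp/src/services/wam-controller/api/BUILD.bazel" \
      "$tmp/src/services/wam-controller/api/client/BUILD.bazel"

(cd "$tmp" && service_roots src/services)
rm -rf "$tmp"
```

On this tree the sketch should print src/services/sam-agent and src/services/wam-controller, that is, the top-level service directory rather than src/services/wam-controller/api.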
Renaud Pacalet