Getting last element of a path (different from #10124314 as basename falls over)

Question

I need to process a couple of thousand PDF files, sorted alphabetically by filename, ideally from bash. As I see it, I need to walk a tree of files, stripping off the path as I go, and then do various grepping, sorting, etc.

Having seen an answer to a similar question I've tried doing a

tim@MERLIN:~/Documents/Scanned$ basename `find ./ -print`

but that gets messed up by some directory names which have spaces in them - e.g. there is one called General Letters which acts like a chicken-bone in the works and results in

basename: extra operand ‘Letters’
Try 'basename --help' for more information.

I can't see a way to get find to strip out the pathname and I would prefer to use find given its plethora of options to filter on age, size etc. Nor can I see any way to get basename to cope gracefully with spaces in this context.

I considered using cut, but I can't work out how to get cut to give me the last field by doing something like cut -d/ <whatever>. I'm sure there must be an easy way to do it: some sort of in-line sed or awk script?

I don't particularly want the buggeration of writing a perl/Python script to do it for me as I know I should be able to do it from the command line.

So any simple tips or suggestions?

Updated/Solved

Many thanks to Cyrus; the solution is

tim@MERLIN:~/Documents/Scanned$ find . -name '*.pdf' -printf '%f\n' | sort

Please update your question to provide a link to "answer #10124314". I think you mean [this question](http://stackoverflow.com/q/10124314/827263). — Keith Thompson, Jul 19 '14 at 18:56
Proper direction recursion along with `pushd` and `popd` is what you probably need here. Just my 2 cents. — konsolebox, Jul 19 '14 at 19:29

score 4 · Accepted Answer · answered Jul 19 '14 at 18:32

Try this:

find ./ -printf '%f\n'

%f: File's name with any leading directories removed (only the last element).
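As a rough illustration of why this avoids the basename problem, here is a minimal sketch run against a throwaway tree. The directory and file names are made-up sample data, and it assumes GNU find, since -printf is a GNU extension.

```shell
# Build a temporary tree that includes a directory name with a space.
# All names below are hypothetical sample data.
tmp=$(mktemp -d)
mkdir -p "$tmp/General Letters" "$tmp/Invoices"
touch "$tmp/General Letters/0002.pdf" "$tmp/Invoices/0001.pdf"

# %f prints only the last path element, one per line, so the space in
# "General Letters" never goes through the shell's word splitting.
names=$(find "$tmp" -type f -name '*.pdf' -printf '%f\n' | sort)
printf '%s\n' "$names"

# Clean up the sample tree.
rm -rf "$tmp"
```

The pattern is quoted as '*.pdf' so that find, rather than the shell, expands it.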

Cyrus

Perfect. That's exactly what I was after. `find . -name '*.pdf' -printf '%f\n' | sort` gets me pretty close to where I need to be. – TimGJ Jul 19 '14 at 21:28

score 1 · Answer 2 · answered Jul 19 '14 at 18:27

Here is a working solution using awk:

find ./ | awk -F'/' '{ print $NF }'

It simply uses / as the field delimiter and prints the last field of each line.

Or with grep:

find ./ | grep -oE "[^/]+$"
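To compare the two variants side by side, the sketch below feeds both the same input. The paths are hypothetical sample data supplied through printf instead of find, so the result does not depend on any real directory.

```shell
# Hypothetical sample paths, one per line, in the form find would print.
paths=$(printf '%s\n' './General Letters/0002.pdf' './Invoices/0001.pdf')

# awk: split each line on "/" and print the last field ($NF).
awk_names=$(printf '%s\n' "$paths" | awk -F'/' '{ print $NF }')

# grep: -o prints only the matched text, here the run of non-slash
# characters anchored at the end of the line.
grep_names=$(printf '%s\n' "$paths" | grep -oE '[^/]+$')

printf '%s\n' "$awk_names"
printf '%s\n' "$grep_names"
```

One difference shows up on a line that ends in a slash, such as the ./ that find ./ prints first: awk prints an empty line for it, while grep prints nothing.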

julienc

score 0 · Answer 3 · answered Jul 19 '14 at 18:32

Using sed:

find ./ | sed 's/.*\/\(.*\)$/\1/g'

Avinash Raj

score 0 · Answer 4 · answered Jul 19 '14 at 19:20

If you want a list of pathnames (recursively) but want to sort them by filename (not by pathname), you can use:

find . -printf '%f|%p\n' | sort -k 1 -t'|' | cut -d'|' -f2-

This needs GNU find (standard on Linux, but not the default on OS X).

Without GNU find, you can do the same with:

find . -print | sed 's:\(.*\)/\(.*\)$:\2\|\1/\2:' | sort -k 1 -t'|' | cut -d'|' -f2-

(Assuming there is no \n or | in the filenames.)
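The decorate, sort, undecorate idea behind both pipelines can be sketched on a small sample tree. The names are hypothetical, GNU find is assumed for -printf, and the sort key is written as -k1,1 here to restrict it to the filename field, which is a slight variation on the command above.

```shell
# Temporary tree where path order and filename order disagree.
# All names below are hypothetical sample data.
tmp=$(mktemp -d)
mkdir -p "$tmp/a" "$tmp/b"
touch "$tmp/a/0002.pdf" "$tmp/b/0001.pdf"

# Prefix each path with its filename as a sort key, sort on that key
# only, then cut the key off again. The cd keeps printed paths relative.
sorted=$(cd "$tmp" && find . -type f -printf '%f|%p\n' | sort -t'|' -k1,1 | cut -d'|' -f2-)
printf '%s\n' "$sorted"

# Clean up the sample tree.
rm -rf "$tmp"
```

Here ./b/0001.pdf is listed before ./a/0002.pdf, because the ordering follows the filename and not the directory.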

clt60

In this instance the path name doesn't actually matter: I am looking for "holes" in the numbering scheme of the documents (as I suspect some have been deleted or stored somewhere bizarre or never saved). But thanks for the steer. – TimGJ Jul 19 '14 at 21:34
