I'm trying to compose a sed command that removes all trailing extensions from file names, including names that have more than one in sequence separated by '.', e.g.:
/a/b/c.gz -> /a/b/c
/a/b/c.tar.gz -> /a/b/c rather than /a/b/c.tar
Notice that only the filename should be truncated; dots on parent directories are to be preserved.
/a/b.c/d.tar.gz -> /a/b.c/d
never
/a/b.c/d.tar, /a/b or /a/b/d
Therefore, simply removing everything after the first '.' is not a solution.
I have a command that works OK as long as there is at least one '/' in the file name (or rather, the path). I'm not sure how to enhance it so that it also covers the single-element (filename only) case:
sed 's/^\(.*\/[^.\/]*\)[^\/]*$/\1/' list_of_filepaths.txt \
> output_filepaths_wo_extensions.txt
So, the command above does the right thing with:
./abc.tar.gz, parent/.../abc.tar.gz, /abc.tar.gz
It does not work for the single-element (filename only) case:
abc.tar.gz
Of course, this is not surprising since it isn't matching the slash '/' anywhere.
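To make the behaviour easy to reproduce, here is a small sketch that feeds the sample paths through the command above. The function name `current_cmd` is only a hypothetical wrapper for illustration, and the input comes from `printf` rather than from list_of_filepaths.txt.

```shell
# Hypothetical wrapper around the command from the question; reads stdin.
current_cmd() {
    sed 's/^\(.*\/[^.\/]*\)[^\/]*$/\1/'
}

# The first three paths contain a '/', so the pattern matches and the
# extensions are stripped. The last one has no '/', so the pattern does
# not match and the line is printed unchanged.
printf '%s\n' ./abc.tar.gz parent/.../abc.tar.gz /abc.tar.gz abc.tar.gz | current_cmd
# ./abc
# parent/.../abc
# /abc
# abc.tar.gz
```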
Although adding a second sed command to deal with the '/'-free case is trivial, I would like to cover all cases with a single command, as it seems to me that it should be possible.
For example, I was hoping that this one would work, but it does not work for either case:
sed 's/^\(.*?\/\)?\([^.\/]*\)[^\/]*$/\1\2/'
So, in this attempt of mine, the first (additional) group would capture the optional prefix up to and including the last '/'. For a slash-free file path, that group would simply be empty.
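One likely reason the attempt fails: in sed's default (basic) regex syntax, an unescaped `?` is an ordinary character, and the non-greedy `.*?` is Perl syntax that sed does not support. Below is a hedged sketch of two possible single-command variants. The function names are hypothetical, and the second variant assumes a sed that accepts `-E` (GNU and BSD sed do). Alternative delimiters (`|` and `#`) are used so the slashes need no escaping.

```shell
# Variant 1 (basic regex): delete from the first '.' that is followed by no
# further '/' up to the end of the line. Dots in parent directories are always
# followed by a '/', so they are left alone.
strip_exts() {
    sed 's|\.[^/]*$||'
}

# Variant 2 (extended regex, closer to the attempt above): an optional
# prefix ending in '/', then the dot-free stem, then the discarded rest.
strip_exts_ere() {
    sed -E 's#^(.*/)?([^./]*)[^/]*$#\1\2#'
}

printf '%s\n' abc.tar.gz ./abc.tar.gz /a/b.c/d.tar.gz /a/b/c.gz | strip_exts
# abc
# ./abc
# /a/b.c/d
# /a/b/c
```

Both variants treat a leading dot as the start of an extension, so a hidden file such as `.bashrc` would be reduced to an empty name; that case would need separate handling if it matters.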