Extract substring after a character

Question

I'm trying to extract the substring after the last period (dot). Examples below:

echo "filename..txt" should return "txt"
echo "filename.txt." should return ""
echo "filename" should return ""
echo "filename.xml" should return "xml"

I tried the command below, but it works only if the dot occurs exactly once. My filename may contain zero or more dots.

echo "filename.txt" | cut -d "." -f2

So you want the file extension? You're using bash? Nothing easier than [Extract filename and extension in Bash](http://stackoverflow.com/questions/965053/extract-filename-and-extension-in-bash). (this wouldn't cover the `"filename" -> ""` case). — StephenKing, Dec 30 '15 at 22:47
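The parameter-expansion approach from the linked question can be adapted to cover the `"filename" -> ""` case by checking for a dot first. The following is only a sketch, assuming bash or another POSIX shell; the function name `ext_after_last_dot` is hypothetical and does not come from the thread.

```shell
# Hypothetical helper: print whatever follows the last dot, or an empty line.
ext_after_last_dot() {
    case "$1" in
        *.*) printf '%s\n' "${1##*.}" ;;   # strip the longest prefix ending in a dot
        *)   printf '\n' ;;                 # no dot at all: print an empty line
    esac
}

ext_after_last_dot "filename..txt"   # txt
ext_after_last_dot "filename.txt."   # (empty line)
ext_after_last_dot "filename"        # (empty line)
ext_after_last_dot "filename.xml"    # xml
```

This avoids starting an external process, which may matter when the check runs in a loop over many names.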

score 5 · Answer 1 · edited Jun 20 '20 at 09:12

Let's use awk!

awk -F"." '{print (NF>1)? $NF : ""}' file

This sets the field separator to . and prints the last field. If the line contains no dot, it prints an empty string.

Test

$ cat file
filename..txt
filename.txt.
filename
filename.xml
$ awk -F"." '{print (NF>1)? $NF : ""}' file
txt


xml
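
Since the awk program also reads standard input, it can probably be fed a single name through a pipe, as in the question. A minimal sketch, where the variable names `name` and `ext` are only illustrative:

```shell
# Run the same awk program on one name at a time and capture the result.
for name in "filename..txt" "filename.txt." "filename" "filename.xml"; do
    ext=$(printf '%s\n' "$name" | awk -F"." '{print (NF>1)? $NF : ""}')
    printf '%s -> "%s"\n' "$name" "$ext"
done
```

Command substitution drops the trailing newline, so an empty extension ends up as an empty string in `ext`.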

answered Dec 31 '15 at 22:33

fedorqui


ghoti · Answer 2 · 2016-01-02T03:53:42.090

One can make this portable (so it's not Linux-only), avoiding an ERE dependency, with the following:

$ sed -ne 's/.*\.//p' <<< "file..txt"
txt
$ sed -ne 's/.*\.//p' <<< "file.txt."

$ sed -ne 's/.*\.//p' <<< "file"
$ sed -ne 's/.*\.//p' <<< "file.xml"
xml

Note that for testing purposes, I'm using a "here-string" in bash. If your shell is not bash, use whatever your shell uses to feed data to sed.

The important bit here is sed's -n option, which suppresses the default printing of each line, combined with the substitute command's explicit p flag, which prints the line only after a successful substitution. A substitution can succeed only if the input contains a dot.

With this solution, the difference between "file.txt." and "file" is that the former prints the input line replaced with an empty string (so you may still get a newline, depending on your usage), whereas the latter prints nothing at all, because the input contains no dot and sed is never told to print. The end result may well be the same, of course:

$ printf "#%s#\n" $(sed -ne 's/.*\.//p' <<< "file.txt.")
##
$ printf "#%s#\n" $(sed -ne 's/.*\.//p' <<< "file")
##
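
To make the difference visible, one option is to count the bytes sed writes in each case. This is only an illustrative sketch; the variable names `with_dot` and `without_dot` are made up for the example.

```shell
# Count the bytes sed emits for each input; the arithmetic expansion strips
# any padding that some wc implementations add around the number.
with_dot=$(( $(printf 'file.txt.\n' | sed -ne 's/.*\.//p' | wc -c) ))
without_dot=$(( $(printf 'file\n' | sed -ne 's/.*\.//p' | wc -c) ))

echo "file.txt. -> $with_dot byte(s)"     # 1: just the newline
echo "file      -> $without_dot byte(s)"  # 0: nothing printed
```

Command substitution hides this difference because it drops trailing newlines, which is why both printf calls above print "##".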

Vampiro · Answer 3 · 2015-12-31T00:12:15.860

Simple to do with awk:

awk -F"." '{ print $NF }'

What this does: With dot as a delimiter, extract the last field from the input.

edited Dec 31 '15 at 00:12

answered Dec 31 '15 at 00:05

Fails the no-extension case. But that's easily fixable: add `/\./` if empty output (no blank line) is acceptable, or use `/\./{print $NF;next} {print ""}` otherwise. – Etan Reisner Dec 31 '15 at 00:12
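
The two variants from the comment can be compared side by side. A sketch, assuming the same four sample names as in the question; the variable `names` is only for illustration.

```shell
# Sample input, one name per line (same names as in the question).
names=$(printf '%s\n' "filename..txt" "filename.txt." "filename" "filename.xml")

# Variant 1: only lines that contain a dot produce output (3 lines here).
printf '%s\n' "$names" | awk -F"." '/\./ { print $NF }'

# Variant 2: every input line produces an output line (4 lines here);
# lines without a dot produce an empty one.
printf '%s\n' "$names" | awk -F"." '/\./ { print $NF; next } { print "" }'
```

Variant 2 keeps the output aligned line by line with the input, which is convenient if the results are later paired with the original names.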

Walter A · Answer 4 · 2015-12-31T15:12:01.740

Use sed in two steps: first blank out any string that contains no dot, then remove everything up to the last dot:

sed -e 's/^[^.]*$//' -e 's/.*\.//'

Test:

for s in file.txt.. file.txt. file.txt filename file.xml; do
   echo "$s -> $(echo "$s" | sed -e 's/^[^.]*$//' -e 's/.*\.//')"
done

Test result:

file.txt.. ->
file.txt. ->
file.txt -> txt
filename ->
file.xml -> xml

Actually, @ghoti's answer is roughly the same, just a bit shorter (better). This solution may be useful to readers who want to do something like this in another language.
