0

I'm trying to extract the substring after the last period (dot). Examples below:

  • echo "filename..txt" should return "txt"
  • echo "filename.txt." should return ""
  • echo "filename" should return ""
  • echo "filename.xml" should return "xml"

I tried the command below, but it works only if the dot occurs exactly once. My filename may contain zero or more dots.

echo "filename.txt" | cut -d "." -f2
fedorqui
sunshine737
  • So you want the file extension? You're using bash? Nothing easier than [Extract filename and extension in Bash](http://stackoverflow.com/questions/965053/extract-filename-and-extension-in-bash). (this wouldn't cover the `"filename" -> ""` case). – StephenKing Dec 30 '15 at 22:47

4 Answers

5

Let's use awk!

awk -F"." '{print (NF>1)? $NF : ""}' file

This sets the field separator to . and prints the last field. If the line contains no dot (so NF is 1), it prints an empty string instead.

Test

$ cat file
filename..txt
filename.txt.
filename
filename.xml
$ awk -F"." '{print (NF>1)? $NF : ""}' file
txt


xml
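
As a minimal sketch, the same awk logic can be wrapped in a function that reads names on stdin. The name `last_field` is hypothetical and not part of the answer above. The ternary is fully parenthesized here, which is an assumption about what parses most consistently across awk implementations.

```shell
# last_field is a hypothetical helper name used only for illustration.
# It reads lines on stdin and prints the text after the last dot,
# or an empty line when the line contains no dot at all.
last_field() {
    awk -F. '{ print (NF > 1 ? $NF : "") }'
}

# Feed the four sample names from the question through the helper.
printf '%s\n' 'filename..txt' 'filename.txt.' 'filename' 'filename.xml' | last_field
```

Because every input line produces exactly one output line, the result stays aligned with the input when processing a list of names.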
Community
fedorqui
2

One can make this portable (so it's not Linux-only), avoiding an ERE dependency, with the following:

$ sed -ne 's/.*\.//p' <<< "file..txt"
txt
$ sed -ne 's/.*\.//p' <<< "file.txt."

$ sed -ne 's/.*\.//p' <<< "file"
$ sed -ne 's/.*\.//p' <<< "file.xml"
xml

Note that for testing purposes, I'm using a "here-string" in bash. If your shell is not bash, use whatever your shell provides to feed data to sed, for example a pipe from printf or echo.

The important bit here is sed's -n option, which suppresses the default printing of every line, combined with the substitute command's explicit p flag, which prints a line only when the substitution succeeds. The substitution can succeed only if the input contains a dot.

With this solution, the difference between "file.txt." and "file" is that the former prints an empty line (everything through the last dot is removed, so you may still get a newline depending on your usage), whereas the latter prints nothing at all, because the input contains no . and the substitution never succeeds. The end result may well be the same, of course:

$ printf "#%s#\n" $(sed -ne 's/.*\.//p' <<< "file.txt.")
##
$ printf "#%s#\n" $(sed -ne 's/.*\.//p' <<< "file")
##
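
If the distinction between "dot present but empty extension" and "no dot at all" matters, one way to observe it is to count the lines sed emits instead of capturing them with command substitution. This is only a sketch; `count_ext_lines` is a hypothetical helper name.

```shell
# count_ext_lines is a hypothetical helper for illustration only.
# It prints how many lines sed emitted for the given name:
# 1 means a dot was found (the extension itself may be empty),
# 0 means the name contains no dot.
count_ext_lines() {
    printf '%s\n' "$1" | sed -n 's/.*\.//p' | wc -l | tr -d '[:space:]'
}

for name in file.txt. file file.xml; do
    echo "$name: $(count_ext_lines "$name") line(s)"
done
```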
ghoti
0

Simple to do with awk:

awk -F"." '{ print $NF }'

What this does: With dot as a delimiter, extract the last field from the input.

Vampiro
    Fails the no-extension case. But that's easily fixable by adding `/\./` (if empty output, no blank line, is acceptable and with `/\./{print $NF;next} {print ""}` otherwise). – Etan Reisner Dec 31 '15 at 00:12
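
The two fixes suggested in the comment above can be sketched as follows. The function names `ext_skip` and `ext_blank` are hypothetical and used only to keep the variants apart.

```shell
# Variant 1 (hypothetical name ext_skip): lines without a dot
# produce no output at all, not even a blank line.
ext_skip() {
    awk -F. '/\./ { print $NF }'
}

# Variant 2 (hypothetical name ext_blank): lines without a dot
# produce an empty line, so output stays aligned with input.
ext_blank() {
    awk -F. '/\./ { print $NF; next } { print "" }'
}

printf '%s\n' filename.txt filename | ext_skip
printf '%s\n' filename.txt filename | ext_blank
```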
0

Use sed in two steps: first blank out any string that contains no dot, then remove everything up to and including the last dot:

sed -e 's/^[^.]*$//' -e 's/.*\.//'

Test:

for s in file.txt.. file.txt. file.txt filename file.xml; do
   echo "$s -> $(echo "$s" | sed -e 's/^[^.]*$//' -e 's/.*\.//')"
done

Test result:

file.txt.. ->
file.txt. ->
file.txt -> txt
filename ->
file.xml -> xml

Actually, the answer by @ghoti is roughly the same, just a bit shorter (better). This two-step approach may still be useful to readers who want to do something similar in another language.
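
As an illustration of how the same two-step idea carries over without sed, here is a sketch using POSIX parameter expansion. This swaps in a different technique from the one in this answer, and `ext_of` is a hypothetical function name.

```shell
# ext_of is a hypothetical name; it takes one filename as its argument.
ext_of() {
    case $1 in
        *.*) printf '%s\n' "${1##*.}" ;;  # has a dot: strip through the last dot
        *)   printf '\n' ;;               # no dot: the result is empty
    esac
}

for s in file.txt.. file.txt. file.txt filename file.xml; do
    echo "$s -> $(ext_of "$s")"
done
```

The `case` pattern plays the role of the first sed expression (handling names without a dot), and `${1##*.}` plays the role of the second (removing the longest prefix that ends in a dot).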

Walter A