
Possible Duplicate:
How to extract a file extension in PHP?
Get the file extension (basename?)

Trying to learn from other people's code, I see a lot of methods for stripping the extension from a filename, but most of them seem too localized because they assume a certain condition. For example:

This assumes a 3-character extension (like .txt, .jpg, .pdf):

substr($fileName, 0, -4);

or

 substr($fileName, 0, strrpos($fileName, '.')); 

But a fixed length causes problems with longer extensions like .jpeg, .tiff, or .html, or with 2-character ones like .js or .pl.

(Browsing this list shows that some extensions have only 1 character, and some as many as 10!)
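To make the difference between the two `substr()` variants concrete, here is a minimal sketch. The function names `stripFixed` and `stripLastDot` are hypothetical, chosen only for this illustration. Note that `strrpos()` returns `false` when the name contains no dot, so the sketch handles that case explicitly instead of passing `false` on to `substr()`.

```php
<?php
// Hypothetical helper: drops the last four characters, whatever they are.
function stripFixed(string $name): string
{
    return substr($name, 0, -4);
}

// Hypothetical helper: cuts at the last dot, or returns the name unchanged
// when there is no dot at all (strrpos() returns false in that case).
function stripLastDot(string $name): string
{
    $pos = strrpos($name, '.');
    return ($pos === false) ? $name : substr($name, 0, $pos);
}

echo stripFixed('photo.jpg'), "\n";    // photo
echo stripFixed('photo.jpeg'), "\n";   // photo.  (the dot is left behind)
echo stripFixed('script.js'), "\n";    // scrip   (one letter of the name is lost)
echo stripLastDot('photo.jpeg'), "\n"; // photo
echo stripLastDot('script.js'), "\n";  // script
echo stripLastDot('README'), "\n";     // README
```

Only the fixed-length version depends on how long the extension is; the last-dot version does not.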

Some other methods I have seen rely on the dot (.), for example:

  return key(explode('.', $filename));

Can cause problems with filenames like 20121029.my.file.name.txt.jpg

same here :

return preg_replace('/\.[^.]*$/', '', $filename);
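As a rough illustration of how these two dot-based approaches treat a name with several dots, the sketch below runs both on the example filename from above. Taking element `[0]` of the `explode()` result stands in for the "first piece before any dot" idea; that substitution is an assumption made for the example, not the exact code quoted above.

```php
<?php
$filename = '20121029.my.file.name.txt.jpg';

// First piece before any dot: everything after the first dot is lost.
$firstPiece = explode('.', $filename)[0];

// Regex that removes only the final ".something" at the end of the string.
$withoutLast = preg_replace('/\.[^.]*$/', '', $filename);

echo $firstPiece, "\n";  // 20121029
echo $withoutLast, "\n"; // 20121029.my.file.name.txt
```

Whether either result counts as a problem depends on which part of the name is considered "the extension".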

Some people use pathinfo($file) and/or basename() (is it ALWAYS safe?):

basename($filename);

and many, many other methods.

So my question has several parts:

  • What is the best way to "strip" a file extension (including the dot)?

  • What is the best way to "get" the file extension (without the dot) and/or check it?

  • Will PHP's own functions (basename) recognize ALL extensions, regardless of how exotic they might be or how the filename is constructed?

  • What influence, if any, does the OS have on the matter (Windows, Linux, Unix...)?

  • All of these sub-questions can be summed up in a single overall question: is there a bullet-proof, always-works, fail-proof, best-practice über-function that will work under any and all conditions?

EDIT I - another file extension list

Obmerk Kronen
  • + on a non PHP-specific level, or in other words, in a language that doesn't have any of these tools, are there any file extensions with a `.` in them? If not, why not just check for the last `.` to find the file extension? – ಠ_ಠ Oct 29 '12 at 00:33
  • Just use `pathinfo($filename, PATHINFO_EXTENSION);` better than any other implementation you might want to attempt – Baba Oct 29 '12 at 00:36
  • `basename()` doesn't strip the extension, btw. And `key(explode(..))` should be `reset(explode(..))`, or better yet `strtok()`. You didn't elaborate on the need to handle multiple file extensions, `file.html.gz` for instance, so there's no generic answer here. – mario Oct 29 '12 at 00:36
  • @zdhickman - yes, the most famous one is `tar.gz` – Obmerk Kronen Oct 29 '12 at 00:37
  • @mario - I did mention files like `20121029.my.file.name.txt.jpg`.. – Obmerk Kronen Oct 29 '12 at 00:38
  • Mentioned, but not elaborated. Do you want only verified file extensions stripped, only the last one, or any presumed ones, or which? – mario Oct 29 '12 at 00:41
  • @mario - I am sorry, I thought it was understood. I would want to "separate" the parts related to the name (user given) and the extension (software given), for lack of better words. For example, in a file 20121021.my.file.name.txt.tar.gz - `tar.gz` would be the extension.. – Obmerk Kronen Oct 29 '12 at 00:45
  • That doesn't clear it up really. `.txt` is a file extension too. Why strip only `.tar.gz`? You're looking for constrained pairs of extensions then? How about my `backup.home.tar.xz.aes256` then? The algorithm you're looking for would have to keep all the extensions, because the last one is unknown and you would strip at most two? (Just trying to explain that it's woefully difficult to get this stuff right. Sometimes the simpler solutions are the better ones.) – mario Oct 29 '12 at 00:48
  • What @mario says. A file's extension is the part after the last `.`, period. Everything else is meaningless. That is valid even for `.tar.gz` which signifies a tar file inside a gz file - for all programmatic purposes, it is a gz file until it's unzipped. Only *then* it becomes a tar file. – Pekka Oct 29 '12 at 00:48
  • @hakre (&others) - I have seen those questions , but they do not really clarify the point in my question (such as OS related behavior, or the exotic file extensions..) – Obmerk Kronen Oct 29 '12 at 00:50
  • @ObmerkKronen: The file-extension is only a convention. You will not get much safety with it. It's not clear what you ask for therefore. That is the point everybody in comments tried to point out. You don't clarify the exact parameters either in your question. With something that unclear, how could it be ever answered? – hakre Oct 29 '12 at 00:52

1 Answer


Quoting from the duplicate question's top answer:

$ext = pathinfo($filename, PATHINFO_EXTENSION);

This is the best available way to go. It is built into PHP, and it is the best you can do. I know of no cases where it doesn't work.
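As a small sketch of how this covers both "get" and "strip": `PATHINFO_EXTENSION` returns the part after the last dot (without the dot), and `PATHINFO_FILENAME` returns the name with the directory and that last extension removed. The upload path below is made up for the example.

```php
<?php
// Hypothetical path, used only for illustration.
$path = '/var/www/uploads/20121029.my.file.name.txt.jpg';

$ext  = pathinfo($path, PATHINFO_EXTENSION); // "jpg"
$name = pathinfo($path, PATHINFO_FILENAME);  // "20121029.my.file.name.txt"

// A name without any dot yields an empty extension.
$none = pathinfo('README', PATHINFO_EXTENSION); // ""

// A dot-file is treated as "all extension, no name".
$dotfileExt  = pathinfo('.htaccess', PATHINFO_EXTENSION); // "htaccess"
$dotfileName = pathinfo('.htaccess', PATHINFO_FILENAME);  // ""

// Checking an extension: compare case-insensitively, since "JPG" and "jpg"
// usually mean the same thing to the user.
$isJpeg = in_array(strtolower($ext), ['jpg', 'jpeg'], true);

var_dump($ext, $name, $none, $dotfileExt, $dotfileName, $isJpeg);
```

On the OS question: `pathinfo()` only inspects the string and never touches the filesystem. What does vary by platform is which characters count as directory separators (both `/` and `\` on Windows, only `/` elsewhere), which matters for the directory part rather than for the extension.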

One exception would be a file extension that contains a dot (`.`). But no sane person would introduce a file extension like that, because it would break everywhere, and it would also break the implicit convention.

"for example, in a file 20121021.my.file.name.txt.tar.gz - tar.gz would be the extension.."

Nope, it's much simpler - and maybe that is the root of your worries. The extension of 20121021.my.file.name.txt.tar.gz is .gz. It is a gzip file for all intents and purposes. Only when you unzip it does it become a .tar file. Until then, the .tar in the file name is meaningless and serves only as information for the gunzip tool. There is no file extension named .tar.gz.
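The "one layer at a time" view can be sketched with `pathinfo()` itself: the outer extension is `gz`, and only the name that remains after removing it has `tar` as its extension. The variable names are illustrative only.

```php
<?php
$filename = '20121021.my.file.name.txt.tar.gz';

// Outer layer: what the file is right now.
$outer = pathinfo($filename, PATHINFO_EXTENSION);      // "gz"

// The name the file would typically have once the gzip layer is removed.
$afterGunzip = pathinfo($filename, PATHINFO_FILENAME); // "20121021.my.file.name.txt.tar"

// Inner layer: only meaningful after decompression.
$inner = pathinfo($afterGunzip, PATHINFO_EXTENSION);   // "tar"

echo $outer, "\n", $afterGunzip, "\n", $inner, "\n";
```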

That said, detecting the file extension will not help you determine whether a file is actually of the type it claims. But I'm sure you know that, just putting this here for future readers.

Pekka
  • Well, this question is getting complicated :-). Let's consider a scenario where I want a user to upload a file (`my-cat.tar.gz`), and then use the NAME (`my-cat`) of the file ALONE in order to give a TITLE for its content in a blog post (for example). In this case, the "technical" explanation will not help much, because even if it is true that technically a `.tar.gz` is a `.gz` file, my result for the post title will be `my-cat.tar` and not `my-cat` as wanted. Anyhow, to save further complication, your answer will do. THANKS a lot. Even with all the confusion I managed to learn a lot! – Obmerk Kronen Oct 29 '12 at 01:19
  • @Obmerk ah, I see your point. Hmm, I think in that case, you'll have to let the user specify the final name, as there is no way really to recognize which part belongs to the extension, and which part belongs to the file name. You're welcome! – Pekka Oct 29 '12 at 07:33
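For the blog-title scenario, one possible workaround (not a general solution, as the comments above make clear) is to strip a short, explicitly chosen list of compound suffixes before falling back to `pathinfo()`. The sketch below assumes such a list is acceptable for the application; the function name `titleFromFilename` and the default suffix list are hypothetical, and any suffix not on the list is treated as part of the name.

```php
<?php
// Hypothetical helper: derive a title from an uploaded file name.
// $compound is an application-chosen list of multi-part suffixes to strip whole.
function titleFromFilename(string $filename, array $compound = ['tar.gz', 'tar.bz2', 'tar.xz']): string
{
    $base  = basename($filename);
    $lower = strtolower($base);

    foreach ($compound as $suffix) {
        $needle = '.' . strtolower($suffix);
        $len    = strlen($needle);
        // Require at least one character of actual name before the suffix.
        if (strlen($lower) > $len && substr($lower, -$len) === $needle) {
            return substr($base, 0, -$len);
        }
    }

    // Fallback: strip only the last extension.
    return pathinfo($base, PATHINFO_FILENAME);
}

echo titleFromFilename('my-cat.tar.gz'), "\n";             // my-cat
echo titleFromFilename('my-cat.jpg'), "\n";                // my-cat
echo titleFromFilename('backup.home.tar.xz.aes256'), "\n"; // backup.home.tar.xz
```

The last call shows the limit of the approach: an unlisted outer extension hides the compound one, so letting the user confirm or edit the title remains the safer option.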