2

I'm having trouble removing the extension from a filename. I tried

os.path.splitext(checked_delivery)[0]

but it only strips .gz from the filename. I also need to check whether the path is a file with an extension or a directory, which I did with:

os.path.exists(delivery)

Another problem is that I can't simply split on dots, because the name contains a date (YYYY.MM.DD). Should I use join(), or is there something more elegant than a pile of method calls and ifs?

shad0w_wa1k3r
Alex
  • Are you checking if it's a directory? – Josh Lee Aug 17 '17 at 14:20
  • it could be a tar.gz file or already unpacked directory – Alex Aug 17 '17 at 14:23
  • Possible duplicate of [What's the way to extract file extension from file name in Python?](https://stackoverflow.com/questions/16976192/whats-the-way-to-extract-file-extension-from-file-name-in-python) – manasouza Jul 05 '19 at 13:00

3 Answers

1

I propose the following small function:

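def strip_extension(fn: str, extensions=(".tar.bz2", ".tar.gz")) -> str:
    # A tuple default avoids the shared mutable default argument pitfall.
    for ext in extensions:
        if fn.endswith(ext):
            # Slice the matched extension off the end only.
            return fn[: -len(ext)]
    raise ValueError(f"Unexpected extension for filename: {fn}")

assert strip_extension("foo.tar.gz") == "foo"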
moi
1

I propose a generic solution that removes the file extension from a path using the pathlib module. Managing paths with os.path is less convenient these days, IMO.

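import pathlib


def remove_extension(path: pathlib.Path) -> pathlib.Path:
    # Note: every dotted part of the name counts as a suffix,
    # so "a-2017.06.07.tar.gz" has suffixes .06, .07, .tar, .gz.
    suffixes = ''.join(path.suffixes)
    # Cut the suffixes off the end of the final path component only.
    return path.with_name(path.name[:len(path.name) - len(suffixes)])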
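
One thing to watch with `suffixes` on names like the one in the question: pathlib treats every dotted part after the first as a suffix, so the date components are included. A minimal sketch of one way around that, assuming you only want to drop a known set of archive suffixes (the `KNOWN_SUFFIXES` set and the `strip_archive_suffixes` helper are hypothetical names, not part of pathlib):

```python
from pathlib import PurePosixPath

# Assumption: these are the only suffixes that should be removed.
KNOWN_SUFFIXES = {".tar", ".gz", ".bz2"}


def strip_archive_suffixes(path: PurePosixPath) -> PurePosixPath:
    """Drop trailing suffixes one at a time while they are known archive suffixes."""
    while path.suffix in KNOWN_SUFFIXES:
        path = path.with_suffix("")  # removes only the last suffix
    return path


dated = PurePosixPath("RANDOM_FILE-2017.06.07.tar.gz")
all_suffixes = dated.suffixes          # includes '.06' and '.07' from the date
cleaned = strip_archive_suffixes(dated)
print(all_suffixes, cleaned)
```

Because the loop stops at the first suffix that is not in the set, the `.06.07` part of the date is left alone.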
Alex
-2

If you know that the extension is always going to be .tar.gz, you can still use split:

In [1]: fname = 'RANDOM_FILE-2017.06.07.tar.gz'

In [2]: '.'.join(fname.split('.')[:-2])
Out[2]: 'RANDOM_FILE-2017.06.07'

From the docstring for os.path.splitext:

"Extension is everything from the last dot to the end, ignoring leading dots."

In the case of gzipped tarballs this makes sense anyway: the file 'FILE.tar.gz' is a gzipped version of 'FILE.tar', which is presumably a tarball made from the file 'FILE'.

This is why you need something other than os.path.splitext if what you want is the original filename without .tar.
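
To illustrate the behaviour described above, here is a small sketch that applies os.path.splitext twice to the example filename. Each call peels off only the last dotted component:

```python
import os.path

fname = "RANDOM_FILE-2017.06.07.tar.gz"

# First call: only the final ".gz" is treated as the extension.
root, ext = os.path.splitext(fname)

# Second call on the remainder: now ".tar" comes off.
root2, ext2 = os.path.splitext(root)

print(root, ext)
print(root2, ext2)
```

Note that a third call would start eating into the date (".07"), so blindly looping over splitext is not safe for names like this one.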

greg_data
  • So check if it's a dir and if not -> your second line. Thank you. – Alex Aug 17 '17 at 14:25
  • 1
    Yup, `os.path.isdir(...)`, then use the above from my answer for removing the extension. – greg_data Aug 17 '17 at 14:28
  • 3
    why not `fname.replace(".tar.gz")` ? unlikely to be in the middle of a name, and that doesn't kill other dots in filenames like YYYY.MM.DD if it's a dir – Jean-François Fabre Aug 17 '17 at 14:45
  • Just in case anyone wants to do it in *nix: `basename -s .tar.gz filename.tar.gz` – sid-m Aug 17 '17 at 14:49
  • 1
    If you know the extension is _always_ going to be '.tar.gz' you can just slice without the last `len('.tar.gz')` characters. Maybe checking with `endswith()` if the assumption is right, before slicing. – BlackJack Aug 17 '17 at 15:42
  • @Jean-FrançoisFabre, only `fname.replace(".tar.gz", "")` works. Replace method: `str.replace(old, new[, count])` – Alex Aug 17 '17 at 17:29
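
The suggestions in the comments above (check for a directory first, then `endswith()` plus slicing) could be combined roughly as follows. This is only a sketch: `delivery_basename` and `ARCHIVE_SUFFIX` are hypothetical names, and it assumes the only archive extension you care about is .tar.gz.

```python
import os

ARCHIVE_SUFFIX = ".tar.gz"  # assumption: the only extension to strip


def delivery_basename(path: str) -> str:
    """Return the path unchanged for directories, else strip a trailing .tar.gz."""
    if os.path.isdir(path):
        # Already unpacked: dots in the name (e.g. a date) must be kept.
        return path
    if path.endswith(ARCHIVE_SUFFIX):
        # Slice from the end, so only the trailing suffix is removed.
        return path[: -len(ARCHIVE_SUFFIX)]
    return path


stripped = delivery_basename("RANDOM_FILE-2017.06.07.tar.gz")

# Contrast with str.replace, which removes every occurrence, not just the last.
odd = "a.tar.gz.backup.tar.gz"
replaced = odd.replace(ARCHIVE_SUFFIX, "")
sliced = delivery_basename(odd)
print(stripped, replaced, sliced)
```

For ordinary delivery names the two approaches give the same result; slicing is simply the more conservative choice when the suffix could also appear in the middle of a name.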