31

Expected inputs and outputs:

a                 -> a
a.txt             -> a
archive.tar.gz    -> archive
directory/file    -> file
d.x.y.z/f.a.b.c   -> f
logs/date.log.txt -> date # Mine!

Here's my implementation that feels dirty to me:

>>> from pathlib import Path
>>> example_path = Path("August 08 2015, 01'37'30.log.txt")
>>> example_path.stem
"August 08 2015, 01'37'30.log"
>>> example_path.suffixes
['.log', '.txt']
>>> suffixes_length = sum(map(len, example_path.suffixes))
>>> true_stem = example_path.name[:-suffixes_length]
>>> true_stem
"August 08 2015, 01'37'30"

Because it breaks on Paths without suffixes:

>>> ns_path = Path("no_suffix")
>>> sl = sum(map(len, ns_path.suffixes))
>>> ns_path.name[:-sl]
''

So I need to check if the Path has a suffix first:

>>> def get_true_stem(path: Path):
...     if path.suffix:
...         sl = sum(map(len, path.suffixes))
...         return path.name[:-sl]
...     else:
...         return path.stem
...
>>>
>>> get_true_stem(example_path)
"August 08, 2015, 01'37'30"
>>> get_true_stem(ns_path)
"no_suffix"

And this is my current use case:

>>> file_date = datetime.strptime(true_stem, "%B %d %Y, %H'%M'%S")
>>> file_date
datetime.datetime(2015, 8, 8, 1, 37, 30)
>>> new_dest = format(file_date, "%Y-%m-%dT%H:%M:%S%z") + ".log" # ISO-8601
>>> shutil.move(str(example_path), new_dest)

Thanks.

Navith
  • 929
  • 1
  • 9
  • 15
  • 2
    It's frustating as `Path` actually knows about these suffixes — they are present in its `suffixes` member. – P-Gn Feb 18 '20 at 13:28
  • i think if instead of `ns_path.name[:-sl]`, you use `ns_path.name[:len(ns_path.name)-sl]`, your solution works even if there's no suffix – starwarswii Apr 05 '23 at 20:44

7 Answers7

33

You could just .split it:

>>> Path('logs/date.log.txt').stem.split('.')[0]
'date'

os.path works just as well:

>>> os.path.basename('logs/date.log.txt').split('.')[0]
'date'

It passes all of the tests:

In [11]: all(Path(k).stem.split('.')[0] == v for k, v in {
   ....:     'a': 'a',
   ....:     'a.txt': 'a',
   ....:     'archive.tar.gz': 'archive',
   ....:     'directory/file': 'file',
   ....:     'd.x.y.z/f.a.b.c': 'f',
   ....:     'logs/date.log.txt': 'date'
   ....: }.items())
Out[11]: True
Cristian Ciupitu
  • 20,270
  • 7
  • 50
  • 76
Blender
  • 289,723
  • 53
  • 439
  • 496
  • 1
    [`.partition`](https://docs.python.org/3/library/stdtypes.html#str.partition) could also be used, e.g. `Path('logs/date.log.txt').stem.partition('.')[0]` or `os.path.basename('logs/date.log.txt').partition('.')[0]`. It might even be [faster for some use cases](https://stackoverflow.com/q/47898526/12892). – Cristian Ciupitu Sep 11 '19 at 19:16
  • 2
    Note that this does not work for hidden files: `Path('logs/.date.log.txt').stem.split('.')[0] == ''` – pabouk - Ukraine stay strong Nov 17 '21 at 13:58
  • 1
    a small optimization: use `.split('.', 1)[0]`. no need to split the whole string into parts when we're only interested in the first part. and then you could also use `.name` instead of `.stem`, which may read clearer – starwarswii Apr 05 '23 at 17:49
8

How about a while loop method, where you keep taking .stem until the path has no suffixes remaining , Example -

from pathlib import Path
example_path = Path("August 08 2015, 01'37'30.log.txt")
example_path_stem = example_path.stem
while example_path.suffixes:
    example_path_stem = example_path.stem
    example_path = Path(example_path_stem)

Please note, the while loop exits the loop when example_path.suffixes returns an empty list (As empty list are False like in boolean context) .


Example/Demo -

>>> from pathlib import Path
>>> example_path = Path("August 08 2015, 01'37'30.log.txt")
>>> example_path_stem = example_path.stem
>>> while example_path.suffixes:
...     example_path_stem = example_path.stem
...     example_path = Path(example_path_stem)
...
>>> example_path_stem
"August 08 2015, 01'37'30"

For your second input - no_suffix -

>>> example_path = Path("no_suffix")
>>> example_path_stem = example_path.stem
>>> while example_path.suffixes:
...     example_path_stem = example_path.stem
...     example_path = Path(example_path_stem)
...
>>> example_path_stem
'no_suffix'
Anand S Kumar
  • 88,551
  • 18
  • 188
  • 176
6

Why not go recursively?

from pathlib import Path

def true_stem(path):
   stem = Path(path).stem
   return stem if stem == path else true_stem(stem)

assert(true_stem('d.x.y.z/f.a.b.c') == 'f')
plankthom
  • 111
  • 1
  • 5
5

Here's another possible solution to the given problem:

from pathlib import Path

if __name__ == '__main__':
    dataset = [
        ('a', 'a'),
        ('a.txt', 'a'),
        ('archive.tar.gz', 'archive'),
        ('directory/file', 'file'),
        ('d.x.y.z/f.a.b.c', 'f'),
        ('logs/date.log.txt', 'date'),
    ]
    for path, stem in dataset:
        path = Path(path)
        assert path.name.replace("".join(path.suffixes), "") == stem
BPL
  • 9,632
  • 9
  • 59
  • 117
2
filepath = pathlib.Path("d.x.y.z/f.a.b.c")
basename = filepath.name.removesuffix("".join(filepath.suffixes))
evandrix
  • 6,041
  • 4
  • 27
  • 38
1

If you wanted to use pathlib uniquely, you could also use:

>>> Path('logs/date.log.txt').with_suffix('').stem
'date'

EDIT:

As pointed out in the comments this doesn't work if you have an extension with more than 2 suffixes. Although this doesn't sound very likely (and pathlib itself doesn't have a native way to deal with it), if you wanted to use pathlib uniquely, you could use:

>>> Path('logs/date.log.txt.foo').with_suffix('').with_suffix('').stem
'date'
buhtz
  • 10,774
  • 18
  • 76
  • 149
universvm
  • 353
  • 2
  • 11
  • 1
    Doesn't work if there are more than two suffixes: `Path('logs/date.log.txt.foo').with_suffix("").stem` = `'date.log'` – Mandera Sep 23 '20 at 15:19
  • You are correct. You'd have to add another `.with_suffix("")` which may make it clunky: ` >>> Path('logs/date.log.txt.foo').with_suffix("").with_suffix("").stem 'date' ` – universvm Sep 23 '20 at 16:24
  • "doesn't work if you have an extension with more than 2 suffixes" Seems like the whole point of the original question.. – ssp Jun 23 '22 at 20:50
1

Another approach uses pattern matching:

import re
from pathlib import Path
all(re.search('[.]|',Path(k).name) for k,v in {
   'a': 'a',
   'a.txt': 'a',
   'archive.tar.gz': 'archive',
   'directory/file': 'file',
   'd.x.y.z/f.a.b.c': 'f',
   'logs/date.log.txt': 'date'
   }.items())

the pattern '[.]' may be used if all your paths have at least one suffix

Jim Robinson
  • 189
  • 1
  • 10