I am a beginner in Python. I would like to understand the following function, which returns the extension of a file name:

def get_extn(filename):
    return filename[filename.rfind('.'):][1:]

I do not understand why square brackets [] rather than parentheses () surround the rfind call, and what the : before the closing bracket and the [1:] after it mean. I would appreciate an explanation.

Willem Van Onsem

3 Answers

What you see here is a function that uses slicing syntax twice. For objects that support slicing syntax, one can write:

object[f:t]

where f and t are indices. You then get a subsequence that starts at index f and ends just before index t (t is exclusive). If f is omitted, the slice starts from the beginning; if t is omitted, it runs to the end.
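To make the roles of f and t concrete, here is a minimal sketch. The string and the variable names are made up for illustration and do not come from the question.

```python
# Example string chosen for illustration only.
s = "archive.tar.gz"

middle = s[8:11]  # indices 8, 9 and 10; index 11 is excluded
head = s[:7]      # f omitted: start from the beginning
tail = s[8:]      # t omitted: run to the end

print(middle)  # tar
print(head)    # archive
print(tail)    # tar.gz
```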

The function in your question is a bit cryptic, and is actually equivalent to:

def get_extn(filename):
    f = filename.rfind('.')
    filename = filename[f:]
    return filename[1:]

So first we obtain the index f of the last dot, then we construct a substring that starts at f, and finally we construct a substring of that substring that starts at index 1 (thus removing the first character, which is the '.').
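The three steps can be traced on a concrete value. This is a sketch with a made-up file name; it also shows what happens when the name contains no dot, in which case rfind returns -1 and the function ends up returning an empty string.

```python
def get_extn(filename):
    return filename[filename.rfind('.'):][1:]

# Hypothetical file name used only to trace the steps.
name = "report.final.txt"

dot = name.rfind('.')   # index of the last dot: 12
from_dot = name[dot:]   # substring starting at the dot: '.txt'
ext = from_dot[1:]      # drop the first character: 'txt'

print(dot, from_dot, ext)  # 12 .txt txt

# No dot: rfind returns -1, so the first slice keeps only the
# last character, and [1:] then removes it.
print(repr(get_extn("README")))  # ''
```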

Willem Van Onsem

You need to start by understanding Python syntax.

Square brackets index or slice a sequence (a string, in this case), and parentheses call a function. rfind is a string method; here it is called with the argument '.' to find the last period in the filename. The brackets retrieve part of the string: the first pair slices filename itself, and the second pair, [1:], slices the result of the first slice.

The colons, :, denote slices of the sequence. [:] means the entire sequence, and [1:] means everything after the first element. See: Explain slice notation
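The difference between calling with parentheses and indexing or slicing with square brackets can be seen side by side. The file name below is an assumed example, not taken from the question.

```python
# Hypothetical file name for illustration.
filename = "notes.txt"

# Parentheses call the method rfind with the argument '.'.
position = filename.rfind('.')

# Square brackets index or slice the string itself.
one_char = filename[position]  # a single character
whole = filename[:]            # the entire string
rest = filename[1:]            # everything after the first character

print(position)  # 5
print(one_char)  # .
print(whole)     # notes.txt
print(rest)      # otes.txt
```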

David Manheim

I suggest using the os.path module to deal with file names and paths.

Example:

import os.path

for path in ('/tmp/file.txt', 'file.doc', 'file', 'file.a.b.c'):
    basename, extension = os.path.splitext(path)
    print("path: '{}', base: '{}' extension '{}'".format(path, basename, extension))

Prints:

path: '/tmp/file.txt', base: '/tmp/file' extension '.txt'
path: 'file.doc', base: 'file' extension '.doc'
path: 'file', base: 'file' extension ''
path: 'file.a.b.c', base: 'file.a.b' extension '.c'
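If you want the extension without the leading dot, as the function in the question returns it, one possible sketch is to slice the second element of the splitext result. The name get_extn_splitext is a hypothetical helper, not part of os.path. Note that splitext ignores leading dots in the base name, so a name such as '.bashrc' has no extension here, whereas the rfind version would return 'bashrc'.

```python
import os.path

def get_extn_splitext(filename):
    """Hypothetical helper: extension without the leading dot."""
    extension = os.path.splitext(filename)[1]  # e.g. '.txt' or ''
    return extension[1:]                       # drop the dot, if any

for path in ('/tmp/file.txt', 'file', 'file.a.b.c', '.bashrc'):
    print(repr(get_extn_splitext(path)))
# 'txt'
# ''
# 'c'
# ''
```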
dawg