Dealing with __file__ both when working in an interpreter and when executing a Python script in a terminal

Question

Each time I develop something in Python, I get annoyed that I have to switch between the following two statements for my scripts to work both when I import the script in an interpreter (e.g. Spyder on Ubuntu, or directly in a Python console) and when I launch the script from a terminal with root@machine# python script.py:

import os
some_input_data_path = os.path.join(os.path.dirname(os.path.dirname(__file__)), 'input.csv')

works when the script is launched as an executable, but not in an interpreter.

import os
some_input_data_path = os.path.join(os.path.dirname(os.getcwd()),'input.csv')

works when the script is run in an interpreter, but not when it is launched as an executable.

I have set up a convenience try: block at the beginning of each of my script files to set __file__ like so:

import os

try:
    __file__
except NameError:
    __file__ = os.path.join(os.getcwd(), 'test.py')
    print("Warning: script is not run as a module. "
          "Setting '__file__' to: {}".format(__file__))
else:
    pass

I wonder whether there are good practices, or some other (better) approach, that would let my scripts work without manually switching anything, both within my interpreter (mainly for development) and when executed in a terminal (mainly in production)?

Use case

Using this file:
$ cat script.py

import os

some_input_data_path = os.path.join(os.path.dirname(os.getcwd()), 'input.csv')

print(some_input_data_path)

When I execute this in Spyder, I get this printed:
'/home/username/scriptdir/input.csv'
which is fine.

If I execute this script in bash:

user@machine:/home/username/scriptdir$ python script.py
'/home/username/scriptdir/input.csv'

but if I cd ..:

user@machine:/home/username$ python scriptdir/script.py
'/home/username/input.csv' # <- this is obviously no longer where the CSV input data file is.

Do you maybe have an example of how you are using `__file__` subsequently? Maybe there could be a use-case-specific "best practice". Note that `os.getcwd()` and the way you are setting `__file__` might not necessarily get you what you want: suppose you executed a script in a subfolder (such as `python foo/bar.py`); then `os.getcwd()` will give you the parent directory and not `./foo`. - sim, Jan 10 '21 at 12:40
Yes, and in that case the script is no longer able to figure out the right location of some input data files (as shown in my edit). - swiss_knight, Jan 10 '21 at 12:46
If the use case is about data inputs, then I would suggest having a lightweight configuration (see https://stackoverflow.com/questions/6198372/most-pythonic-way-to-provide-global-configuration-variables-in-config-py?noredirect=1&lq=1 for some discussion on related best practices) that defines a data directory. - sim, Jan 11 '21 at 08:17
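
One possible reading of this suggestion, as a rough sketch only: resolve the data directory in one place, from an environment variable if it is set and from a default otherwise. The variable name `MYAPP_DATA_DIR` and the function `get_data_dir` are made up for illustration and do not come from the linked discussion.

```python
import os


def get_data_dir(default_dir, env_var='MYAPP_DATA_DIR'):
    """Return the data directory as an absolute path.

    Hypothetical helper: the environment variable name is an assumption.
    The environment variable wins if it is set and non-empty; otherwise
    `default_dir` is used.
    """
    configured = os.environ.get(env_var)
    return os.path.abspath(configured if configured else default_dir)


# Possible use, with the directory from the question as the default:
# data_dir = get_data_dir('/home/username/scriptdir')
# some_input_data_path = os.path.join(data_dir, 'input.csv')
```

With an explicit default like this, the resulting path depends neither on `__file__` nor on the directory the script was started from.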

bavcol · Answer 1 · 2023-01-11T08:19:52.653

The problem might be that os.path.dirname(__file__) returns an absolute path in the interpreter but a relative path in other situations (e.g. Windows cmd: 'python script.py').

Using os.path.abspath(os.path.dirname(__file__)) instead of os.path.dirname(__file__) fixed it for me (executing from Windows cmd); maybe this fix also applies in your case.
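
As a minimal sketch of this idea (the helper name `script_relative_path` and its `script_file` parameter are hypothetical, not from the original post), the script's own directory can be made absolute before the file name is joined to it:

```python
import os


def script_relative_path(script_file, filename):
    """Return an absolute path to `filename` located next to `script_file`.

    Hypothetical helper: `script_file` is meant to be the module's __file__.
    """
    # abspath() resolves a relative __file__ (e.g. 'script.py') against the
    # current working directory at the time of the call.
    script_dir = os.path.abspath(os.path.dirname(script_file))
    return os.path.join(script_dir, filename)


# Typical call from within a script:
# some_input_data_path = script_relative_path(__file__, 'input.csv')
```

This only helps where `__file__` is defined; in a bare interactive session it is not, so a fallback is still needed there.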

I also added this answer because it might be the fix for search engine users who end up here based on the title of this post (like me).

score -1 · Answer 2 · answered Jan 10 '21 at 12:42

I think I understand what you're trying to do (and if I didn't, please comment below).

If you're asking what is a good practice, I'd suggest the following:

Try to avoid module level variables.
Write some initializer function that accepts a path and sets all dependent objects in your module to your path. This way, your module is generic.
Next, when you "double-click" to execute, there should be a separate "main" module. This main module calls your module and initializes paths with os.path.dirname(__file__).
When you run it in a REPL (Python console), you would be working with your module (not main). Import it, and call the same initialization method with os.getcwd().

Example:

File main.py

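import os
from yourlib.filename import process
# Pass the directory containing main.py, as described above.
process(os.path.dirname(os.path.abspath(__file__)))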

File yourlib/filename.py

# ...
# ...
def process(path):
    # Whatever you want to do with the directory given in `path`.
    pass

When running in the python console:

>>> import os
>>> from yourlib.filename import process
>>> process(os.getcwd())
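
The same pattern can be sketched in a single file, under the assumption that the library only needs a base directory to locate `input.csv`. The helper `build_paths` is hypothetical, and this `process` is only a stand-in; the point is that paths are derived from an argument instead of being computed at import time:

```python
import os


def build_paths(base_dir):
    """Return all paths derived from `base_dir` (hypothetical helper)."""
    base_dir = os.path.abspath(base_dir)
    return {
        'base': base_dir,
        'input': os.path.join(base_dir, 'input.csv'),
    }


def process(path):
    """Stand-in for process() above: only resolves the input file path."""
    paths = build_paths(path)
    return paths['input']


# From a "main" script:   process(os.path.dirname(os.path.abspath(__file__)))
# From a Python console:  process(os.getcwd())
```

Because nothing is stored at module level, the same function can be called with different directories in the same session.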
