As of Python 3.9 you can use importlib.resources
from the standard library. This module uses Python's import machinery to resolve the paths of data files as though they were modules inside a package.
Now from your __main__.py
script or any part of your code with access to the module data
, you can access files inside that package like so:
- To read
text.txt
to memory as a string:
txt_file = importlib.resources.files("data").joinpath("text.txt").read_text(encoding="utf-8")
- To read
binary.dat
to memory as bytes:
bin_file = importlib.resources.files("data").joinpath("binary.dat").read_bytes()
path = importlib.resources.files("data").joinpath("text.txt")
with path.open("rt", encoding="utf-8") as file:
lines = file.readlines()
# As streams:
textio_stream = importlib.resources.files("data").joinpath("text.txt").open("rt", encoding="utf-8")
bytesio_stream = importlib.resources.files("data").joinpath("binary.dat").open("rb")
- If something requires an actual real file on the filesystem, or you simply want to wrap zipapp compatibility over existing code (e.g.
with open()
) without having to modify it:
# Old, incompatible with zipfiles.
file_path = "data/text.txt"
with open(file_path, "rt", encoding="utf-8") as file:
lines = file.readlines()
# New, compatible with zipfiles.
file_path = importlib.resources.files("data").joinpath("text.txt")
# If file is inside a zipfile, unzips it in a temporary file, then
# destroys it once the context manager closes. Otherwise, reads the file normally.
with importlib.resources.as_file(file_path) as path:
with open(path, "rt", encoding="utf-8") as file:
lines = file.readlines()
# Since it is a context manager, you can even store it like this:
file_path = importlib.resources.files("data").joinpath("text.txt")
real_path = importlib.resources.as_file(file_path)
with real_path as path:
with open(path, "rt", encoding="utf-8") as file:
lines = file.readlines()
The Traversable
objects returned from importlib.resources
functions can be mixed with Path
objects using as_posix
, since joinpath requires posix separators:
file_path = pathlib.Path("subdirectory", "text.txt")
txt_file = importlib.resources.files("data").joinpath(file_path.as_posix()).read_text(encoding="utf-8")
You can use slashes to grow a Traversable
, just like pathlib.Path
objects:
resources_root = importlib.resources.files("data")
text_path = resources_root / "text.txt"
bin_file = (resources_root / "subdirectory" / "bin.dat").read_bytes()
You can also import the data
package like any other package, and use the module object directly. Subpackages are also supported. The only Python files inside the data
tree are the __init__.py
files of each subpackage:
# __main__.py
import importlib.resources
import data.config
import data.models.b
# Load binary file `file.dat` from `data.models.b`.
# Subpackages are being used as subdirectories.
bin_file = importlib.resources.files(data.models.b).joinpath("file.dat").read_bytes()
...
You technically only need to make your resource root directory be a package. For max brevity:
# __main__.py
from importlib.resources import files
data = files("data") # Resources root.
# In this example, `models` and `b` are regular directories:
bin_file = (data / "models" / "b" / "file.dat").read_bytes()
...
Note that importlib.resources
and zipfiles in general support reading only and you will get an exception if you try to write to any file-like object returned from the above functions. It might technically be possible to support modifying data files inside zips but this is way out of scope. If you want to write files, just open a file in the filesystem as normal.
Now your data files have become file-system agnostic and your program should work via zipapp and normal invocation just the same.