I am writing my first pip package, but I have trouble with relative paths. The package structure is as follows:

.
├── packname
│   ├── __init__.py
│   ├── packfile1.py
│   ├── packfile2.py
│   └── packfile3.py
│
├── datatoload
│   ├── toload1.pkl
│   ├── toload2.pkl
│   ├── toload3.pkl
│   └── toload4.pkl
│
└── requirements.txt

Some python files in the packname directory need to load data from files in the datatoload directory. I have some questions about managing package files and data.

Is it ok to have a separate folder for the data to load?

Since I want people to use my package, should I add some properties to my package (I read something about __file__ and __path__)?

Moreover, do you have any more advice about this?

Thank you :)

UPDATE: A user in the comments told me that the folder needs to be inside the package folder, as follows:

.
├── packname
│   ├── __init__.py
│   ├── packfile1.py
│   ├── packfile2.py
│   ├── packfile3.py
│   │
│   └── datatoload
│       ├── toload1.pkl
│       ├── toload2.pkl
│       ├── toload3.pkl
│       └── toload4.pkl
│
└── requirements.txt

The most important question I want to ask is: how do I set up the relative path to be used inside the package? For example, if I want to load the data saved in toload2.pkl from a function in packfile3.py, can I simply do

load('./datatoload/toload2.pkl')

Would this work when someone downloads my package (together with the datatoload folder)?

  • "*Is it ok to have a separate folder for the data to load?*" No, it must be inside the package to avoid polluting the installation directory. "*…\__file__ and \__path__…*" No need, Python adds these variables automatically on import. – phd Feb 22 '22 at 11:23
  • Thank you for your comment! Reading your question I notice that I missed a crucial question about HOW to import with a relative path in a python package. I will update the question :) – Diego Stucchi Feb 22 '22 at 12:05
  • "*`load('./datatoload/toload2.pkl')` Would this work when someone downloads my package…?*" No, because `./` means the current directory, and the user's current directory could be anything. You need to calculate your package directory using `os.path.dirname(__file__)`; like this: https://stackoverflow.com/a/56843242/7976758 – phd Feb 22 '22 at 13:35
  • Thank you again @phd! Would you like to post your comments as an answer? I'd be glad to accept your answer. – Diego Stucchi Feb 22 '22 at 13:45

1 Answer

Is it ok to have a separate folder for the data to load?

No, it must be inside the package to avoid polluting the installation directory.

…__file__ and __path__…

No need, Python adds these variables automatically on import.

load('./datatoload/toload2.pkl') Would this work when someone downloads my package…?

No, because ./ means the current directory, and the user's current directory could be anything. You need to calculate your package directory using os.path.dirname(__file__). See https://stackoverflow.com/a/56843242/7976758/ for an example.
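
As a rough illustration of the os.path.dirname(__file__) approach, here is a minimal sketch. The helper names package_data_path and load_pickle are hypothetical (they are not part of any library), and it assumes the datatoload folder sits next to the module that passes in its own __file__, as in the updated layout from the question.

```python
import os
import pickle


def package_data_path(module_file, filename):
    """Build the path to `filename` in the datatoload folder next to `module_file`.

    `module_file` is expected to be the caller's __file__ (hypothetical helper).
    """
    # Directory containing the module, independent of the current working directory.
    package_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(package_dir, "datatoload", filename)


def load_pickle(module_file, filename):
    """Unpickle a data file that ships inside the package."""
    with open(package_data_path(module_file, filename), "rb") as f:
        return pickle.load(f)


# Inside packname/packfile3.py this would be called as:
#     data = load_pickle(__file__, "toload2.pkl")
```

Because the path is derived from the module's location rather than from ./, it should resolve the same way no matter which directory the user runs their code from. Note that the .pkl files also have to be included in the distribution when the package is built, otherwise they will not be there after installation.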

phd