
I'm having trouble including data files in my setup.py script. My package is set up as follows:

my_package/
    setup.py
    MANIFEST.in

    my_package/
        __init__.py
        access_data.py

        data_files/
            my_data_file.csv

I want to include the my_data_file.csv file when installing so that it can be read by access_data.py. To do so I used the package_data keyword in setuptools:

setup(...,
      packages=['my_package'],
      package_data={'my_package': ['./my_package/data_files/my_data_file.csv']},
      include_package_data=True
      )

I also included the file in MANIFEST.in:

recursive-include my_package/data_files *

setup.py seems to run fine and doesn't throw any errors. However, when I import the package I get a file-not-found error because my_data_file.csv is missing. I have tried referencing other Stack Overflow questions (particularly this one) but can't figure out what I'm doing wrong. How can I get setup.py to include the necessary data files?

user144153
  • Are you on a case-insensitive filesystem? `manifest.in` really should be `MANIFEST.in`. But that doesn't matter anyway as `MANIFEST.in` is used for source distribution (sdist). – phd Jul 17 '17 at 23:00
  • The first thing in debugging is to split The Big Problem into many smaller ones. Let's do it step by step. Step 1: check that your distribution (sdist, egg or wheel) really contains `my_data_file.csv`; if not, fix `setup.py` to include it. Step 2: check that `my_data_file.csv` is installed. Step 3: debug why you cannot access the file even if it's in place (wrong path to the file? permissions?) – phd Jul 17 '17 at 23:04
  • your line should be `package_data={'my_package': ['data_files/my_data_file.csv']},` (under the `my_package` package, there's no `my_package` directory) – anthony sottile Jul 18 '17 at 00:46
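
Step 1 from the comment above can be checked with a few lines of standard-library Python. This is only a minimal sketch: the helper names `archive_members` and `contains_file` are made up for illustration, and it assumes the distribution is either a zip-based archive (wheel or egg) or a tar-based sdist.

```python
import tarfile
import zipfile


def archive_members(path):
    """Return the member names of a wheel/egg (zip) or an sdist (tar) archive."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return zf.namelist()
    # tarfile.open() detects gzip/bz2/xz compression on its own in read mode.
    with tarfile.open(path) as tf:
        return tf.getnames()


def contains_file(path, filename):
    """Return True if any archive member ends with the given file name."""
    return any(name.endswith(filename) for name in archive_members(path))
```

Calling `contains_file(<path to the built archive>, "my_data_file.csv")` on whatever ends up in your `dist/` directory tells you whether the packaging step or the installation step is the one losing the file.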

1 Answer


If the file is listed (correctly) in setup.py's package_data, you shouldn't need to include it in MANIFEST.in, as it will be included automatically.

The error is in your package_data line: the paths are relative to the package's own directory (here my_package/my_package/), not to the directory containing setup.py.

In your case it should be:

package_data={'my_package': ['data_files/my_data_file.csv']},

Also note that the key in package_data is the package's dotted module path (not especially relevant for this toy case, however).
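
To see why the original pattern silently matched nothing, here is a rough illustration of the path resolution. It is not setuptools' actual code: `find_data_files` and `make_example_tree` are hypothetical helpers, and the sketch simply assumes each pattern is joined onto the package directory and expanded with a glob.

```python
import glob
import os
import tempfile


def find_data_files(package_dir, patterns):
    """Expand each pattern relative to package_dir and return the matching files."""
    found = []
    for pattern in patterns:
        for path in glob.glob(os.path.join(package_dir, pattern)):
            if os.path.isfile(path):
                found.append(os.path.relpath(path, package_dir))
    return sorted(found)


def make_example_tree(root):
    """Recreate the layout from the question under root; return the package directory."""
    package_dir = os.path.join(root, "my_package", "my_package")
    os.makedirs(os.path.join(package_dir, "data_files"))
    for rel in ("__init__.py", "access_data.py", "data_files/my_data_file.csv"):
        with open(os.path.join(package_dir, rel), "w") as fh:
            fh.write("")
    return package_dir


with tempfile.TemporaryDirectory() as root:
    package_dir = make_example_tree(root)
    # Pattern from the question: looks for my_package/my_package/my_package/...
    wrong = find_data_files(package_dir, ["./my_package/data_files/my_data_file.csv"])
    # Corrected pattern: relative to the package directory itself.
    right = find_data_files(package_dir, ["data_files/my_data_file.csv"])

print("question's pattern matched:", wrong)
print("corrected pattern matched:", right)
```

Under that assumption the question's pattern yields an empty list, which would explain why setup.py ran without complaint but the CSV never made it into the install.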

anthony sottile
  • For those who did not know at once what the dotted name exactly is: `Packages are a way of structuring Python’s module namespace by using “dotted module names”. For example, the module name A.B designates a submodule named B in a package named A.` ([source link](https://docs.python.org/3/tutorial/modules.html)) – jake77 Mar 22 '18 at 06:14
  • How do you include all in a folder? – WJA Nov 16 '19 at 14:00
  • @JohnAndrews it supports globs, so use a `*` -- [for example](https://github.com/pre-commit/pre-commit/blob/0fd4a2ea38df4f276e43397e21ecb3fcbc96d122/setup.cfg#L48) – anthony sottile Nov 16 '19 at 17:40
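
For the layout in the question, the glob form would presumably be `package_data={'my_package': ['data_files/*']}`. The sketch below uses the standard-library `glob` module as a stand-in to show what a single `*` picks up; the file names are invented for the example, and the exact glob behaviour of your setuptools version is an assumption to verify.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as package_dir:
    # Hypothetical package contents: two files directly in data_files/, one nested deeper.
    os.makedirs(os.path.join(package_dir, "data_files", "nested"))
    for rel in ("data_files/a.csv", "data_files/b.json", "data_files/nested/c.csv"):
        with open(os.path.join(package_dir, rel), "w") as fh:
            fh.write("")

    # A single "*" matches the entries directly inside data_files/ only.
    matches = glob.glob(os.path.join(package_dir, "data_files", "*"))
    files = sorted(os.path.basename(p) for p in matches if os.path.isfile(p))

print(files)
```

Note that with a plain `*` the file inside `nested/` is not picked up, so subdirectories would need their own pattern (for example `data_files/nested/*`).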