
I have a Python library that, in addition to regular Python modules, has some data files that need to go in /usr/local/lib/python2.7/dist-packages/mylibrary.

Unfortunately, I have been unable to convince setup.py to actually install the data files there. Note that the problem occurs with the install command, not sdist.

Here is a slightly redacted version of setup.py:

module_list = list_of_files

setup(name         ='Modules',
      version      ='1.33.7',
      description  ='My Sweet Module',
      author       ='PN',
      author_email ='email',
      url          ='url',
      packages     = ['my_module'],

# I tried this. It got installed in /usr/my_module. Not ok.

      # data_files   = [ ("my_module",  ["my_module/data1",
      #                                  "my_module/data2"])]

# This doesn't install it at all.
      package_data = {"my_module" : ["my_module/data1",
                                     "my_module/data2"] }
     )

This is Python 2.7 (it will eventually have to run on 2.6 as well), and it must work on Ubuntu releases from 10.04 up to 12+. I am currently developing on 12.04.

Paul Nathan
  • Minimal runnable published working example at: https://stackoverflow.com/questions/3596979/manifest-in-ignored-on-python-setup-py-install-no-data-files-installed/60735402#60735402 – Ciro Santilli OurBigBook.com Mar 18 '20 at 07:54

3 Answers


UPD: package_data accepts a dict of the form {'package': ['list', 'of?', 'globs*']}. To make it work, specify shell globs relative to the package directory, not file paths relative to the distribution root.
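As a rough illustration of that rule, the sketch below applies it to the question's layout. `expand_package_data` is a hypothetical helper written for this example (it is not a setuptools or distutils API); it only mimics the idea that each glob is joined onto the package directory before matching. The throwaway directory tree is likewise an assumption made for the demo.

```python
import glob
import os
import tempfile

# Patterns are relative to the package directory (my_module/), so the
# "my_module/" prefix used in the question is dropped.
PACKAGE_DATA = {"my_module": ["data1", "data2"]}


def expand_package_data(project_root, package_data):
    """Hypothetical helper: list the files each package_data glob matches.

    Each pattern is joined onto the package directory, which is why paths
    written relative to the distribution root match nothing.
    """
    found = {}
    for package, patterns in package_data.items():
        package_dir = os.path.join(project_root, *package.split("."))
        files = []
        for pattern in patterns:
            files.extend(glob.glob(os.path.join(package_dir, pattern)))
        found[package] = sorted(
            os.path.relpath(path, project_root)
            for path in files
            if os.path.isfile(path)
        )
    return found


# Build a throwaway project tree that mirrors the question's layout.
demo_root = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_root, "my_module"))
for name in ("__init__.py", "data1", "data2"):
    open(os.path.join(demo_root, "my_module", name), "w").close()

# Globs relative to the package directory find both files.
good = expand_package_data(demo_root, PACKAGE_DATA)

# The question's paths (relative to the distribution root) find nothing.
bad = expand_package_data(
    demo_root, {"my_module": ["my_module/data1", "my_module/data2"]}
)
```

In setup.py itself, the equivalent change would be passing `package_data=PACKAGE_DATA` to `setup()`.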

data_files has a different meaning and, in general, is best avoided.

With setuptools you only need include_package_data=True, but the data files must be tracked by a version control system that setuptools knows about (by default it recognizes only CVS and SVN; install setuptools-git or setuptools-hg if you use git or hg).


With setuptools you can:

- in MANIFEST.in:

    include my_module/data*

- in setup.py:

    setup(
        ...
        include_package_data = True,
        ...
    )
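To check whether the data files actually ended up next to the installed package, a small sketch like the following may help. `installed_data_path` is a hypothetical helper introduced here for illustration; it assumes the package is installed as a regular directory (not a zipped egg) so that the module's `__file__` points at a real path.

```python
import os


def installed_data_path(module, filename):
    """Hypothetical helper: where `filename` should sit beside `module`."""
    package_dir = os.path.dirname(os.path.abspath(module.__file__))
    return os.path.join(package_dir, filename)


# Usage after running the install (my_module is the package from the question):
#   import my_module
#   print(os.path.exists(installed_data_path(my_module, "data1")))
```

If the check prints False after installation, the file was not picked up by either MANIFEST.in or include_package_data.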
Mikhail T.
podshumok
  • hm. this is a bit short. Can you elaborate on what the respective actions cause? And do I have to do both things or either of them? – Frederick Nord Oct 08 '14 at 16:58
  • This is the method that worked independent of platform. When I used the accepted answer it worked on Mac OS, but on a Linux VM the data files got copied to strange places. – Kaushik Ghose Oct 10 '14 at 17:25
  • When I included `include my_module/data*` inside of the MANIFEST and had `data_files` defined in setup.py, the install did not copy the data files. Removing the data_files definition in setup.py while leaving the include in the MANIFEST resulted in expected behavior. – dmmfll Oct 31 '15 at 11:14
  • include_package_data = True, has generally not worked for me. – Matt Joyce Jan 30 '18 at 16:15
  • -1: You actually *do* need to use the `data_files` parameter, especially if you're installing from a remote, such as `pip3 install git+https://somesite.com/path/to/repo.git`. Otherwise, the data files *fail* to get included as part of the installation and your program will break (like mine did). So, you need *both* `data_files` and `include_package_data` in your `setup` function. – code_dredd Sep 17 '18 at 07:34
  • @code_dredd this is strange: include_package_data works for me in scenario like yours. What are the versions of pip and setuptools that you use? Maybe it is not git+https but some other kind of remote? pip should clone -d1 the repo to a temporary folder and run setup.py there. Check if running from local folder (`pip install .`) will work properly. – podshumok Sep 17 '18 at 19:57
  • @podshumok I'm using `pip 9.0.1` for Python3 on Ubuntu 18.04 LTS (latest at this time). The repository URL is indeed `git+https://...`; I created it (in GitLab) and the remote is correct. It does clone to a `/tmp/...` dir. The thing is, `python3 setup.py build sdist && pip3 install .` would work just fine without having to use `data_files`, but not from a remote -which took some time to figure out. I've read that there're different behaviors depending on the kinds of builds that you do (e.g. `sdist` vs `bdist`), though I don't have a link to such a post at this time or know if it's related. – code_dredd Sep 17 '18 at 21:48
  • @podshumok BTW, to clarify, the `git+` prefix is really necessary if you want `pip` to build and install directly from a URL that points to a Git repository, rather than, say, an existing package from https://pypi.org/ – code_dredd Sep 17 '18 at 23:18
  • the OP has a tag `python2.7`, and you guys are discussing w.r.t. `pip3` – Ciasto piekarz Jan 11 '20 at 13:16

http://docs.python.org/distutils/setupscript.html#installing-additional-files

If directory is a relative path, it is interpreted relative to the installation prefix (Python’s sys.prefix for pure-Python packages, sys.exec_prefix for packages that contain extension modules).

This will probably do it:

# Each entry is (install directory, [source files relative to setup.py]).
data_files = [("local/lib/python2.7/dist-packages/my_module",
               ["my_module/data1",
                "my_module/data2"])]

Or use join to add the prefix to the install directory:

import os
import sys

data_dir = os.path.join(sys.prefix, "local/lib/python2.7/dist-packages/my_module")
data_files = [(data_dir, ["my_module/data1",
                          "my_module/data2"])]
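To avoid hardcoding the Python version in the path, the target directory can be asked of the interpreter instead. This is a minimal sketch: `build_data_files` is a hypothetical helper, the fallback to `sysconfig` is an addition for interpreters where `distutils` is unavailable, and the directory returned depends on the platform and may not be the /usr/local location from the question.

```python
import os

try:
    # Available on Python 2.6/2.7 and on Python 3 releases that ship distutils.
    from distutils.sysconfig import get_python_lib
    SITE_PACKAGES = get_python_lib()
except ImportError:
    # Assumed fallback for interpreters without distutils.
    import sysconfig
    SITE_PACKAGES = sysconfig.get_paths()["purelib"]


def build_data_files(package, filenames, install_root=SITE_PACKAGES):
    """Hypothetical helper: build a data_files entry for one package.

    The first tuple element is the install directory; the second is the
    list of source files, given relative to setup.py.
    """
    target_dir = os.path.join(install_root, package)
    sources = [os.path.join(package, name) for name in filenames]
    return [(target_dir, sources)]


data_files = build_data_files("my_module", ["data1", "data2"])
```

The resulting list can then be passed as `data_files=data_files` in the `setup()` call.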
monkut
  • Hmmm. Reluctant to hardcode the path in, but that might serve for now. – Paul Nathan Jun 28 '12 at 00:56
  • I used distutils.sysconfig.get_python_lib() + "path" and used that as the key. – Paul Nathan Jun 29 '12 at 21:02
  • This is not the proper way to do it. `data_files` is for files you want to put under /usr (e.g. icons, .desktop files, etc.). If you want to include data along with your Python module you use `package_data` along with the `include_package_data=True` flag. – Grumbel Feb 17 '18 at 09:05
  • I think the `sysconfig` attribute is not present in distutils for python2.7, and `include_package_data=True` doesn't work either – Ciasto piekarz Jan 11 '20 at 13:49
  • Various web pages claim that `data_files` locates the installed content somewhere under `sys.prefix`, `sys.exec_prefix`, or `site.USER_BASE`. I tried it and it dropped them under `/usr`, which is none of those. – Jonathan Ross Feb 12 '21 at 20:47

The following solution worked fine for me. You need a MANIFEST.in file in the same directory as setup.py.

Add the following line to the manifest file:

recursive-include mypackage *.json *.md # can be extended with more extensions or file names.

Another solution is to add the following lines to the MANIFEST.in file:

graft mypackage # will copy the entire package including non-python files.
global-exclude __pycache__ *.txt # list files you don't want to include here.

Now, when you run pip install, all the necessary files will be included.

Hope this helps.

UPDATE: Make sure that you also have include_package_data=True in setup.py.
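To preview which files a line such as `recursive-include mypackage *.json *.md` would likely pick up, something like the sketch below can be run from the project root. `preview_recursive_include` is a hypothetical helper that only approximates the matching with `os.walk` and `fnmatch`; it is not the code that distutils or setuptools actually runs, and the temporary directory tree exists only for the demo.

```python
import fnmatch
import os
import tempfile


def preview_recursive_include(directory, patterns):
    """Hypothetical helper: files under `directory` whose names match a pattern."""
    matched = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                matched.append(os.path.join(root, name))
    return sorted(matched)


# Build a throwaway package tree to demonstrate the matching.
demo_root = tempfile.mkdtemp()
package_dir = os.path.join(demo_root, "mypackage")
os.makedirs(os.path.join(package_dir, "sub"))
for relative in ("__init__.py", "config.json", os.path.join("sub", "notes.md")):
    open(os.path.join(package_dir, relative), "w").close()

# Only the .json and .md files are reported; __init__.py is not a data file.
included = [
    os.path.relpath(path, demo_root)
    for path in preview_recursive_include(package_dir, ["*.json", "*.md"])
]
```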

shmsi