I would like to find the proper way to include files in a python sdist that are not tracked by git.

Context

The .mo files from my project are not tracked by git (like some other .txt files that need to be created at install time).

I have written a small function in setup.py to create them at install time, that I call in setup():

setup(
    .
    .
    .
    data_files=create_extra_files(),
    include_package_data=True,
    .
    .
    .
)

Note that they should belong to data_files because the documentation says:

The data_files option can be used to specify additional files needed by the module distribution: configuration files, message catalogs, data files, anything which doesn’t fit in the previous categories.

So, this works well with python3 setup.py install (and bdist too). The .mo files are generated and stored at the right place.

But for this to work with sdist, I must include them in MANIFEST.in (e.g. recursive-include mathmaker *.mo). Indeed, the documentation says:

Changed in version 3.1: All the files that match data_files will be added to the MANIFEST file if no template is provided. See Specifying the files to distribute.

(The link doesn't help much).

I am reluctant to include *.mo files in MANIFEST.in since they are not tracked by git. check-manifest doesn't like this situation either: it complains that the lists of files in version control and in the sdist do not match!

So, is there a way to fix this ugly situation?

Steps to reproduce the situation

Environment and project

To avoid polluting your environment, create and activate a dedicated virtual environment (python3.4+) in the directory of your choice:

$ pyvenv-3.4 v0
$ source v0/bin/activate
(v0) $

Reproduce the following tree in a project0 directory:

.
├── .gitignore
├── MANIFEST.in
├── README.rst
├── setup.py
└── project0
    ├── __init__.py
    ├── main.py
    └── data
        └── dummy_versioned.po

Where README.rst, __init__.py and dummy_versioned.po are empty.

Content of the other files:

  • .gitignore:

    build/
    dist/
    *.egg-info
    project0/data/*.txt
    *~
    
  • MANIFEST.in:

    recursive-include project0 *.po
    recursive-include project0 *.txt
    
  • main.py:

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    
    
    def entry_point():
        with open('project0/data/a_file.txt', mode='rt') as f:
            print(f.read())
    
  • setup.py:

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    
    import platform
    from setuptools import setup, find_packages
    
    
    def create_files():
        txt_file_path = 'project0/data/a_file.txt'
        with open(txt_file_path, mode='w+') as f:
            f.write("Some dummy platform information: " + platform.platform())
        return [('project0/data', [txt_file_path])]
    
    
    setup(
        name='project0',
        version='0.0.1',
        author='J. Doe',
        author_email='j.doe@someprovider.com',
        url='http://myproject.url',
        packages=find_packages(),
        data_files=create_files(),
        include_package_data=True,
        entry_points={
            'console_scripts': ['myscript0 = project0.main:entry_point'],
        }
    )
    

Start a local git repo:

(v0) $ git init
(v0) $ git add .

Install check-manifest:

(v0) $ pip3 install check-manifest

Install and test

Installation works:

(v0) $ python3 setup.py install
.
.
.
copying project0/data/a_file.txt -> build/lib/project0/data
.
.
.
Finished processing dependencies for project0==0.0.1
(v0) $ myscript0 
Some dummy platform information: Linux-3.16.0-29-generic-x86_64-with-Ubuntu-14.04-trusty

If you rm project0/data/a_file.txt, myscript0 no longer works; reinstall and it works again, as expected.

Building the sdist also includes a_file.txt:

(v0) $ python3 setup.py sdist
.
.
.
hard linking project0/data/a_file.txt -> project0-0.0.1/project0/data
.
.
.

Note that for this file to be included in the sdist, it seems necessary (as explained in the "Context" part above) to have recursive-include project0 *.txt in MANIFEST.in. If you remove this line, python3 setup.py sdist no longer mentions a_file.txt (remember to remove any previous build/ or dist/ directories to observe this).
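Rather than reading the sdist log, the archive itself can be inspected. The following is a minimal sketch using only the standard library; the helper names sdist_members and sdist_contains are hypothetical, and the archive path in the usage comment assumes the default dist/ output directory and a .tar.gz format.

```python
import tarfile


def sdist_members(archive_path):
    """Return the sorted names of the regular files stored in an sdist tarball."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


def sdist_contains(archive_path, relative_path):
    """Tell whether relative_path is in the sdist.

    Member names start with a top-level '<name>-<version>/' directory,
    which is stripped before comparing.
    """
    return any(
        name.split("/", 1)[-1] == relative_path
        for name in sdist_members(archive_path)
    )


# Hypothetical usage, after running `python3 setup.py sdist`
# (assumes the archive was written to dist/project0-0.0.1.tar.gz):
# sdist_contains("dist/project0-0.0.1.tar.gz", "project0/data/a_file.txt")
```

Running the check once with and once without the recursive-include project0 *.txt line should show whether MANIFEST.in is what pulls a_file.txt into the archive.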

Conclusion

So, everything works as it is, but there is this discrepancy: a_file.txt is not tracked by git, but is included in MANIFEST.in.

check-manifest states it clearly:

lists of files in version control and sdist do not match!
missing from VCS:
  project0/data/a_file.txt
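Conceptually, this warning comes down to a set difference between two file lists. The sketch below only illustrates that idea and is not check-manifest's actual implementation; compare_file_lists is a hypothetical helper, and the two lists are written by hand from the tree above (a real sdist also contains generated metadata files, which are left out here).

```python
def compare_file_lists(vcs_files, sdist_files):
    """Return (missing_from_vcs, missing_from_sdist) as sorted lists."""
    vcs, sdist = set(vcs_files), set(sdist_files)
    return sorted(sdist - vcs), sorted(vcs - sdist)


# Files tracked by git in the example project (hand-written, illustrative).
tracked = [
    ".gitignore",
    "MANIFEST.in",
    "README.rst",
    "setup.py",
    "project0/__init__.py",
    "project0/main.py",
    "project0/data/dummy_versioned.po",
]
# The sdist additionally ships the generated, untracked file.
packaged = tracked + ["project0/data/a_file.txt"]

missing_from_vcs, missing_from_sdist = compare_file_lists(tracked, packaged)
print(missing_from_vcs)    # ['project0/data/a_file.txt']
print(missing_from_sdist)  # []
```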

So, is there a proper way to handle this situation?

zezollo
  • OK, then I can remove it. But on Stack Overflow, won't I be told that I am talking about a case that has no problem? Wouldn't it be better to edit this question on CodeReview to add an MCVE to it? (I think it's possible) – zezollo Jul 18 '16 at 14:42
  • If you edit in actual code that works, we can review it. – syb0rg Jul 18 '16 at 14:44
  • @zezollo We deal in real code, not an MCVE. Perhaps this question is more appropriate on Programmers SE, but please [follow the tour](http://programmers.stackexchange.com/tour) and read ["How do I ask a good question?"](http://programmers.stackexchange.com/help/how-to-ask), ["What topics can I ask about here?"](http://programmers.stackexchange.com/help/on-topic) and ["What types of questions should I avoid asking?"](http://programmers.stackexchange.com/help/dont-ask) before posting. – BCdotWEB Jul 18 '16 at 14:47
  • Can you generate `MANIFEST.in` in a _pre-commit_ hook? (I haven't used _setup.py_ before, but I think this may solve the problem.) – AmirHossein Jul 18 '16 at 18:22
  • @syb0rg I have added working code. It's a bit long to describe, but everything is here. Tell me if it's too long or still not on topic. – zezollo Jul 19 '16 at 08:00
  • @BCdotWEB Do you mean an MCVE can also not be real code? – zezollo Jul 19 '16 at 08:01
  • @zezollo ["we ask for your actual, real code and not a MCVE/boiled-down/'simplified version' of the code to be reviewed"](http://meta.codereview.stackexchange.com/a/6256/10582) – BCdotWEB Jul 19 '16 at 08:04
  • @BCdotWEB OK, well, I understand. Now, my question is exclusively about the code I have posted in the EDIT. No one can consider there's any *real code behind* it (except I've anonymized the posted part). Is there still a problem? – zezollo Jul 19 '16 at 08:33
  • It's not clear what code you are asking us to review: `setup.py`? Also, judging from your Conclusion section, the process doesn't work as intended, so it would still be off-topic for Code Review. – 200_success Jul 19 '16 at 18:21
  • @200_success The process itself does work, there's no bug (that's why I didn't think it would fit on Stack Overflow). There may be something to change in `setup.py` or `MANIFEST.in`, or maybe a missing file to add (plus a missing configuration?); this is what I don't know. I'd like to know what is usually done in such a case (so it belongs to "Does this code follow common best practices?"). Or do you mean there should be an algorithm and my question is more about configuration? Would it be better to delete it and re-ask on Stack Overflow? – zezollo Jul 20 '16 at 05:30
  • Is your question about how to suppress the "lists of files in version control and sdist do not match!" warning? – 200_success Jul 20 '16 at 05:37
  • @200_success Not far from that, but not exactly: maybe there is indeed a way to change the code that would result in suppressing this warning (while not including another bad practice). In this case, yes (do you mean, then it can be considered like a bug, or kind of?). But maybe this warning cannot be avoided: it's not good practice but as of today this situation cannot be handled better, so people follow this practice for lack of anything better OR maybe it's just okay this way and `check-manifest` outputs a warning it shouldn't. Sorry for all that and thanks for your replies! – zezollo Jul 20 '16 at 06:06
  • Since you are [asking a specific question rather than soliciting open-ended feedback](http://meta.codereview.stackexchange.com/questions/5777/a-guide-to-code-review-for-stack-overflow-users), this is a question for Stack Overflow rather than Code Review. "How do I suppress this warning?", "Is this a bug?", or "Is there a proper way to …?" are all specific questions that would be on-topic for Stack Overflow. – 200_success Jul 20 '16 at 09:17
  • @200_success alright, many thanks for all this! And sorry to have caused all these discussions! – zezollo Jul 20 '16 at 09:25

1 Answer

As far as I understand your problem, you would like to distribute files with the git repository without keeping track of their changes.

This can be done in these four simple steps:

Step 0: First, ensure that the content of path/a_file.txt matches the content you want to distribute. As far as I know it can't be empty, so if you simply want this file to exist, add a newline or space character to it.

Step 1: Add the file(s) to git using git add path/a_file.txt

Step 2: Commit the files (git commit path/a_file.txt)

Step 3: Update git's index and tell git to ignore further changes to the files: git update-index --assume-unchanged path/a_file.txt

If you ever want to make changes to this file that should be tracked again, use the --no-assume-unchanged flag to reactivate it in git's index, then commit the changes.

Note that creating a .gitignore file that tells git to ignore the files (on all machines that clone the repository) and then using git add --force path/a_file.txt won't work, since git will (force-)add the file to the index and keep tracking its changes.
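For a scripted version of steps 1 to 3, the commands can be assembled and run from Python. This is only a sketch under assumptions: the helper names tracking_freeze_commands and run_all are hypothetical, the commit message is a placeholder, and a -m option is added so that git commit does not open an editor.

```python
import subprocess


def tracking_freeze_commands(path, message="Add generated file"):
    """Build the git commands for steps 1 to 3 as argument lists."""
    return [
        ["git", "add", path],                                  # Step 1
        ["git", "commit", "-m", message, "--", path],          # Step 2
        ["git", "update-index", "--assume-unchanged", path],   # Step 3
    ]


def run_all(commands, cwd=None):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)


# Hypothetical usage, from inside the repository:
# run_all(tracking_freeze_commands("path/a_file.txt"))
```

Building the argument lists separately from running them makes it easy to print and review the commands before anything touches the repository.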

0xpentix