I am working on a GitHub repo that I plan to publish as a Python package. I accidentally added and committed a couple of large data files (for testing) to the repository, then removed them in a later commit.
My main concern is that these data files are very large.
My question is two-fold:
1. If someone clones my repo now, will the download size include the large data files, because git would want them on the local system in case someone reverts to an older commit?
2. When I publish this as a Python package, will the installer similarly download these large data files (regardless of whether they are referenced)?
I suspect the answer to 1) is yes and the answer to 2) is no, but I am not sure. If either answer is yes, how do I fix it?
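One way to check the suspicion about 1) would be to list the large blobs that are still reachable from the repository's history, even though they no longer exist in the working tree. The following is only a minimal sketch: the function names `parse_batch_check` and `large_blobs_in_history` and the 10 MB threshold are assumptions made for illustration, and it relies on the standard `git rev-list --objects --all` and `git cat-file --batch-check` commands being available on the PATH.

```python
import subprocess


def parse_batch_check(output, min_bytes):
    """Return (size, path) pairs for blobs of at least min_bytes, largest first.

    Expects lines shaped like "<type> <sha> <size> <path>", as produced by
    git cat-file --batch-check with the format string used below.
    """
    found = []
    for line in output.splitlines():
        parts = line.split(" ", 3)
        # Skip commits, trees, tags, and malformed lines.
        if len(parts) < 3 or parts[0] != "blob":
            continue
        size = int(parts[2])
        path = parts[3] if len(parts) > 3 else ""
        if size >= min_bytes:
            found.append((size, path))
    return sorted(found, reverse=True)


def large_blobs_in_history(repo=".", min_bytes=10_000_000):
    """List large blobs reachable from any ref in the given repository."""
    # Every object reachable from any ref, with its path when it has one.
    objects = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True,
    ).stdout
    # Ask git for the type and size of each of those objects.
    sizes = subprocess.run(
        ["git", "-C", repo, "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        input=objects, check=True, capture_output=True, text=True,
    ).stdout
    return parse_batch_check(sizes, min_bytes)


if __name__ == "__main__":
    for size, path in large_blobs_in_history():
        print(f"{size / 1e6:.1f} MB  {path}")
```

If the deleted data files show up in this listing, they are still stored in the repository's history.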