The problem

I accidentally added my datasets to a commit. When I pushed the commit, GitHub rejected it with the standard file size error (the dataset files are over 100 MB). I reverted that commit using git revert, and then committed only my IPython notebook and some image files.

Now when I push my commit, it still tries to push the dataset files as well.

Using git diff --stat origin/master I found the files to be pushed:

agconti@agconti-Inspiron-5520:~/my_dev/github/US_Dolltar_Vehicle_Currencny$ git diff --stat origin/master

 .ipynb_checkpoints/US_Dollar_Vehicle_Currency-checkpoint.ipynb | 1972 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 CountryNumbers_indexConti.xlsx                                 |  Bin 0 -> 22762 bytes
 Italian Trade/US_Dollar_Vehicle_Currency.ipynb                 | 1972 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 Italian Trade/images/3_year_exchange_rate_volitlity.png        |  Bin 0 -> 11808 bytes
 Italian Trade/images/Currency_usage_breakdown_top20.png        |  Bin 0 -> 404666 bytes
 Italian Trade/images/Currency_usage_observations_top20.png     |  Bin 0 -> 274964 bytes
 Italian Trade/images/Currency_usage_trade_value_top20.png      |  Bin 0 -> 211274 bytes
 Italian Trade/images/Exchange_rate_volitlity_top20.png         |  Bin 0 -> 345899 bytes
 Italian Trade/images/prop_xm_top20.png                         |  Bin 0 -> 258254 bytes
 Italian Trade/images/rate_derive_activity_top20.png            |  Bin 0 -> 214196 bytes
 README.md                                                      |    2 +-
 US_Dollar_Vehicle_Currency.ipynb                               |  809 --------------------------------
 images/3_year_exchange_rate_volitlity.png                      |  Bin 11808 -> 0 bytes
 images/Currency_usage_breakdown_top20.png                      |  Bin 404666 -> 0 bytes
 images/Currency_usage_observations_top20.png                   |  Bin 292532 -> 0 bytes
 images/Currency_usage_trade_value_top20.png                    |  Bin 224008 -> 0 bytes
 images/Exchange_rate_volitlity_top20.png                       |  Bin 361868 -> 0 bytes
 images/exporter_economic_strength_top20.png                    |  Bin 0 -> 166575 bytes
 images/exporter_trade_health_top20.png                         |  Bin 0 -> 277557 bytes
 images/prop_xm_top20.png                                       |  Bin 275777 -> 0 bytes
 images/rate_derive_activity_top20.png                          |  Bin 228728 -> 0 bytes
 libpeerconnection.log                                          |    0
 22 files changed, 3945 insertions(+), 810 deletions(-)

None of these are the dataset files, yet the dataset files are still being pushed.

How can I get git to stop pushing the dataset files?

Here's a look at the error message:

Delta compression using up to 8 threads.
Compressing objects: 100% (27/27), done.
Writing objects: 100% (30/30), 279.15 MiB | 389 KiB/s, done.
Total 30 (delta 7), reused 3 (delta 0)
remote: Error code: c4fe7114933ad585dc5027c82caabdaa
remote: warning: Error GH413: Large files detected.
remote: warning: See http://git.io/iEPt8g for more information.
remote: error: File ForConti_AllItalianImportsVCP.raw is 987.19 MB; this exceeds GitHub's file size limit of 100 MB
remote: error: File italian_imports.csv is 1453.55 MB; this exceeds GitHub's file size limit of 100 MB
remote: error: File italian_imports_random_10percent.csv is 138.30 MB; this exceeds GitHub's file size limit of 100 MB
To https://github.com/agconti/US_Dollar_Vehicle_Currency
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'https://github.com/agconti/US_Dollar_Vehicle_Currency'

What I've tried

How can I see what I am about to push with git?

How to revert Git repository to a previous commit?

agconti
  • Can you just delete your local repository and clone a new copy? – Daenyth Jul 18 '13 at 19:36
  • This isn't something that can be fixed with a `.gitignore`? I'm trying to understand -- is it as if the reverted commit is still getting pushed along with the desired commit? – pattivacek Jul 18 '13 at 19:46
  • @patrickvacek yes, that could definitely work in the future. But right now, after I removed all of the dataset files with `sudo git rm -r --cached datafile.file`, they are still being pushed. I need to untangle this now so that I can push and do things like `.gitignore` in the future. – agconti Jul 18 '13 at 19:48
  • @Daenyth I could, but then I would lose all of my changes since the last push... which is a lot. – agconti Jul 18 '13 at 19:49

3 Answers

Because you used git revert, those files are still referenced in your repository. It sounds like this was your basic sequence:

git add <a bunch of stuff including big files>
git commit                  # creates a commit including the unwanted files
# realize mistake
git revert HEAD             # this creates a new commit on top of the previous one,
                            # that simply undoes all the changes in the previous commit
# now trying to push

If you have no other commits in your master since (or in between) the bad commit and the reversion of the bad commit, in other words your graph looks like this (where Z is the last good commit, A the bad commit, and A' the revert):

 .....Z--A--A' <- master

then the following should help:

 git reset --hard Z

This will reset your master branch back to Z, along with your working directory and index, which means it will look like the bad commit and the reversion never happened.
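A minimal sketch of the reset route in a throwaway repository, assuming the simple Z--A--A' history described above. The file names and commit messages are hypothetical. Note that git reset --hard also discards uncommitted changes in the working directory, so copy anything you still need somewhere safe first.

```shell
#!/bin/sh
set -e

# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "notes" > README.md
git add README.md
git commit -q -m "Z: last good commit"
last_good=$(git rev-parse HEAD)

echo "pretend this is 1 GB" > italian_imports.csv
git add italian_imports.csv
git commit -q -m "A: accidentally add dataset"
git revert --no-edit HEAD > /dev/null      # A'

# Move the branch back to Z; A and A' are no longer part of its history.
git reset -q --hard "$last_good"

# Count how often the dataset appears among objects reachable from HEAD.
remaining=$(git rev-list --objects HEAD | grep -c italian_imports.csv || true)
echo "dataset entries reachable from HEAD: $remaining"
```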

If there are other commits after the A point in the above, you'll need to use git rebase -i Z, and delete the two lines corresponding to A and A' instead.

If there are other changes in A that you want to keep (i.e. it wasn't just the big files that got committed there), you'll need to use the git rebase -i route, and mark A for edit. This will cause rebase to stop at the point it's just done the A commit, and then you can do this:

 git rm --cached <big files>                # remove the files from your index
 git commit --amend                         # fix up the last commit
 git rebase --continue                      # let rebase finish up the rest
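The edit route can be tried out safely in a throwaway repository first. The sketch below is an assumption-laden demo rather than the exact procedure: the file names are hypothetical, and GIT_SEQUENCE_EDITOR is set to a sed command that changes the first todo entry from pick to edit, which is what you would normally do by hand in the editor that git rebase -i opens.

```shell
#!/bin/sh

# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "notes" > README.md
git add README.md
git commit -q -m "Z: last good commit"
last_good=$(git rev-parse HEAD)

# A contains wanted work (a notebook) and the unwanted dataset.
echo "{}" > analysis.ipynb
echo "pretend this is 1 GB" > italian_imports.csv
git add analysis.ipynb italian_imports.csv
git commit -q -m "A: notebook plus dataset by mistake"

echo "more notes" >> README.md
git commit -q -am "B: later work"

# Mark the first todo entry (commit A) as "edit" without opening an editor.
GIT_SEQUENCE_EDITOR='sed -i.bak "1s/^pick/edit/"' git rebase -i "$last_good"

# The rebase is now stopped just after A: drop the dataset from that commit.
git rm -q --cached italian_imports.csv
git commit -q --amend --no-edit
GIT_EDITOR=true git rebase --continue

# The rewritten commits keep the notebook but no longer carry the dataset.
rewritten=$(git rev-list --objects "$last_good"..HEAD)
echo "$rewritten"
```

Because git rm --cached only touches the index, the dataset file itself stays on disk as an untracked file.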

Once you've done either of the above, git push should be able to proceed again...
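Before pushing, it can help to confirm that no oversized blob is left in the commits about to be sent. The helper below is a sketch: list_large_blobs is a hypothetical name, and the demo uses a scratch repository with a deliberately small size limit so it runs quickly.

```shell
#!/bin/sh
set -e

# list_large_blobs RANGE LIMIT_BYTES
# Print "blob <size> <path>" for every blob in RANGE larger than LIMIT_BYTES.
list_large_blobs() {
    git rev-list --objects "$1" |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v limit="$2" '$1 == "blob" && $2 + 0 > limit + 0 { print }'
}

# Demo in a scratch repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo "notes" > README.md
head -c 2048 /dev/zero > big.bin           # 2048-byte stand-in for a dataset
git add README.md big.bin
git commit -q -m "commit with one oversized file"

# Small limit for the demo only.
found=$(list_large_blobs HEAD 1000)
echo "$found"
```

In a real repository the call would look like `list_large_blobs origin/master..HEAD 104857600`, assuming the remote branch is origin/master and treating the 100 MB limit from the error message as 100 * 1024 * 1024 bytes. Empty output means nothing in the range exceeds the limit.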

twalberg

The revert command records an undo operation as a regular commit. The large files are still in the history. Perhaps you need to completely remove both the faulty commit and the reverted commit from the history. With

git rebase -i origin

an editor should open -- if you just delete the lines that indicate the faulty commit and the revert, you should be up and running again. See the Git book for a more detailed description of interactive rebase.

krlmlr
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch italian_imports.csv' --prune-empty -- --all
Gui LeFlea
    **Note:** The above command rewrites the history of every branch (`-- --all`), so you will need a force push to overwrite the history on the remote. Make sure your local history is up to date before running it! To force push to GitHub, use `git push -f`. This is one situation where a force push is required, but be careful about force pushing! – Gui LeFlea Jul 18 '13 at 21:12