~/D/s/b/h_adv_ML (master±) ▶︎︎ git ls-files --error-unmatch *.csv
test_rmse.csv
train_rmse.csv
error: pathspec 'genome-scores.csv' did not match any file(s) known to git.
error: pathspec 'genome-tags.csv' did not match any file(s) known to git.
error: pathspec 'links.csv' did not match any file(s) known to git.
error: pathspec 'movies.csv' did not match any file(s) known to git.
error: pathspec 'ratings.csv' did not match any file(s) known to git.
error: pathspec 'tags.csv' did not match any file(s) known to git.
Did you forget to 'git add'?
~/D/s/b/h_adv_ML (master±) ▶︎︎ git commit -m " clean3"
[master b31185a]  clean3
 2 files changed, 1 insertion(+), 465566 deletions(-)
 delete mode 100644 tags.csv
~/D/s/b/h_adv_ML (master) ▶︎︎ git push origin master
Counting objects: 36, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (35/35), done.
Writing objects: 100% (36/36), 192.26 MiB | 2.17 MiB/s, done.
Total 36 (delta 12), reused 0 (delta 0)
remote: Resolving deltas: 100% (12/12), completed with 1 local object.
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: error: Trace: 70338ad9481eef83938b427ed955775e
remote: error: See http://git.io/iEPt8g for more information.
remote: error: File genome-scores.csv is 308.56 MB; this exceeds GitHub's file size limit of 100.00 MB
remote: error: File ratings.csv is 508.73 MB; this exceeds GitHub's file size limit of 100.00 MB
To https://github.com/<...>/h_adv_ML.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'https://github.com/<...>/h_adv_ML.git'

Git tries to push these files even though they are not tracked. What am I missing here? I don't see why the push includes "genome-scores.csv" and "ratings.csv" when both are untracked.

$ git status
On branch master
Your branch is ahead of 'origin/master' by 8 commits.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
$ tree
.
├── README.md
├── README.txt
├── check.py
├── check_old.py
├── code.py
├── genome-scores.csv
├── genome-tags.csv
├── links.csv
├── movies.csv
├── old_code.py
├── ratings.csv
├── read_py.py
├── tags.csv
├── test_rmse.csv
└── train_rmse.csv
Abhishek Bhatia
  • The first two are tracked, or am I wrong? What is the output of `git status`? – Michele Federici Feb 21 '18 at 16:14
  • @m__ The error is not due to the first two files. – Abhishek Bhatia Feb 21 '18 at 16:19
  • Were they added in a previous commit and removed? – Kevin Hoerr Feb 21 '18 at 16:37
  • @KevinHoerr Yes. How can I resolve it? – Abhishek Bhatia Feb 21 '18 at 16:57
  • It won't exactly be easy to resolve. Basically, you have to make sure that those files never existed in the repository history. You'll have to `git reset` the commits that created and removed the files. It depends on what else was in those commits, though. I don't have much experience with this type of problem, sorry. – Kevin Hoerr Feb 21 '18 at 17:25
  • GitHub has an article on this exact issue, actually: https://help.github.com/articles/removing-files-from-a-repository-s-history/ I'm not sure it would really help, though, since the file wasn't created in the last commit. Maybe you can try to squash the commits in a different branch. – Kevin Hoerr Feb 21 '18 at 17:27
  • See @KevinHoerr's answer; but mostly, don't think of `git push` as pushing *files*, because that's not what it does. It pushes *commits*. Commits contain files, so the files do get there, but the push is based on the commits. Commits don't have tracked vs untracked files; commits just have files, or don't have them. Your existing commits do have the files, and that's the problem. – torek Feb 21 '18 at 18:09
  • Possible duplicate of [How to remove/delete a large file from commit history in Git repository?](https://stackoverflow.com/questions/2100907/how-to-remove-delete-a-large-file-from-commit-history-in-git-repository) – phd Feb 21 '18 at 20:05
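
To make the point about commits concrete, here is a minimal sketch that lists the blobs carried by commits that have not been pushed yet. The helper name `list_large_blobs`, the throwaway repository, the 2 KiB stand-in for `ratings.csv`, and the 1 KiB limit are all assumptions made for illustration; `refs/remotes/origin/master` is set by hand so that no real remote is needed.

```shell
#!/bin/sh
# list_large_blobs RANGE LIMIT_BYTES  (hypothetical helper, not a git command)
# Prints "<size> <path>" for every blob introduced by the commits in RANGE
# whose size in bytes exceeds LIMIT_BYTES.
list_large_blobs() {
    git rev-list --objects "$1" |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v limit="$2" '$1 == "blob" && $2 + 0 > limit + 0 {
            size = $2
            sub(/^[^ ]+ [^ ]+ /, "")   # drop type and size, keep the path
            print size, $0
        }'
}

# Throwaway repository that imitates the situation in the question.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

echo "readme" > README.md
git add README.md
git commit -q -m "base"
# Stand-in for the last commit that already exists on the remote.
git update-ref refs/remotes/origin/master HEAD

# A "large" file is committed, then deleted in a later commit.
head -c 2048 /dev/zero > ratings.csv
git add ratings.csv
git commit -q -m "add data"
git rm -q ratings.csv
git commit -q -m "clean"

# The file is no longer tracked ...
git ls-files --error-unmatch ratings.csv 2>/dev/null ||
    echo "ratings.csv is not tracked"

# ... but the unpushed commits still contain it.
large=$(list_large_blobs origin/master..HEAD 1024)
echo "$large"
```

In a real repository the equivalent call would be something like `list_large_blobs origin/master..master $((100 * 1024 * 1024))`, which should name the same files that GitHub's pre-receive hook rejects.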

1 Answer

I've made a couple of comments already, but I think I've come up with a fairly basic solution: you need to "squash" the commits made since the last time you pushed.

  1. Identify the last commit that's on the remote. You can do this with `git log --oneline`: look for the commit with origin/master next to it. Then copy the commit's ID, which is the hex string of ~7 characters at the beginning of the line.

  2. `git reset COMMIT` (e.g. `git reset abcdef0`) moves the branch back to that commit and unstages everything committed after it. This means that all changes made since that commit (the commit itself stays) will be taken out of Git's history on this branch, but kept in your filesystem.

  3. Identify any changes made in the reset commits that you want to keep and recommit them, leaving the large CSV files out. The previous command lists the modified tracked files under 'Unstaged changes after reset'.

  4. You should now be able to push the new commit to the GitHub repository.


Slightly more programmatic version:

# get sha1 revision of origin/master and reset to that
git reset $(git rev-parse origin/master)

# stage changes hunk by hunk, so you can review what you add
# (patch mode skips untracked files; add any new files you want by name)
git add -p

git commit
git push
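
The same flow can be rehearsed end to end in a throwaway repository before touching the real one. This is only a sketch under stated assumptions: the file contents, the 2 KiB stand-in for `ratings.csv`, the `*.csv` ignore rule, and the hand-made `refs/remotes/origin/master` ref are all illustrative, and the final `git push` is left out because there is no remote.

```shell
#!/bin/sh
# Throwaway repository; nothing here touches an existing checkout.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"

echo "print('hi')" > code.py
git add code.py
git commit -q -m "base"
# Stand-in for the last commit that already exists on the remote.
git update-ref refs/remotes/origin/master HEAD

# Unpushed history: data file added, code edited, data file untracked again.
head -c 2048 /dev/zero > ratings.csv
git add ratings.csv
git commit -q -m "add data"
echo "print('bye')" >> code.py
git commit -q -am "edit code"
git rm -q --cached ratings.csv      # removes it from the index, keeps it on disk
git commit -q -m "clean"

# The squash: move the branch back to the pushed commit, keep the working tree.
git reset -q "$(git rev-parse origin/master)"

# Recommit only what should be published and keep data files ignored.
echo "*.csv" > .gitignore
git add .gitignore code.py
git commit -q -m "edit code, ignore data files"

# Check what a push would now send.
new_commits=$(git rev-list --count origin/master..HEAD)
leftover=$(git rev-list --objects origin/master..HEAD | grep -c 'ratings.csv')
echo "commits to push: $new_commits, references to ratings.csv: $leftover"
```

The data file is still on disk after the reset; it is simply no longer part of any commit that would be pushed. Adding the ignore rule is a design choice for the rehearsal, not part of the steps above; it just makes it harder to stage the CSV files again by accident.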
Kevin Hoerr