My repo includes a mix of file types, including .csv. When I run `git add .` and then `git status`, I see:

git status
On branch master
Your branch is ahead of 'origin/master' by 2 commits.
  (use "git push" to publish your local commits)

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
    modified:   .ipynb_checkpoints/dpm-checkpoint.ipynb
    new file:   .ipynb_checkpoints/dp-checkpoint.ipynb
    modified:   App Data/dfr.csv

The CSV files are rather large, and GitHub rejects the push:

git push origin master
Enumerating objects: 852, done.
Counting objects: 100% (852/852), done.
Delta compression using up to 8 threads
Compressing objects: 100% (839/839), done.
Writing objects: 100% (842/842), 373.87 MiB | 7.10 MiB/s, done.
Total 842 (delta 26), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (26/26), completed with 5 local objects.
remote: warning: File App Data/dfr.csv is 95.38 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
To https://github.com/ks/SP.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'https://github.com/ks/SP.git'

How do I either compress the CSV files using git and then push them, or skip CSV files altogether?

kms
  • Does this answer your question? [gitignore all files of extension in directory](https://stackoverflow.com/questions/10712555/gitignore-all-files-of-extension-in-directory) – TonyArra Nov 20 '22 at 01:30
  • An alternative option would be [Git LFS](https://docs.github.com/en/repositories/working-with-files/managing-large-files/configuring-git-large-file-storage), which allows committing the big files. – TonyArra Nov 20 '22 at 01:31
  • @TonyArra I created a `.gitignore` file and added `**\*.csv`, but it still throws an error. – kms Nov 21 '22 at 04:55

1 Answer

Normally, we don't put data files on GitHub.

In your case, first, use a `.gitignore` file to exclude the files you don't want git to track (here, the data files).

Second, because you have already added the data files to the staging area, you have to remove them from the staging area (also called the index or cache) with this command:

git rm --cached <path-to-file>

Then add those files to the `.gitignore` file you created above.

Finally, commit and push:

git add .
git commit -m "Stop tracking CSV files"
git push origin master
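The steps above can be sketched end to end in a throwaway repository. This is only an illustration under assumptions: the temporary directory, the file contents, the extra `other.csv` and `README.txt` files, and the commit messages are made up for the demo, and the push is left out. It uses a quoted `'*.csv'` pattern so that git, not the shell, expands it, which also reaches files inside `App Data`.

```shell
#!/bin/sh
set -eu

# Work in a throwaway repository so the sketch cannot touch a real project.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

# Reproduce the situation: CSV files are already tracked.
mkdir "App Data"
echo "a,b" > "App Data/dfr.csv"
echo "c,d" > "App Data/other.csv"
echo "notes" > README.txt
git add .
git commit -q -m "Initial commit with CSV files"

# 1. Ignore CSV files from now on.
echo "*.csv" > .gitignore

# 2. Untrack every CSV. The pattern is quoted, so git expands it and matches
#    files in subdirectories too. --cached leaves the files on disk.
git rm -q --cached -- '*.csv'

# 3. Stage .gitignore and commit the removal.
git add .
git commit -q -m "Stop tracking CSV files"

# List what git still tracks.
git ls-files
```

After this, the CSV files are still on disk but no longer tracked. Note that this only affects future commits: if a large file is already in a local commit that has not been pushed, that commit still contains the file, so the push can still be rejected until that commit is rewritten.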
Manh Do
  • I created a `.gitignore` file with `*.csv`, but it still stages the csvs. The file is in the root directory of the git repo. – kms Nov 21 '22 at 06:00
  • OK, can you show me the result of `git status` after you run `git rm --cached App Data/dfr.csv` or `git restore --staged App Data/dfr.csv`? – Manh Do Nov 21 '22 at 06:05
  • I have several `.csv` files that are staged, so I'd have to run `git rm --cached App Data/dfr.csv` for all of them. – kms Nov 21 '22 at 06:09
  • For the `.gitignore` rules, you can refer to [this guide](https://www.atlassian.com/git/tutorials/saving-changes/gitignore) – Manh Do Nov 21 '22 at 06:24
  • Not sure I can run a command for each file; there are literally 100+ `csv` files. – kms Nov 21 '22 at 06:27
  • What do you mean? – Manh Do Nov 21 '22 at 06:30
  • `App Data/dfr.csv` is one of 100+ `csv` files, and I'd have to run `git rm --cached` for each file. – kms Nov 21 '22 at 06:33
  • You can pass multiple files: `git rm --cached App Data/*.csv` – Manh Do Nov 21 '22 at 06:35
  • I keep getting `fatal: pathspec 'App' did not match any files` when I run `git rm --cached App Data/*.csv` – kms Nov 21 '22 at 06:56
  • Why does the folder name contain white space? You can rename `App Data` to `AppData`. This problem is related to the file path. – Manh Do Nov 21 '22 at 07:00
  • Yeah, I tried renaming it to remove the white space, but since the files are already tracked and staged, git doesn't recognize the change. – kms Nov 21 '22 at 07:26
  • I don't think so. If the file is tracked, git will complain when you rename it. – Manh Do Nov 21 '22 at 07:28
  • I did rename `App Data` to `AppData` and then ran the `git rm --` command. I get `fatal: pathspec 'AppData/dfr.csv' did not match any files`, and when I run `git status`, I still see `App Data/dfr.csv`. – kms Nov 21 '22 at 07:51
  • OK, have you tried `./AppData/dfr.csv`? – Manh Do Nov 21 '22 at 07:53
  • Yes, no luck. I tried one of the files with that path, and also `./AppData/*.csv`. – kms Nov 21 '22 at 08:11
  • I'm sorry, I know very little about paths, but I think the approach I described above for skipping large files when committing to GitHub is OK. You can set the existing project aside and try another project whose path has no white space, or with only the data file in the project root. If there is any problem, please tell me. – Manh Do Nov 21 '22 at 08:19
  • You don't need to include the folder in your command. `git rm --cached *.csv` will remove them recursively. Also, if you renamed the folder, you need to rename it back to what it was previously. See [Pro Git](https://git-scm.com/book/en/v2/Git-Basics-Recording-Changes-to-the-Repository) for details on `gitignore` and `git rm`. – TonyArra Nov 22 '22 at 23:28
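
The `fatal: pathspec 'App' did not match any files` error in the thread above looks like what the shell produces when a path containing a space is not quoted: `App Data/dfr.csv` reaches git as two arguments, `App` and `Data/dfr.csv`. A minimal sketch of the difference, assuming a throwaway repository with made-up file contents and a placeholder identity:

```shell
#!/bin/sh
set -eu

# Throwaway repository that mirrors the folder name from the question.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q .
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"
mkdir "App Data"
echo "a,b" > "App Data/dfr.csv"
git add .
git commit -q -m "Track a CSV"

# Unquoted: the shell splits the path at the space, so git receives the
# arguments "App" and "Data/dfr.csv" and neither matches a tracked file.
if git rm --cached App Data/dfr.csv 2>/dev/null; then
    echo "unquoted path matched (not expected)"
else
    echo "unquoted path failed"
fi

# Quoted: the whole path reaches git as a single argument.
git rm -q --cached -- "App Data/dfr.csv"

# Nothing is tracked any more, but the file itself is still on disk.
git ls-files
ls "App Data"
```

Quoting (or escaping the space as `App\ Data`) avoids the need to rename the folder at all.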