3

I've been setting up my page on Stack Overflow Jobs and I noticed that on one of my repositories I have 3,521,316 additions and 3,459,307 deletions, which didn’t seem right, so I decided to investigate. Using GitHub’s contributions page, I localized the changes to January 26–27, where it says there were 30 commits with 3,507,040 additions and 3,453,801 deletions. When I click the 30 commits text to see the commits however, there are only two, with a total of 208 additions and 152 deletions. I even checked all the other branches to see if they had other commits in that time range, and none of them did.

I'd like to have the contribution counts accurate for my SO Jobs page, on top of just wanting them accurate for accuracy’s sake, but I have no idea why they’re so wildly incorrect or how to correct them. I've searched online for solutions, but everything I found is about contributions not appearing, not about too many contributions appearing.

thecodewarrior
  • Something with the *contributors* graph seems wrong because it does not show [this commit on Jan 31](https://github.com/thecodewarrior/Bitfont/commit/4096f9c8ca1be52e9d3997ae4fc7037170465adb) which has 500k +/-. – Matt Clark Feb 07 '20 at 21:25

3 Answers

3

After stepping back through the commit history, there were some massive commits where I changed a number of huge JSON data files, so it doesn’t appear to be an error on GitHub’s part (aside from ascribing all of those changes to a single day on the Contributors page). Knowing that there were in fact a huge number of line changes, I set out trying to work out how to ignore those files, and ran across this issue, which led me to this section of GitHub’s Linguist project’s README. After a bit of fooling around I figured out that by marking the files as generated in the .gitattributes file they would be excluded from diffs, and thus presumably their lines would be excluded from the total contributions. As of right now my total contributions have not been corrected, but the Linguist page noted that the updates are run on a lower-priority queue, so it may take some time.

To ignore a file, add one of these attributes to it in your .gitattributes file. The .gitattributes file uses the same pattern syntax as .gitignore files. If you need to do so retroactively, you’ll need to add/modify the .gitattributes file, create a commit, then rebase to insert it into the past.

    *.txt linguist-generated
    # `linguist-generated` marks a file as generated, so it won't count toward
    # language statistics or commit additions/deletions.

    README.txt -linguist-generated
    # prepending an attribute with a `-` removes it from the file

    /libs/somelibrary.js linguist-vendored
    # `linguist-vendored` marks a file as an external file such as a library. This
    # file will still appear in commit diffs, but it won't contribute to the
    # repository's language statistics

    /docs/** linguist-documentation
    # `linguist-documentation` marks a file as documentation. This has the same
    # effect as `linguist-vendored`.

    /configs/*.json linguist-detectable
    /tools/merge_configs.py -linguist-detectable
    # `linguist-detectable` marks a file to be counted in language statistics.
    # By default it is enabled for programming languages, so you can use it to
    # either include non-code files, or exclude code files.
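Before committing, it can help to confirm that a pattern actually matches the files you meant. The following is only a sketch: it builds a throwaway repository, and the `data/glyphs.json` path and the `data/*.json` pattern are hypothetical stand-ins for your own files. It relies on `git check-attr`, which reports how Git resolves an attribute for a path; it says nothing about when or whether GitHub recomputes its statistics.

```shell
# Scratch repository so the check does not touch a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical layout: a large data file that should be treated as generated.
mkdir data
echo '{}' > data/glyphs.json
printf 'data/*.json linguist-generated\n' > .gitattributes

# Ask Git which value the attribute resolves to for that path.
result=$(git check-attr linguist-generated -- data/glyphs.json)
echo "$result"
```

If the pattern matches, the reported value should be `set`; `unspecified` would suggest the pattern does not cover the path.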
thecodewarrior
1

I had the same issue with the GitHub repository contributions graph and finally found out how to fix it. It's worth mentioning that Linguist didn't help me.

If something goes wrong in either the first or the second part, here you can find how to undo the rebase. Please don't forget to keep a backup of your repository.

Changing the commit's author name

Note: this will rewrite the repository's commit history, and all commits that were pushed after the bad commit will be shown as pushed today in {repo-url}/commits/{branch} (the commits' original dates won't change).

More detailed source.

  • First of all, create a backup of your repository and save it somewhere.

  • Now find the hash of your bad commit.

  • Then find the commit that comes right before it (the earlier commit). You can do this with git log.

  • Copy that commit hash and do the following:

    git rebase -i earlier_commit_here
    
  • The text editor will appear. Find your bad commit and change the word "pick" next to it to "edit" (in Vim, press i, change the text, press Esc, then type :wq to save and quit).

  • Now change your commit's author to something similar without an e-mail address (source):

    git commit --amend --author="nocontribute <>"
    
  • To finish rebase type the following:

    git rebase --continue
    
  • Force push the history:

    git push --force
    
  • Wait for the contributors graph to update (this can take up to 24 hours).

In my case I had messed with my repository and the steps above didn't help. If your contributors graph is still the same, you can do the following:
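Two small helpers fit around the steps above: resolving the commit that precedes the bad one, and checking that the author was really rewritten before force-pushing. This is a sketch in a throwaway repository; the identity, file names, and commit messages are made up for illustration. Because the bad commit here is the latest one, it is amended directly; for an older commit you would still go through `git rebase -i` as described.

```shell
# Scratch repository with one good and one "bad" commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo one > a.txt && git add a.txt && git commit -q -m "good commit"
echo two > b.txt && git add b.txt && git commit -q -m "bad commit"

bad=$(git rev-parse HEAD)
# "<commit>^" names the first parent, i.e. the hash to pass to `git rebase -i`.
earlier=$(git rev-parse "$bad^")

# The bad commit is HEAD in this sketch, so no rebase is needed to amend it.
git commit -q --amend --no-edit --author="nocontribute <>"

# Confirm the new author before pushing anything.
author=$(git log -1 --format='%an <%ae>')
echo "$author"
```

In a real repository, `git rebase -i "$bad^"` saves looking up the earlier hash by hand.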

Creating a new repository and moving all commits to it

The main idea of these steps is to cherry-pick and push the bad commit separately from the other commits, so keep that in mind if you have more than one bad commit. If you push all commits together, the bad commit will still be counted even with the "nocontribute" author. Weird but true. It took me a lot of time to figure this out.

The previous steps must be completed before you proceed.

Note: this will not only rewrite the repository's commit history but also clear all your statistics (since you will create a new repository). Only the commits will remain.

Detailed source for moving commits to another repository.

  • Rename your current repository on GitHub to something like currentname-outdated.

  • Create a new repository with the name currentname and clone it (don't forget to rename the local directory of your previously cloned repository as well).

  • cd to the new cloned repository directory and type the following:

    git remote add oldrepo https://github.com/path/to/oldrepo
    
  • Then update it:

    git remote update
    
  • Now cd to the old (outdated) repository and save your commit log to a file:

    git log --pretty="format:cp %h" > commits.txt
    
  • Then reverse the lines in this file and save the result as commits.sh. You can use command-line tools (note that the tail command didn't work as I expected: it put two commits on one line) or an online tool like this.

  • Now open commits.sh, cut the bad commit and all commits after it, and save them somewhere else. The commits.sh file will then contain only the commits before the bad commit.

  • Create a new alias for cherry-pick:

    alias cp='git cherry-pick '
    
  • cd to your new repository and execute commits.sh:

    # source the script so the cp alias defined in this shell applies to it
    . ../currentname-outdated/commits.sh
    
  • Push changes to repository:

    git push
    
  • Check your contributors graph here: {repo-url}/graphs/contributors. Check if everything's alright.

  • Then take your bad commit, cherry-pick it and push:

    cp bad_commit
    git push
    
  • Check contributor's graph again.

  • If everything's alright, replace the contents of commits.sh with the commits you saved earlier, leaving out the bad commit (you've already pushed it).

  • Execute commits.sh again:

    # source the script so the cp alias defined in this shell applies to it
    . ../currentname-outdated/commits.sh
    
  • Check your graph.

  • Now proceed with moving other items from your old repository (like issues, labels, tags and so on).
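As an alternative to reversing the log by hand and relying on an alias, the list can be generated already in oldest-first order with the full command on every line. This is a sketch under assumptions, not the method used above: it runs in a throwaway repository with three made-up commits, and it uses `git log --reverse` together with the `tformat:` prefix, which ends every line with a newline. An alias defined in an interactive shell is typically not visible to a script started with `sh`, so spelling out `git cherry-pick` avoids that dependency.

```shell
# Scratch repository with three commits standing in for the old history.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"
for n in 1 2 3; do
  echo "$n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "commit $n"
done

# --reverse lists the oldest commit first; tformat: terminates each line,
# so no two commits end up joined on one line.
git log --reverse --pretty='tformat:git cherry-pick %h' > commits.sh

oldest=$(git rev-list --max-parents=0 HEAD)
first_line=$(head -n 1 commits.sh)
line_count=$(wc -l < commits.sh | tr -d ' ')
cat commits.sh
```

The resulting file can then be split at the bad commit in the same way as described in the steps above.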

Hope it helps someone.

0

I checked, and it looks like an issue on GitHub's side. There is a workaround that I tried, and it worked. Go to the GitHub link you shared (for the time frame between 26 and 27 Jan 2019), then select the Additions option from the Contributions drop-down at the top right. The contribution graph will re-render. Now open the same drop-down again and select the Commits option. When you click on the 30 commits, you'll find all your commits.

karn
  • That's interesting that after doing that it's showing all the commits. I went to the latest commit and am going up the parent chain and just found a 1.4m line commit which is modifying a JSON data file. I think I might go back and minify all these JSON files to make them ±1 line so they don't bloat everything. – thecodewarrior Feb 07 '20 at 21:34
  • After a bit of digging I think I might be able to just instruct GitHub to ignore these files: https://github.com/github/linguist#vendored-code – thecodewarrior Feb 07 '20 at 21:39
  • Haven't heard of `vendor.yaml` until today, interesting. – karn Feb 08 '20 at 04:34