I made the mistake of accidentally pushing some personal data as mock data (JSON) for my application, in a commit whose message referenced an issue. I made a couple more commits before noticing the problem. I then rebased and squashed the commits into a single one without the mock data, so the original commits are now orphans and do not appear on any branch. However, GitHub still preserves the links to those commits in the issue. How can I forcefully remove at least one of the commits (the one containing the personal data), or should I just wait for garbage collection? Will that even resolve it?

This question is different from removing data from Git history. I have no problem with that. The problem is with GitHub caching of that data. The problem here is not with Git, it's with GitHub.

wswld
  • Can you remove the Issue, or at least the comments? That might be the only way. – Greg Burghardt Aug 25 '15 at 12:59
  • Possible duplicate of [Remove sensitive files and their commits from Git history](http://stackoverflow.com/questions/872565/remove-sensitive-files-and-their-commits-from-git-history) – AD7six Aug 25 '15 at 13:02
  • Even if you manage somehow to make that commit disappear it doesn't resolve anything. Think only about Google, Alexa, [Wayback Machine](http://archive.org/web/) or any other of the zillions of bots that crawl the web 24/7 and index everything. Most probably your data is already duplicated in several dozens of databases across the web. – axiac Aug 25 '15 at 13:02
  • @AD7six the problem here is not with removing stuff from Git history, the problem is with GitHub caching of that data. I had a hard time finding anything on this issue, maybe that's because of people like you closing every issue to tease their ego. – wswld Aug 25 '15 at 15:33
  • @wswld I don't think you've read the duplicate question. E.g. the GitHub link below (Remove sensitive data) is [in that question's accepted answer](http://stackoverflow.com/a/872700/761202). The purpose of closing a question as a duplicate is not ego; it's to have canonical answers to canonical problems. Duplicates act as signposts for future readers who find this first. – AD7six Aug 25 '15 at 15:37
  • @AD7six I understand why we need this mechanism; it's just that for a second it seemed to me that closing the question was a bit uncalled for. I was wrong, as I wasn't really attentive when reading the linked question the first time and got a bit defensive. Hopefully, no hard feelings. – wswld Aug 25 '15 at 15:50
  • Of course =). No malice intended in suggesting it was a duplicate - none taken in effectively asking why. – AD7six Aug 25 '15 at 15:53

2 Answers

Generally, the data should eventually disappear from all web views once cached views expire and the commits are garbage collected. However, this might never happen if the commits are referenced somewhere else (e.g. in issues or pull requests).

According to the relevant GitHub support page, however, you can contact the GitHub support team to manually expire the cached views and thus permanently remove the commits from your repository.

If, however, someone has already pulled the data (or forked the repository in the meantime), you can't do anything about it. The safe course is to consider the data compromised and take appropriate measures to contain the breach, e.g. deactivating or regenerating passwords and keys.

Holger Just

From: https://help.github.com/articles/remove-sensitive-data/

This article tells you how to make commits with sensitive data unreachable from any branches or tags in your GitHub repository. However, it's important to note that those commits may still be accessible in any clones or forks of your repository, directly via their SHA-1 hashes in cached views on GitHub, and through any pull requests that reference them. You can't do anything about existing clones or forks of your repository, but you can permanently remove all of your repository's cached views and pull requests on GitHub by contacting GitHub support.

Charlie