If I have an orphaned branch or commit, is it safe to assume that the contents of this commit or branch can't be accessed via Git or GitHub?

The reason I am asking is that I want to squash all my commits in one of my private repositories, which may have (throughout its history) had a secret key or similar pushed as a commit.

By squashing everything and pushing it as an orphaned branch, I am assuming all the other commits that may contain the secrets are entirely gone, and only the initial contents and the latest contents can be seen.

Is this assumption correct?

I hope I've written this question well enough.

Mathias Lykkegaard Lorenzen
  • Does this answer your question? [How to remove a dangling commit from GitHub?](https://stackoverflow.com/questions/4367977/how-to-remove-a-dangling-commit-from-github) – mkrieger1 Jun 05 '20 at 13:38
  • See also [Does GitHub garbage collect dangling commits referenced in pull requests?](https://stackoverflow.com/questions/15261880/does-github-garbage-collect-dangling-commits-referenced-in-pull-requests) – mkrieger1 Jun 05 '20 at 13:38
  • If the secret was *ever* committed to GitHub, assume someone was able to get it. Create a new key and don't commit *it* to the repository. – chepner Jun 05 '20 at 14:37
  • https://stackoverflow.com/a/32840254/7976758 – phd Jun 05 '20 at 15:08
  • @chepner I did indeed create a new key, but I guess I am worried that I may have missed something else among all those commits. – Mathias Lykkegaard Lorenzen Jun 05 '20 at 16:53

1 Answer

The purpose of git is to preserve information, not to lose it.

It is not correct to assume that something cannot be accessed just because it is orphaned. Orphaned or not, if this thing persists at all, it is because it is accessible. Accessibility and persistence are the very same thing in git. Usually a thing is persistent / accessible because it has a name, or is pointed to by something that has a name. A branch is a name.

Therefore, by definition, this thing is accessible. So your idea of pushing as an orphaned branch is a complete red herring. An orphaned branch is a branch.

To put it conversely, if this thing could not be accessed, it would die; that is, it would be garbage collected. Thus it would not be preserved. It would be inaccessible because it was gone. That does not sound like what you want.
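To make the point about orphan branches concrete, here is a minimal sketch. It assumes a throwaway repository created with mktemp; the file names, the branch name fresh, and the placeholder identity are invented for illustration. It creates an orphan branch with `git checkout --orphan`, deletes the original branch, and then shows that the old commit can still be read by its hash.

```shell
#!/bin/sh
# Sketch: an orphan branch hides old history from `git log`, but the old
# commits are still in the object database and readable by hash.
set -e
repo=$(mktemp -d)                          # throwaway repository (assumption)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"
first_branch=$(git symbolic-ref --short HEAD)

echo "secret-key" > key.txt
git add key.txt
git commit -q -m "add secret"
old=$(git rev-parse HEAD)                  # hash of the commit holding the secret

git rm -q key.txt
echo "clean" > README.txt
git add README.txt
git commit -q -m "remove secret"

# Start a new root commit with no parents, containing only the current files.
git checkout -q --orphan fresh
git commit -q -m "squashed"

# Remove the original branch so that no branch points at the old history.
git branch -q -D "$first_branch"

git rev-list --count fresh                 # number of commits on the orphan branch
git cat-file -t "$old"                     # type of the old object, if it still exists
git show "$old:key.txt"                    # contents of the file in the old commit
```

In this local sketch the old commit is kept alive by the reflog even though no branch names it. Whether and when a hosting service discards such commits is a separate matter that this example does not settle.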

Now then, let's focus on the other part of the question, which seems to be this. At some point, you committed a piece of information in a file. So let's imagine you did something like this:

A <- B <- C <- D <- [mybranch]
     ^         ^
     |         |
    add      delete

Here I am supposing that in commit B you added a piece of information to a certain file, and in commit D you realized that that was a bad idea and deleted that piece of information.

So the question now is: if B and C are the only commits in which that piece of information appears in that file, and if we now squash all those commits down to a single commit, is the information gone?

Yes, it is. The squash results in a single commit D´ that gets you from A to D:

A <- D´ <- [mybranch]

Since A doesn't contain the unwanted information, and D´ doesn't contain it either, it doesn't appear anywhere. If you doubt this, you can confirm it easily with git show, which will display the target file as it appears in A and as it appears in D´.

To demonstrate:

$ git init
$ echo good > A.txt
$ git add .
$ git commit -m"A"
$ echo bad >> A.txt
$ git add .
$ git commit -m"B"
$ echo indifferent >> A.txt
$ git add .
$ git commit -m"C"
$ cat A.txt
good
bad
indifferent
$ echo good > A.txt
$ echo indifferent >> A.txt
$ git add .
$ git commit -m"D"
$ cat A.txt
good
indifferent
$ git log
commit a61db7370bb38986e7cd6394a03491d185f0da08 (HEAD -> master)
commit ab271d3e51cfcac447399bfa2b5e5500c0fa0261
commit 496d8f9332372609856cc8f96565ef038658e81b
commit f3eccf8c081611ab13a9fc06cccb3c9b3b4e2275
$ git reset --soft f3eccf8
$ git commit -m"Dprime"
$ git show f3eccf8:A.txt
good
$ git show HEAD:A.txt
good
indifferent

As you can see, f3eccf8 and HEAD are now the only commits reachable from master, A.txt is the only file, and bad doesn't appear in that file in either commit. So it never appears in the history of the branch, which is the desired outcome.
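With more than two commits, inspecting each one with git show gets tedious. As a hedged alternative, the following sketch rebuilds the same A-to-D history in a temporary repository (the path and the placeholder identity are assumptions), squashes it, and then uses `git grep` over every commit listed by `git rev-list --all` to search for the unwanted string. Note that `git rev-list --all` covers only commits reachable from refs, not commits that survive solely in the reflog.

```shell
#!/bin/sh
# Sketch: rebuild the A..D history, squash it, then search every commit
# reachable from any ref for the unwanted string.
set -e
repo=$(mktemp -d)                          # throwaway repository (assumption)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"

echo good > A.txt;                    git add .; git commit -q -m "A"
first=$(git rev-parse HEAD)
echo bad >> A.txt;                    git add .; git commit -q -m "B"
echo indifferent >> A.txt;            git add .; git commit -q -m "C"
printf 'good\nindifferent\n' > A.txt; git add .; git commit -q -m "D"

# Squash B, C, and D into a single commit on top of A.
git reset -q --soft "$first"
git commit -q -m "Dprime"

# git grep exits non-zero when nothing matches, so capture that explicitly.
if git grep -n "bad" $(git rev-list --all) -- A.txt; then
    found=yes
else
    found=no
fi
echo "string found in reachable history: $found"
```

A search like this only speaks for the refs that exist in the repository where it is run; it says nothing about other clones or about a remote.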

Note, however, that this simple example has a straightforward history with no other commits and no other branches. I have no idea what other references you may have to any "bad" commit that contains the unwanted information. If there are any such references (another branch, a tag, or, in a local repository, the reflog), the unwanted information persists. The reason is that squashing B, C, and D down to a single commit D´ does not edit or destroy B, C, and D; it is quite possible for them to persist, still containing the unwanted information. If no other reference points to them, then yes, they will eventually be garbage collected; but I don't know whether that's the case.
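For a local repository, the persistence described above can be observed directly. The sketch below is an illustration under assumptions (a temporary repository, a placeholder identity, and a shortened A, B, D history): it checks that the squashed-away commit B still exists, then expires the reflog with `git reflog expire` and prunes with `git gc --prune=now`, and reports whether B is still present afterwards. It deliberately reports the result rather than promising one, and it applies only to the local clone, not to a remote such as GitHub.

```shell
#!/bin/sh
# Sketch: after a squash, the old commits stay in the local object database
# until the reflog entries that mention them expire and gc prunes them.
set -e
repo=$(mktemp -d)                          # throwaway repository (assumption)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # placeholder identity
git config user.name "Demo"

echo good > A.txt;                    git add .; git commit -q -m "A"
first=$(git rev-parse HEAD)
echo bad >> A.txt;                    git add .; git commit -q -m "B"
bad=$(git rev-parse HEAD)                  # the commit we want to disappear
printf 'good\nindifferent\n' > A.txt; git add .; git commit -q -m "D"

# Squash B and D into a single commit on top of A.
git reset -q --soft "$first"
git commit -q -m "Dprime"

# B is no longer on the branch; check whether the object still exists.
if git cat-file -e "$bad" 2>/dev/null; then before=present; else before=missing; fi
entries=$(git reflog | wc -l | tr -d ' ')
echo "before cleanup: B is $before, reflog has $entries entries"

# Drop every reflog entry, then prune unreachable objects immediately.
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$bad" 2>/dev/null; then after=present; else after=missing; fi
echo "after cleanup: B is $after"
```

Expiring the reflog throws away the local safety net for recovering lost commits, so this is something to do deliberately and only once the squashed history has been verified.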

matt