
So we have created a template project "template_proj.git".

Update: the Git version is 2.14.1 on Windows 7 Professional.

We have new projects that are empty except for one commit containing a .gitignore file. Let's say one of these projects is called "projectA.git".

So my method is:

  1. Clone template_proj.git into a folder called "Project_A". For this I use: git clone template_proj.git --depth=1 --recursive
  2. Remove the remote: git remote rm origin
  3. Add the new remote: git remote add origin projectA.git
  4. Forcefully merge the projects: git pull origin master --allow-unrelated-histories
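The four steps above can be sketched end to end as a script. This is only an illustration under stated assumptions: two throwaway local repositories (created in a temporary directory and reached through `file://` URLs) stand in for the real template_proj.git and projectA.git, and the `demo` identity, file names, and commit messages are made up. `--no-edit` and `--no-rebase` are added only so that the pull runs unattended on newer Git versions.

```shell
#!/bin/sh
# Sketch only: local throwaway repositories stand in for the real
# template_proj.git and projectA.git remotes.
set -eu

# Hypothetical identity so commits work without a configured Git user.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Stand-in template project with three commits of history.
git init -q template_proj
git -C template_proj symbolic-ref HEAD refs/heads/master
for n in 1 2 3; do
    echo "template v$n" > template_proj/README.txt
    git -C template_proj add README.txt
    git -C template_proj commit -q -m "template $n"
done

# Stand-in "empty" project: one commit that only adds .gitignore.
git init -q projectA
git -C projectA symbolic-ref HEAD refs/heads/master
echo "*.o" > projectA/.gitignore
git -C projectA add .gitignore
git -C projectA commit -q -m "add .gitignore"

# 1. Shallow clone of the template (file:// is needed for --depth to
#    take effect on a local repository).
git clone -q --depth=1 --recursive "file://$work/template_proj" Project_A
cd Project_A

# 2. Remove the remote that points at the template.
git remote rm origin

# 3. Point origin at the new project instead.
git remote add origin "file://$work/projectA"

# 4. Merge the two unrelated histories.
git pull -q --no-edit --no-rebase --allow-unrelated-histories origin master

commits=$(git rev-list --count HEAD)
echo "commits in Project_A: $commits"
git log --oneline --decorate
```

The final `git log` is where the "grafted" decoration shows up, on the commit that came from the shallow clone.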

This works well. Note: The main reason that I don't just delete my .git folder from the template clone is that it has submodules.

This gives me a repo with 3 commits (which are exactly what I want):

  • the tip of template_proj.git
  • the tip (and only commit) of projectA.git
  • the merge commit that combines the two.

However, there is a special tag/branch "grafted" associated with the tip commit of template_proj.git. I don't really want that.

So my questions:

  • Is this an efficient way to do this operation (i.e. is there a better way)?
  • How do I get rid of the grafted tag?
  • What is the grafted tag?

I have not been able to fully understand what "grafted" really means. I did search for it and found some information, but I am still not sure. As a keyword in a Git search it got overruled by more common items (or my google-fu is weak) :(

Update: The question "What exactly is a "grafted" commit in a shallow clone?" does not quite answer this either, because it does not really say why grafted is there or what to do about it (if anything). Also, I don't have a .git/info/grafts file in my repo.

Papershine
code_fodder

2 Answers


It's not a tag, and you cannot remove it without making the repository non-shallow (e.g., git fetch --unshallow). It's the marker that indicates that this is the point at which the history cuts off.

You can, however, move the marker, by deepening the history. Since the marker exists at each commit at which history is cut off, if the history cuts off below the point you care about, you will not see the marker. For instance, using a depth of 2 will put the marker below the commit you get.
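As a rough illustration of moving and removing the marker, the sketch below builds a throwaway three-commit repository (the names `upstream` and `shallow` and the `demo` identity are made up for the example), clones it with a depth of 1, deepens it to 2, and finally un-shallows it. It reads `.git/shallow` directly, which is where the cut-off commits are listed.

```shell
#!/bin/sh
# Sketch only: a local throwaway repository stands in for the real remote.
set -eu

# Hypothetical identity so commits work without a configured Git user.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Upstream with three commits on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
for n in 1 2 3; do
    echo "v$n" > upstream/file.txt
    git -C upstream add file.txt
    git -C upstream commit -q -m "commit $n"
done

# Depth 1: the marker sits on the tip commit itself.
git clone -q --depth=1 "file://$work/upstream" shallow
cd shallow
echo "cut-off commits recorded in .git/shallow:"
cat .git/shallow
depth1=$(git rev-list --count HEAD)

# Re-fetch with a larger depth: the marker moves one commit back.
git fetch -q --depth=2 origin
depth2=$(git rev-list --count HEAD)

# Fetch the full history: no cut-off is left, so no marker is left.
git fetch -q --unshallow origin
full=$(git rev-list --count HEAD)

if [ -s .git/shallow ]; then still_shallow=yes; else still_shallow=no; fi
echo "visible commits: depth1=$depth1 depth2=$depth2 full=$full"
echo "still shallow: $still_shallow"
```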

Background explanation

Note that computer scientists like to draw their trees upside down: instead of leaves at the top, branches towards the ground, and then a trunk sticking into the ground, computer science theory people start with the trunk:

    |

and then add branches below it:

    |
   / \

and put the leaves on the bottom.

For StackOverflow purposes, I like to draw my trees with the root / trunk at the left and the leaves at the right:

        o--o
       /
o--o--o
       \
        o--o

Here we have a simple tree with two branches. Let's label the branches:

        o--o   <-- master
       /
o--o--o
       \
        o--o   <-- develop

Congratulations, you now understand Git's branches! cough OK, maybe not quite yet. :-) There's a lot more to it, including that commits form a graph rather than a simple tree, but this is all we need at this point: we have what we need to say what a shallow clone is. Let's draw a shallow clone of depth 2, made from this same repository:

        o--o   <-- master
       X

       X
        o--o   <-- develop

Here, we still have our same two branch names, master and develop, which still point to two different commits. Each of those commits points to a second (earlier) commit. Each of those two would point to the (shared) third-back commit, but we have hit our depth limit, so each has a marker on it: an X crossing out the links going back to the earlier commits.

It's this marker that you see when you run git log or anything that shows you the commits. Git needs to know that it should not try to look for more commits—the commits it does have say "my previous commit is ..." but the previous commit is missing. Without the marker, Git would tell you that your repository is broken.
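The two-branch picture can be reproduced with a small script. This is a sketch under assumptions: the `upstream` repository, the `add_commit` helper, the file name, and the `demo` identity are all invented for the example. It clones both branches at depth 2 and counts the entries in `.git/shallow`, which should correspond to the two X marks in the drawing.

```shell
#!/bin/sh
# Sketch only: a local throwaway repository stands in for the real remote.
set -eu

# Hypothetical identity so commits work without a configured Git user.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master

# Hypothetical helper: append a line to a file and commit it.
add_commit() {
    echo "$1" >> upstream/log.txt
    git -C upstream add log.txt
    git -C upstream commit -q -m "$1"
}

# Three shared commits, then two more on each branch.
add_commit base1
add_commit base2
add_commit base3
git -C upstream checkout -q -b develop
add_commit dev1
add_commit dev2
git -C upstream checkout -q master
add_commit master1
add_commit master2

# Depth 2 on every branch (--depth alone would fetch only one branch).
git clone -q --depth=2 --no-single-branch "file://$work/upstream" shallow
cd shallow

markers=$(grep -c . .git/shallow)
master_commits=$(git rev-list --count origin/master)
develop_commits=$(git rev-list --count origin/develop)
echo "cut-off markers: $markers"
echo "commits visible: master=$master_commits develop=$develop_commits"
```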

If we set the --depth to 3, the marker is even further back:

        o--o   <-- master
       /
    X-o
       \
        o--o   <-- develop

but if the --depth is set to 1, the marker is right at each tip commit, and you will always see it.

torek
  • Nice explanation : ) ... I think I knew about the branches and shallow copy, but just had no idea really about the grafting. Does not quite answer how to get rid of the grafted "marker" - but just about to read your other comment below... – code_fodder Apr 09 '18 at 16:45
  • That's because you're *not supposed to remove* the markers. Git needs them to know that the repository is shallow, not damaged, at each cut-off commit point. You'll find the file `.git/shallow` with a list of commit hashes that act as roots. (Or, maybe we should say that this makes Git know that the damage is OK, that it's the controlled damage of shallow-ness...) – torek Apr 09 '18 at 16:47
  • Updated my "answer" below. Ok so then that is part of my question also - `is this the most efficient way to do this?` should probably read `correct way`. Anyway, I want this to happen since I don't want the project I cloned to have any history at all... other than its last commit so that I can make a completely new project out of it in a new repo.. – code_fodder Apr 09 '18 at 16:53
  • OK, if you're making a new project that has no shared history, using filter-branch to "cement" the "graft" (which is really just "pretend everything earlier does not exist") is fine: it changes things so that the earlier commits really *don't* exist. :-) – torek Apr 09 '18 at 16:55
  • sweet : )) ... thanks for taking the time. I'll mark your answer up for that reason, but I want to leave mine there for reference (mostly my own!) – code_fodder Apr 09 '18 at 16:57

After looking around I finally found what I needed - I got to it after following a long chain of question --> answer --> link-to-question --> answer --> 12th-comment. Anyway here are some options:

  • git fetch --unshallow - This un-shallows your clone and basically gets back the full history. Not what I want, but I could have used this to undo the --depth=1 clone.
  • git filter-branch -f -- --all - This seems to trim off the grafted bits. Note: without the -f option it does the job fine, but it leaves the old commits kicking about, so you end up with 2 trees (for lack of a better word): one that starts from the grafted point and another that is all new. This is messy to a casual onlooker, so use the force option to trim all that away.
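To see what the filter-branch option does to a depth-1 clone, here is a small sketch. The `upstream` and `fresh` names and the `demo` identity are assumptions for the example, and `FILTER_BRANCH_SQUELCH_WARNING=1` is set only to silence the notice that newer Git versions print (as far as I know, older versions simply ignore it). The script compares the raw commit object before and after: the original still records a parent that is missing from the clone, while the copy records none.

```shell
#!/bin/sh
# Sketch only: a local throwaway repository stands in for the real remote.
set -eu

# Hypothetical identity so commits work without a configured Git user.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# Upstream with three commits on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
for n in 1 2 3; do
    echo "v$n" > upstream/file.txt
    git -C upstream add file.txt
    git -C upstream commit -q -m "commit $n"
done

git clone -q --depth=1 "file://$work/upstream" fresh
cd fresh
git remote rm origin

# The shallow tip still names a parent that is not in this clone.
old=$(git rev-parse HEAD)
old_parents=$(git cat-file -p "$old" | grep -c '^parent ' || true)

# Copy every commit; the copies stop at the shallow cut-off.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -- --all >/dev/null

# The rewritten tip is a true root commit with a new hash.
new=$(git rev-parse HEAD)
new_parents=$(git cat-file -p "$new" | grep -c '^parent ' || true)

echo "before: $old (parent lines: $old_parents)"
echo "after:  $new (parent lines: $new_parents)"
```

Because the hash changes, anything that referred to the old commit (including the backup under `refs/original/`) still points at the pre-rewrite history.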

Source of my info: how-do-i-remove-the-old-history-from-a-git-repository - 9th comment highlights the -f option (you have to expand the comments).

So this is all to do with Git grafting. I did not get a .git/info/grafts file, but I did create one manually with echo <SOME-COMMIT-SHA> > .git/info/grafts. When I did this I got a second grafted label on the commit SHA I selected. So I guess you can use that, along with git filter-branch, to pick a point at which to chop off the history.

I really need to read up more on grafting, but it's not a feature I am interested in at the moment, other than in this case to get rid of it :o

Update:

I had to run:

git filter-branch -- --all

then

git filter-branch -f -- --all

As a two step process... not quite sure how/why. The first one splits them and the second one removes unreachable commits or something?

Update from torek's comments

Now I do the following:

git clone <url> --recursive --depth=1
cd <folder>
git remote rm origin
git remote add origin <new url>
git filter-branch -- --all
rm -rf .git/refs/original/*
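As an alternative to deleting files under `.git/refs/original/` by hand, the backup refs can be removed through Git's own ref commands, which also covers refs that have been packed. The sketch below is an illustration only: the `repo` directory and `demo` identity are made up, and a backup ref is created directly with `git update-ref` as a stand-in for the one filter-branch would leave behind.

```shell
#!/bin/sh
# Sketch only: a throwaway repository with a stand-in backup ref.
set -eu

# Hypothetical identity so commits work without a configured Git user.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo "hello" > file.txt
git add file.txt
git commit -q -m "initial"

# Stand-in for the backup that filter-branch writes under refs/original/.
git update-ref refs/original/refs/heads/master HEAD
before=$(git for-each-ref refs/original | grep -c . || true)

# List every backup ref and delete each one via update-ref.
git for-each-ref --format='delete %(refname)' refs/original |
    git update-ref --stdin

after=$(git for-each-ref refs/original | grep -c . || true)
echo "backup refs before: $before, after: $after"
```

Deleting the refs only makes the old commits unreachable; they are not physically discarded until the reflog entries expire and a garbage collection runs.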

Now I can operate normally to get the empty proj/merge and then push my changes up:

git pull origin master --allow-unrelated-histories
# commit anything here if needed...
git push origin master
code_fodder
  • Using `git filter-branch` tells Git to *copy* the commits. The copies obey the shallow cutoff, so that they are truly the *only* commit after copying. The `-f` flag is not required if you do it in a fresh repository (it's required to *re-run* filter-branch if you have not cleaned up after an earlier one), but you should just clean up the `refs/original/` references instead. Note: copying commits like this results in new, different commits with new and different commit hash IDs. This may eventually cause pain; be sure you know what you are getting into. – torek Apr 09 '18 at 16:41
  • @torek Ah ok cool - that helps me understand... with this in mind I will add an improved "answer" if you could check it?.... 2 secs... – code_fodder Apr 09 '18 at 16:48
  • The `rm -rf` works; someday you might need to use `git for-each-ref`, but you can leave it this way until then (Git 3.x?). You might also emphasize that you're deleting the original `origin` (`git remote rm origin`), which is why the filter-branch is a sensible thing to do there. – torek Apr 09 '18 at 16:57
  • @torek lol... this makes me laugh sometimes... I am a C++ guy but sometimes I think there is more to learn on Git :o ... I'll look into the `git for-each-ref` and stick an update for that in if I find the right incantation : ) – code_fodder Apr 09 '18 at 16:58
  • Git is in some ways like C++: there's a new standard every three years... – torek Apr 09 '18 at 16:59