
I am trying to move files from one local git repository to another local git repository for a different project while preserving history from the original repository. So far I have this, which is working fine if the file was never moved or renamed in the source repo:

# Executed from a directory in the target repository
( cd "$SOURCE_REPOSITORY_DIRECTORY" && git format-patch -B -M --stdout --root "$SOURCE_FILENAME" ) | git am --committer-date-is-author-date

This happens to work because the directory structures of the two repositories are the same. If they were different, I'd have to create patch files and fix up the directory names using sed or something.
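For the case where the layouts differ, one possible alternative to sed is to let git am rewrite the paths itself: it forwards -p<n> and --directory=<dir> to git apply. Below is a minimal sketch in a throwaway pair of repositories. The names old_dir, new_dir and sub/file.txt are made up for illustration, and this only covers a uniform prefix change; it does nothing about renames inside the history.

```shell
#!/bin/sh
# Sketch: replay one file's history into a repo with a different top-level directory.
# All paths (old_dir, new_dir, sub/file.txt) are hypothetical.
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/src"
git init -q "$work/dst"

# Source repo: two commits touching old_dir/sub/file.txt
cd "$work/src"
mkdir -p old_dir/sub
echo one > old_dir/sub/file.txt
git add old_dir
git commit -q -m "add file"
echo two >> old_dir/sub/file.txt
git commit -q -am "edit file"

# Target repo: start from an empty commit so git am has a branch to build on
cd "$work/dst"
git commit -q --allow-empty -m "initial commit"

# -p2 strips "a/old_dir/" (two leading components) from each patch path,
# --directory=new_dir then prepends the new location.
( cd "$work/src" && git format-patch --stdout --root HEAD -- old_dir/sub/file.txt ) |
  git am -q -p2 --directory=new_dir --committer-date-is-author-date

imported=$(git log --oneline -- new_dir/sub/file.txt | wc -l | tr -d ' ')
echo "commits imported: $imported"
```

If the target layout only adds a prefix, --directory alone should be enough; -p<n> is only needed when leading components have to be dropped first.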

Anyway, this is all swell until I hit a file that has been renamed. Even though I'm specifying the -B -M (and get the same results with -B -M -C --find-copies-harder) I do not get the patches from before the move, even though the file was cleanly moved (similarity index 100%).

This is particularly odd since git log --follow shows all the commits and git log --follow -p provides all the diffs. Except it provides them in reverse order so I cannot feed them into git am.

Note also that git log --follow -p filename puts out the following "patch" to show the rename:

diff --git a/old_dir_name/dir1/dir2/filename b/new_dir_name/dir0/dir1/dir2/filename
similarity index 100%
rename from old_dir_name/dir1/dir2/filename
rename to new_dir_name/dir0/dir1/dir2/filename

Now if git log would display the patches in the right format and right order for git am to apply them, I could just use that, but such is not the case. Using git log --reverse --follow -p filename only outputs the name change patch, nothing else.

So, how do I get git format-patch to really follow renames the way the help file/man page says it should while at the same time only outputting patches for a single file? Alternatively, how do I get git log -p to produce patches in a way I can feed them into git am to recreate a file with history?

I'm using git version 1.8.4.3.

Old Pro
  • (Cc @OldPro) This question is fairly old now, as it dates back to 2013 :) but I think [my answer below](https://stackoverflow.com/a/48062878/9164010) fully addresses your question, so could you please mark it as accepted if you agree? This would be very useful, as @jjd recently pointed out in [this other question](https://stackoverflow.com/questions/64651143/keep-commits-only-for-subset-of-files-in-git?noredirect=1#comment114421910_64651143) (now tagged as a duplicate…) – ErikMD Nov 06 '20 at 13:14
  • @ErikMD I upvoted your answer but I do not want to accept a script/plugin as an answer. Even if I did, I have not tried your script and am no longer in a position to validate it, so I do not want to vouch for it working. In addition, I feel a proper answer should be available as a single standard `git` command with appropriate flags. So I am not going to mark your valuable contribution as the accepted answer. – Old Pro Nov 11 '20 at 22:11
  • OK I fully understand, thanks @OldPro for your reply! – ErikMD Nov 11 '20 at 22:15

3 Answers


I was recently faced with the same use case as this question and I implemented a solution using Bash, so I wanted to share it as this code could be useful for other people.

It consists of a script git-format-patch-follow available on https://github.com/erikmd/git-scripts, which can be used as follows for the OP's question:

( cd "$SOURCE_REPOSITORY_DIRECTORY" && git format-patch-follow -B -M --stdout --root --follow -- "$SOURCE_FILENAME" ) | git am --committer-date-is-author-date

More generally, the syntax is:

git format-patch-follow <options/revisions...> --follow -- <paths...>

This Bash script can thus be viewed as an automated way to run the algorithm outlined by @OldPro. I took special care to cope with corner cases, such as filenames with whitespace, multiple files passed on the CLI, or running the script from a subdirectory of the Git source repo.

Finally, as pointed out by this blog post, it suffices to put such a script in one's PATH for Git to expose it as a subcommand, git format-patch-follow.
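As a small illustration of that mechanism (not of the script itself): any executable named git-<name> found on PATH can be invoked as git <name>. The git-hello script below is a made-up placeholder.

```shell
#!/bin/sh
# Sketch: Git dispatches "git hello" to an executable called git-hello on PATH.
# git-hello is a hypothetical stand-in for a real helper script.
set -eu

bindir=$(mktemp -d)
cat > "$bindir/git-hello" <<'EOF'
#!/bin/sh
echo "hello from a custom git subcommand"
EOF
chmod +x "$bindir/git-hello"

# Put the directory in front of PATH for this shell session only
PATH="$bindir:$PATH"
export PATH

output=$(git hello)
echo "$output"
```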

Disclaimer: git format-patch, and thereby git-format-patch-follow, can't be applied to a non-linear history (involving merge commits).

ErikMD

I've made some progress, but it's much more manual now.

  • For each file, use log with and without --follow to see which files have been renamed/moved/copied (calling them all "renamed" for simplicity).
  • For files that have been renamed, extract the previous complete path and filename(s) from the log output.
  • Then use format-patch but give all the old names as well as the current name on the command line.

So now I have something like this:

 git format-patch -B -M -o /tmp/patches --root -- old_dir_name/dir1/dir2/filename new_dir_name/dir0/dir1/dir2/filename

which creates the patches to create the old file, rename it to the new name, and then continue patching the file. Of course, the remaining problem for me is that the old directory doesn't exist in the new repo and the directory level has changed, so there is still some mucking about to do to get the directory names to work.
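The manual steps in this answer (find the old names with log --follow, then pass every name to format-patch) can be roughly scripted. This is only a sketch under assumptions: the repository, old_dir/file.txt and new_dir/file.txt are invented for the demo, the history is linear, and paths containing newlines are not handled.

```shell
#!/bin/sh
# Sketch: gather every name a file has had, then hand them all to format-patch.
# Repo layout and file names below are hypothetical.
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/src"
cd "$work/src"

# Demo history: create, edit, rename, edit again
mkdir old_dir
echo one > old_dir/file.txt
git add old_dir
git commit -q -m "add file"
echo two >> old_dir/file.txt
git commit -q -am "edit before rename"
mkdir new_dir
git mv old_dir/file.txt new_dir/file.txt
git commit -q -m "rename file"
echo three >> new_dir/file.txt
git commit -q -am "edit after rename"

current=new_dir/file.txt

# One path per line: --name-only lists the path touched by each commit,
# --format= suppresses the commit header, sed drops the blank separators.
names=$(git log --follow --name-only --format= -- "$current" | sed '/^$/d' | sort -u)

# Split on newlines only, with globbing off, into the positional parameters.
set -f
IFS='
'
set -- $names
unset IFS
set +f

# Pass the old and the current names together so the rename patch is emitted.
git format-patch -q -B -M -o "$work/patches" --root HEAD -- "$@"
patch_count=$(ls "$work/patches" | wc -l | tr -d ' ')
echo "paths: $*"
echo "patches written: $patch_count"
```

The directory-name mismatch between the two repositories still has to be dealt with separately when applying the patches.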

This should be easier....

Old Pro
  • This is a very good approach. I usually run the command `grep "rename from" /tmp/patches/*` to verify that I haven't overlooked any renames when format-patching a large number of files. If the grep outputs files that aren't in my command, I know I need to add them and re-run format-patch. – Paul Mar 20 '17 at 08:56

Yuck. I think the problem is some broken logic. In particular, when you combine --reverse and --follow you must specify the old file name:

[rename foo to bar]
$ git log --follow bar  # works
$ git log --follow --reverse -- foo # requires "--" because foo is not in HEAD

This ... sort of works. Except, it then treats the file as deleted when it hits the rename, and everything stops there.
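To check how a given Git version behaves here, a throwaway repository along these lines can be used. It is only a reproduction sketch: foo and bar are the placeholder names from above, and the output of the --reverse variants is deliberately not predicted, since it depends on the Git version.

```shell
#!/bin/sh
# Sketch: reproduce the foo -> bar rename and compare the log variants.
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

echo one > foo
git add foo
git commit -q -m "add foo"
echo two >> foo
git commit -q -am "edit foo"
git mv foo bar
git commit -q -m "rename foo to bar"

# Newest-first walk starting from the current name: follows the rename.
forward=$(git log --follow --oneline -- bar | wc -l | tr -d ' ')
echo "log --follow bar: $forward commits"

# Oldest-first walks: print whatever this Git version reports, for comparison.
echo "--- log --follow --reverse -- foo"
git log --follow --reverse --oneline -- foo || true
echo "--- log --follow --reverse -- bar"
git log --follow --reverse --oneline -- bar || true
```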

tree-diff.c contains this function:

static void try_to_follow_renames(...)
{
        ...
        /* Remove the file creation entry from the diff queue, and remember it */
        choice = q->queue[0];
        q->nr = 0;

which is called if diff_might_be_rename returns true:

static inline int diff_might_be_rename(void)
{
        return diff_queued_diff.nr == 1 &&
                !DIFF_FILE_VALID(diff_queued_diff.queue[0]->one);
}

...
int diff_tree_sha1(...)
{
        ...
        if (!*base && DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename()) {

I'm making some large assumptions here, but when you go in the other order, instead of "file bar was just created, let's see if we can find a foo from which it was renamed", if the log is reversed you need to have "file foo deleted, let's see if we can find a bar to which it was renamed", and that's just ... missing, if the comment is accurate.

If you have a lot of these to do, I'd suggest attempting to add something here to remember if the diff is reversed (as it is for both format-patch and log --reverse) and change the diff_might_be_rename() and try_to_follow_renames() code as needed.

If you just have one, well, manually hacking up some diffs is probably easier. :-)

torek
  • Yuck is right. I have a hard enough time using git, no way am I going to try to patch the source. Thank you, though, for digging into it, and I hope you will inspire someone to clean this up. – Old Pro Nov 13 '13 at 21:53
  • (Side)Note: `git log --follow` improves a bit with git 2.9: http://stackoverflow.com/a/36615639/6309 – VonC Apr 14 '16 at 06:46
  • @VonC: It looks like that's mainly fixing some breakage introduced in git 2.0 (when both parent dir and final file name changed). – torek Apr 14 '16 at 07:02
  • @torek yes, and considering git 2.0 is from May 2014, that might involve quite a large number of users. – VonC Apr 14 '16 at 07:04