I have a problem similar to the one discussed in "Step through a file's history in git; similar to p4v timelapse", but sufficiently different.

I have a Git repository, with 6 branches, 300+ files and 600+ commits.

I also have a body of code that is effectively a branch of the same repository, but it exists only as raw, uncommitted files without a .git folder. In other words, I have a set of 300+ files with no history, no tags, and no commit hashes.

I want to re-integrate this informal branch as a formal branch.

I need to find which commit was copied (without its .git folder) and subsequently edited.

How can I do this efficiently, i.e. without manually checking out all 600+ commits, running diff/meld on each one, and counting the number of changed files?

Community

1 Answer

You basically want to find the commit that is most similar to a certain state of the working directory. Start by creating a local branch and committing those 300+ files, so that they become a commit. Then use git diff to find the existing commit that is most similar to it.
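One possible way to turn the loose files into a commit is sketched below. It is not necessarily how the step above was meant to be done: it stages the files into a temporary index and builds a parentless commit with git commit-tree, so the current branch, index, and working tree are left alone. The function name snapshot_commit, its arguments, and the commit message are made up for this illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of git): record a directory of loose files
# as a parentless commit on a new branch. Run it from inside the repository.
# usage: snapshot_commit <snapshot_dir> <new_branch>
snapshot_commit() {
  local snapshot=$1 branch=$2
  local git_dir tmp_dir tree commit
  git_dir=$(git rev-parse --absolute-git-dir) || return 1
  tmp_dir=$(mktemp -d) || return 1

  # Stage the loose files into a throwaway index; git creates the index file.
  GIT_INDEX_FILE="$tmp_dir/index" \
    git --git-dir="$git_dir" --work-tree="$snapshot" add -A || return 1
  # Turn the staged content into a tree object.
  tree=$(GIT_INDEX_FILE="$tmp_dir/index" git write-tree) || return 1
  rm -rf "$tmp_dir"

  # Wrap the tree in a commit without parents and point the new branch at it.
  commit=$(git commit-tree -m "Import uncommitted snapshot" "$tree") || return 1
  git branch "$branch" "$commit" || return 1
  echo "$commit"
}
```

For example, with placeholder paths, running "cd /path/to/repo && snapshot_commit /path/to/loose-files informal-snapshot" prints the hash of the new commit, which can then be used as the commit to compare against.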

The following script should do the trick. It lists all the commits in a given range and estimates how far each one is from the reference commit by counting the lines of git diff output between the two. Finally, it reports the commit with the smallest difference.

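#!/bin/bash

# Commit that holds the imported files, and the range of candidate commits
commit_to_compare_with=d67e
commit_range=1cb1d..e172

list_of_commits=($(git rev-list "$commit_range"))
num_of_commits=${#list_of_commits[@]}
most_similar_commit=
minimal_diff_count=

echo
echo "Found $num_of_commits commits in the range $commit_range"
echo

# Number of lines of "git diff" output between two commits
# (tr strips the padding that some wc implementations add)
count_lines_of_diff() { git diff "$1" "$2" | wc -l | tr -d ' '; }

for c in "${list_of_commits[@]}"
do
  diff_count=$(count_lines_of_diff "$commit_to_compare_with" "$c")
  echo "${c:0:4} differs from ${commit_to_compare_with:0:4} by $diff_count lines"
  # Keep the first commit seen, then any commit with a strictly smaller diff
  if [ -z "$most_similar_commit" ] || [ "$diff_count" -lt "$minimal_diff_count" ]
  then
    most_similar_commit=$c
    minimal_diff_count=$diff_count
  fi
done

echo
echo "Most similar commit to $commit_to_compare_with is $most_similar_commit"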

Here's the output I'm getting:

Found 5 commits in the range 1cb1d..e172

e172 differs from d67e by 45 lines
1431 differs from d67e by 26 lines
20e2 differs from d67e by 347 lines
fb80 differs from d67e by 347 lines
8d67 differs from d67e by 360 lines

Most similar commit to d67e is 14310bc0cf69967d4781e0aec2fd2cca21d72ac6
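Counting the lines of raw git diff output also counts headers and context lines. As a possible refinement, the sketch below sums only the added and deleted lines reported by git diff --numstat and prints all candidates sorted by that number, so near-ties are visible. The function names count_changed_lines and rank_commits are made up for this illustration.

```shell
#!/bin/bash
# Sum of added and deleted lines between two commits.
# Binary files are reported as "-" by --numstat and count as 0 here.
count_changed_lines() {
  git diff --numstat "$1" "$2" |
    awk '{ added += $1; deleted += $2 } END { print added + deleted + 0 }'
}

# Print "<changed-lines> <commit>" for every candidate, closest first.
# $1 is the reference commit; the remaining arguments go to git rev-list.
rank_commits() {
  local reference=$1 c
  shift
  git rev-list "$@" | while read -r c; do
    echo "$(count_changed_lines "$reference" "$c") $c"
  done | sort -n
}
```

With the hashes used above, "rank_commits d67e 1cb1d..e172 | head -n 3" would list the three closest commits in that range. The numbers will generally differ from the wc -l counts, since only changed lines are counted.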
Adi Levin