
I have a repository with a bunch of C files. Given the SHA hashes of two commits,

<commit-sha-1> and <commit-sha-2>,

I'd like to write a script (probably bash/ruby/python) that detects which functions in the C files in the repository have changed across these two commits.

I'm currently looking at the documentation for git log, git commit and git diff. If anyone has done something similar before, could you give me some pointers on where to start or how to proceed?

UnchartedWaters
    Might be worth checking what output it gives for your changes, but the `-W`/`--function-context` flag to `git diff` might be a good starting point. – Chris Aug 06 '17 at 19:17

2 Answers


This isn't pretty, but you could combine git with your favorite tagging system, such as GNU global, to get the list of functions and then ask git about each one. For example:

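#!/usr/bin/env sh

# `global -f` prints cxref-style lines (name, line number, file, source line),
# so the first column is the tag name.
global -f main.c | awk '{print $1}' | while IFS= read -r i
do
    # git log -L prints nothing when no commit in the range touched the function.
    if [ $(git log -L:"$i":main.c HEAD^..HEAD | wc -l) -gt 0 ]
    then
        printf "%s() changed\n" "$i"
    else
        printf "%s() did not change\n" "$i"
    fi
done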

First, you need to create a database of functions in your project:

$ gtags .

Then run the above script to find the functions in main.c that were modified in the last commit. The script could of course be more flexible; for example, it could handle all *.c files changed between two commits, as reported by git diff --stat.

Inside the script we use the -L option of git log:

  -L <start>,<end>:<file>, -L :<funcname>:<file>

       Trace the evolution of the line range given by
       "<start>,<end>" (or the function name regex <funcname>)
       within the <file>. You may not give any pathspec
       limiters. This is currently limited to a walk starting from
       a single revision, i.e., you may only give zero or one
       positive revision arguments. You can specify this option
       more than once.
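
As a rough sketch of the more flexible variant mentioned above, the loop can be wrapped so that it visits every .c file that differs between the two commits from the question. The function names `tag_names` and `changed_functions` are made up for illustration, and the sketch assumes that `gtags` was run on a checkout of the newer commit and that the first column of `global -f` output is the tag name.

```sh
#!/usr/bin/env sh
# Sketch only: report functions touched between two commits in each changed .c file.
# Assumes the GNU global database matches the newer commit (hypothetical setup).

# Read cxref-style lines (name, line, file, source) on stdin; print unique names.
tag_names() {
    awk '{print $1}' | sort -u
}

# Print "file: func()" for every function whose line range changed in old..new.
changed_functions() {
    old=$1 new=$2
    git diff --name-only "$old" "$new" -- '*.c' | while IFS= read -r file
    do
        global -f "$file" | tag_names | while IFS= read -r fn
        do
            # Empty output means no commit in the range touched this function.
            if [ -n "$(git log -L:"$fn":"$file" "$old..$new" 2>/dev/null)" ]
            then
                printf '%s: %s()\n' "$file" "$fn"
            fi
        done
    done
}

# Run only when two commits are given:
#   sh changed.sh <commit-sha-1> <commit-sha-2>
if [ "$#" -eq 2 ]
then
    changed_functions "$1" "$2"
fi
```

Because -L treats the function name as a regex, a short name such as init may match a longer name that appears earlier in the file, so the output is best treated as a starting point rather than a definitive list.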
Arkadiusz Drabczyk

See this question.

Bash script:

#!/usr/bin/env bash

git diff | \
grep -E '^(@@)' | \
grep '(' | \
sed 's/@@.*@@//' | \
sed 's/(.*//' | \
sed 's/\*//g' | \
awk '{print $NF}' | \
sort -u

Explanation:

1: Get diff

2: Keep only the hunk-header lines; when a hunk header carries the 'optional section heading', that heading is normally the definition line of the function the hunk falls in

3: Pick only hunk headers containing open parentheses, as they will contain function definitions

4: Get rid of '@@ [old-file-range] [new-file-range] @@' sections in the lines

5: Get rid of everything from the opening parenthesis onward

6: Get rid of '*' from pointers

7: Use awk to print the last field (i.e. column) of each record (i.e. line).

8: Get rid of duplicate names.
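
The steps above can also be sketched as a filter that reads any unified diff on stdin, which makes it straightforward to point at the two commits from the question. The name `hunk_functions` is made up for illustration, and the sketch relies on git's default hunk-header heuristic for choosing the section heading.

```bash
#!/usr/bin/env bash
# Sketch only: list function names found in the hunk headers of a diff on stdin.

hunk_functions() {
    grep -E '^@@' |              # keep hunk headers
    grep -F '(' |                # keep headings that contain a parenthesis
    sed -E 's/^@@[^@]*@@ ?//' |  # drop the "@@ -a,b +c,d @@" prefix
    sed 's/(.*//' |              # drop the argument list
    tr -d '*' |                  # drop pointer asterisks
    awk 'NF {print $NF}' |       # the last word is the function name
    sort -u                      # de-duplicate, adjacent or not
}

# Run only when two commits are given:
#   bash hunks.sh <commit-sha-1> <commit-sha-2>
if [ "$#" -eq 2 ]; then
    git diff "$1" "$2" -- '*.c' | hunk_functions
fi
```

One caveat: the section heading is the last function-like line before the start of the hunk, so a change within the first few lines of a function can be reported under the preceding function, because the hunk's leading context lines begin above the definition.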

UnchartedWaters