10

How do I, using git, list all files in a particular directory together with the owner/identity of those files at first commit?

Is getting slices of information like this across many files usually difficult?

Edit: Okay, git doesn't provide a direct way to do this, but it does store who commits various files, right? I need this list in a particular directory so I can get a sense of which files I'm 'responsible' for.

waldyrious
Rex Butler

5 Answers

16

Give this a try:

$ cd thatdirectory
$ git ls-files |
  while IFS= read -r fname; do
    # Reverse the full history first, then keep only the oldest entry.
    echo "$(git log --reverse --format='%cn' -- "$fname" | head -n 1) first added $fname"
  done

The "first added" can be misleading in case of renames.
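Since the goal in the question is to find the files you are "responsible" for, the loop above can be narrowed to a single name. This is only a minimal sketch: the function name my_first_commits is made up for illustration, it compares committer names (%cn) exactly as printed by git, and it inherits the rename caveat mentioned above.

```shell
# Hypothetical helper: print tracked files under the current directory
# whose oldest commit was made by the given committer name.
# Defaults to the name configured in user.name.
my_first_commits() {
  me=${1:-$(git config user.name)}
  git ls-files |
  while IFS= read -r fname; do
    # Oldest commit touching the path: reverse first, truncate afterwards.
    first=$(git log --reverse --format='%cn' -- "$fname" | head -n 1)
    if [ "$first" = "$me" ]; then
      printf '%s\n' "$fname"
    fi
  done
}
```

Run it as my_first_commits "Your Name" from inside the directory of interest. Paths with unusual characters may come out quoted by git ls-files, so treat the output as a rough overview.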


tutuDajuju
holygeek
  • This will only list files in the current tree. If you need to see paths that existed in prior versions or different branches, look at my answer. – sehe Sep 28 '11 at 08:27
  • 1
    I'm using git 2.20.1 and this solution lists the last modifier not the first. This is because the -1 is applied before the --reverse as specified in the man page for git log: "Note that these are applied before commit ordering and formatting options, such as --reverse." – David Snape Feb 21 '19 at 14:38
  • 1
    I've updated the answer based on @david-snape's comment – holygeek Feb 22 '19 at 10:31
6

I ran into a similar situation. The accepted answer works, but why not use find on the working copy?

find . -type f ! -path './.git/*' -exec sh -c \
  'printf "%s %s\n" "$1" "$(git log --reverse --format=%cn -- "$1" | head -n 1)"' _ {} \;
waldyrious
pinxue
  • This is nice and short. The fact that it treats the filesystem, rather than the repository, as the source of truth could be a disadvantage – sehe Dec 23 '16 at 18:31
  • This answer does not work as expected because the list is limited to one commit and *then* reversed, so you will actually get the most recent commit author with this command. See here for more details: https://stackoverflow.com/a/39997772/1883900 – cscanlin Nov 20 '22 at 23:12
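The ordering issue described in these comments is easy to reproduce. Here is a small sketch with two made-up helper names, newest_committer and oldest_committer, that differ only in where the truncation happens:

```shell
# Commit limiting (-1) is applied before --reverse, so this prints the
# committer of the MOST RECENT commit that touched the path.
newest_committer() {
  git log --reverse --format='%cn' -1 -- "$1"
}

# Reversing the whole history and truncating afterwards prints the
# committer of the OLDEST commit that touched the path.
oldest_committer() {
  git log --reverse --format='%cn' -- "$1" | head -n 1
}
```

On a file that was created by one person and later modified by another, the two helpers should print different names, which is the behaviour the git log man page note quoted earlier on this page describes.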
4

A very straightforward approach would be

git rev-list --objects --all |
    cut -d' ' -f2- |
    sort -u |
    while IFS= read -r name; do
         git --work-tree=. log --reverse --format="%cn%x09$name" -- "$name" | head -n1
    done

Caveats:

  • This shows the committer name (%cn) of the first commit for each path that exists in the object database (not just in the current revision). You may prefer the author name (%an) instead: if person B rebased a commit from person A that created the file, B will be the committer and A will be the author.
  • The --all flag means you want all objects on all branches. To limit the scope, replace it with the name of a branch or tag, or just HEAD

  • O(n²) performance (doesn't scale well for very large repos)

  • improper output if the pathname contains formatting sequences (e.g. %H etc.)

It will start out with the empty name, which is the root tree object.
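To make the author/committer distinction and the formatting-sequence caveat concrete, here is a rough sketch for a single path. The function name first_author_and_committer is hypothetical, and unlike the loop above it only looks at history reachable from HEAD.

```shell
# Hypothetical helper: print "author<TAB>committer<TAB>path" for the
# oldest commit (reachable from HEAD) that touched the given path.
first_author_and_committer() {
  # %x09 is a tab. The path is printed by printf rather than by git's
  # formatter, so sequences such as %H inside the path stay untouched.
  who=$(git log --reverse --format='%an%x09%cn' -- "$1" | head -n 1)
  printf '%s\t%s\n' "$who" "$1"
}
```

If the two names differ, the commit was probably rebased, cherry-picked or applied from a patch by someone other than its author.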

Daniel Compton
sehe
  • Note that in its current form, this script needs to be run from the root of the repository. It would be nice if it could filter only the files within the current directory, like in @holygeek's answer. – waldyrious Dec 23 '16 at 17:33
  • @waldyrious That's simple: just insert `| grep '^path/to/subfolder/'` after the sort – sehe Dec 23 '16 at 18:28
  • It's a pity that the filtering can't (can it?) be done in the git rev-list command. With a grep, the script spends time processing information that is later thrown away. In any case, I'd suggest editing the answer to mention the need to be in the root of the repository to run this. – waldyrious Dec 24 '16 at 11:29
  • 1
    @waldyrious `rev-list` can do filtering, but not in `--objects` mode (it's actually a different algorithm). The time spent throwing away information is likely insignificant (the cost of 2 subprocesses and a memory buffer, how large is your object database, really?). I'm not sure git would be able to avoid doing the same work anyhow. Adding `--work-tree` obviates the need to be in the root of the repo. – sehe Dec 24 '16 at 12:10
1

To do this efficiently,

git log --raw --date-order --reverse --diff-filter=A --format=%H%x09%an \
| awk -F$'\t' '
       /^[^:]/   {thisauthor=$2}
       $1~/A$/   {print thisauthor "\t" $2}
'

Optionally append | sort -t$'\t' -k1,1 to group the output by author.
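Since the question asks about one particular directory, the same single-pass idea can be limited with a pathspec. The following is a sketch under a few assumptions: the function name first_adders is made up, --no-renames is added so that a renamed file shows up as an addition under its new name (credited to whoever renamed it), and only the earliest addition of each path is kept.

```shell
# Hypothetical helper: print "author<TAB>path" for the first addition of
# every path under the given directory (default: current directory).
first_adders() {
  dir=${1:-.}
  git log --raw --no-renames --date-order --reverse --diff-filter=A \
      --format='%H%x09%an' -- "$dir" |
  awk -F'\t' '
    # Header lines look like "<hash><TAB><author>".
    /^[^:]/ { author = $2 }
    # Raw diff lines look like ":<modes and ids> A<TAB><path>".
    # Keep only the first addition seen for each path.
    /^:/ && $1 ~ /A$/ && !seen[$2]++ { print author "\t" $2 }
  '
}
```

The paths printed are relative to the repository root, and git may quote paths containing unusual characters, so this is best treated as a quick overview rather than an exact report.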

jthill
0

Using git itself, this is not possible. Git does not keep track of the owner of the file.

Andy