5

I'm planning my company's transition from CVS to Git, and certain engineers wish we could still use CVS keyword strings such as $Id: $. I have read all about implementing this with the ident setting in .gitattributes, and I understand why it's undesirable, though maybe feasible for smaller source trees. (Our source tree is huge, so I think it would be impossible.)

Our engineers don't particularly care about having the SHA1 hash in the file. They just want to know:

  1. the date of last modification
  2. the name of the committer
  3. possibly the commit message

They find it extremely handy to see this info in file headers when browsing code, and I can't argue with that.

What I want to know is:

Is there any way to info-stamp all the staged files at the moment before git commit? In other words, to run a perl command -- that replaces $Id: $ with a block of desired info -- on the working copies of files being committed?

This wouldn't require any .gitattributes actions at all. Git would just need to know how to merge two such blocks of info, ideally by choosing the later one. The stamped info would be just one more file change in the newly created version.

I looked at doing this in the pre-commit hook, but it seems to be intended differently -- not for editing files, just for checking them. Am I right about that?

And has nobody tried this approach? It seems simpler to me than trying to filter all source files every time git changes versions, which is what it sounds like .gitattributes does.

Many thanks for any advice/warnings/pointers.

Alec
Mykle Hansen

3 Answers

3

RCS (and thus CVS) expands $Id:$ and similar keywords on checkout; the expanded values aren't in the stored files. And they can't really be: somebody might come along and rename version 1.8.2-rc10 to plain 1.8.2. If somebody wants to know where a file comes from, git log file answers that nicely, with more detail than RCS could ever give. It is also a local command, with no trip to the CVS server, so it is instantaneous and available everywhere git is.
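As a rough illustration of the `git log file` lookup, here is a minimal Ruby sketch that asks for exactly the three fields the question lists. The helper names `last_change_command` and `last_change` are made up for this example; the format placeholders (`%ad` author date, `%cn` committer name, `%s` subject) are standard `git log` ones. It assumes `git` is on the PATH and the script runs inside a working tree.

```ruby
require 'open3'

# Hypothetical helper: argument list for the `git log` call that reports
# the last commit touching +path+ (date, committer name, subject line).
def last_change_command(path)
  ['git', 'log', '-1', '--date=iso', '--format=%ad | %cn | %s', '--', path]
end

# Hypothetical helper: run the command and return its single output line,
# or nil when git fails or the path has no history.
def last_change(path)
  out, status = Open3.capture2e(*last_change_command(path))
  line = out.strip
  return nil unless status.success?
  line.empty? ? nil : line
end

# Command-line use: print one line per path given as an argument.
if __FILE__ == $PROGRAM_NAME
  ARGV.each { |path| puts "#{path}: #{last_change(path) || '(no history)'}" }
end
```

Saved under any name, it could be run as `ruby last_change.rb path/to/file` from inside the repository.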

vonbrand
  • Again, it's not the versions that we care about adding to the file. We just want the most recent edit date and the name of the editor. Is that a bad thing to add to the file at checkin? – Mykle Hansen Mar 14 '13 at 21:36
  • You get that (and _much_ more) the way I say... `git` is _different_ than CVS, don't expect it to work the same. – vonbrand Mar 14 '13 at 22:06
  • Explain to me how "foo log file" (git | svn | cvs) tells me anything about a file copied out of the source control workspace and deployed in test or production??? If *I* did the copying, I don't need to know, it's when *somebody else* screws it up, and then you *badly* need to know. – Roboprog May 22 '13 at 01:23
3

The Keyword Expansion section of the git documentation explains how to set up keyword expansion cleanly, using smudge and clean filters.

A ruby script for expanding what you want would be something like this (not tested):

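#! /usr/bin/env ruby
# Smudge filter: read the file content on stdin, write the expanded content to stdout.
data = STDIN.read
# Date, committer name and subject of the most recent commit (the repository's, not this file's).
last_info = `git log --pretty=format:"%ad %cn %s" -1`.strip
# Block form, so backslashes in the commit subject are not read as backreferences.
print data.gsub('$Last$') { "$Last: #{last_info}$" }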

Set up the filter:

$ git config filter.last_expander.smudge expand_last_info
$ git config filter.last_expander.clean 'perl -pe "s/\\\$Last[^\\\$]*\\\$/\\\$Last\\\$/"'

Set up .gitattributes:

$ echo '*.txt filter=last_expander' >> .gitattributes

Note: (just as vonbrand says) what this gives you, and what you in all likelihood want, is field expansion on checkout and removal of the expanded fields on commit. The effect is that your engineers will be able to read the field in the checked-out files in their working directory. Isn't that what they want? And this will not clutter the actually versioned content with redundant metadata.
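To make the round trip concrete, here is a small sketch of both directions as plain Ruby functions, with no git involved. The names `expand_last` and `collapse_last` are hypothetical, the regular expression mirrors the perl one-liner in the clean filter, and the info string is made-up sample data. One caveat that follows from the pattern: an info string containing `$` would not collapse back correctly.

```ruby
# Hypothetical helpers illustrating the smudge/clean pair; not part of git.
# Matches both the bare keyword ($Last$) and an expanded one ($Last: ...$).
LAST_KEYWORD = /\$Last[^$\n]*\$/

# Smudge direction (checkout): fill the keyword with the given info string.
def expand_last(text, info)
  text.gsub(LAST_KEYWORD) { "$Last: #{info}$" }
end

# Clean direction (commit): strip whatever was expanded.
def collapse_last(text)
  text.gsub(LAST_KEYWORD) { '$Last$' }
end

committed = "# $Last$\nputs 'hello'\n"
# Made-up sample data standing in for the output of git log.
working   = expand_last(committed, 'Thu Mar 14 2013 Jane Doe Fix typo')

puts working
# The round trip restores exactly what is stored in the repository.
puts collapse_last(working) == committed
```

Because the clean step always restores the bare keyword, the stored content never changes when only the metadata does, which is why this scheme produces no extra diffs or merge conflicts.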

Klas Mellbourn
  • I am specifically asking about adding/updating the fields at checkin, and not having to do anything at all at checkout. I don't get what's bad about that. Having to smudge/clean every file in the repo at every checkout/checkin will be very slow for us. (That is my understanding of Git Attributes processing -- am I wrong?) – Mykle Hansen Mar 15 '13 at 17:35
  • I think I'm talking about having the 'clean' filter fill in or update the infostamp symbol in files, and not having any 'smudge' filter at all. It seems like it would work, but nobody does it this way ... I just don't get why not. – Mykle Hansen Mar 15 '13 at 17:47
  • One thing that would definitely be bad with that is that updating these fields at every commit means that they would be different for every version of the file. That means that they would show up in every diff of every version of the file. It also means that these fields would create a merge conflict, for *all* merges. And why is it important that these fields are committed? Isn't the important thing that the engineers can see the fields in the checked out files, in their working directory? They cannot easily see what is in the repository. – Klas Mellbourn Mar 15 '13 at 21:13
0

This is how you solve this:

  1. Add the following pre-commit hook:

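    #!/bin/sh
    # Stamp keywords in every staged file that was added, copied or modified.
    git diff --cached --name-only -z --diff-filter=ACM |
            xargs -r0 .filters/keywords --
    # Re-stage those files so the stamped versions are what gets committed.
    git diff --cached --name-only -z --diff-filter=ACM |
            xargs -r0 git add -u -v --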
    
  2. Add the following commit-msg hook:

    #!/bin/sh
    # Exit successfully if the message has any line that is neither blank nor a comment.
    awk '!/^[[:space:]]*(#|$)/{exit f++}END{exit !f}' "$1" && exit
    # NOTREACHED unless commit was aborted
    git diff --cached --name-only -z --diff-filter=ACM |
            xargs -r0 .filters/keywords -d --
    git diff --cached --name-only -z --diff-filter=ACM |
            xargs -r0 git add -u -v --
    
  3. Download ".filters/fraubsd-keywords", save it as ".filters/keywords" (the path the hooks call), and make it executable:

    https://raw.githubusercontent.com/freebsdfrau/FrauBSD/master/.filters/fraubsd-keywords

  4. Edit "keywords" and, in the CONFIGURATION section at the top, change:

    • FrauBSD to Header
    • _FrauBSD to _Header

After this, each time you run git commit, the text $Header$ and/or $Header: ... $ will be translated into $Header: file YYYY-MM-DD HH:MM:SS GMTOFFSET committer $.

NOTE: You may have to modify a small section of the "keywords" script to make it operate on more or fewer types of files. At the time of this writing it only operates on files that are "ASCII text" or "shell scripts".
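For readers who want to see what the $Header$ substitution described above amounts to before downloading anything, here is a minimal Ruby sketch. It is not the FrauBSD "keywords" script, which may well behave differently; the function name `stamp_header`, the regular expression, and the sample values are assumptions made for illustration, and only the stamp layout is taken from the description above.

```ruby
# Hypothetical illustration only; the real "keywords" script may differ.
# Matches a bare $Header$ as well as an already stamped $Header: ... $.
HEADER_KEYWORD = /\$Header(?::[^$\n]*)?\$/

# Replace every keyword with: $Header: file YYYY-MM-DD HH:MM:SS GMTOFFSET committer $
def stamp_header(text, file, time, committer)
  stamp = "$Header: #{file} #{time.strftime('%Y-%m-%d %H:%M:%S %z')} #{committer} $"
  text.gsub(HEADER_KEYWORD) { stamp }
end

# Made-up sample values.
first  = stamp_header("// $Header$\n", 'foo.c', Time.new(2013, 3, 14, 21, 36, 0, '+00:00'), 'mykle')
second = stamp_header(first, 'foo.c', Time.new(2013, 3, 15, 17, 35, 0, '+00:00'), 'alec')

puts first
puts second
```

Stamping an already stamped file replaces the old block instead of nesting it, so the keyword line changes in every commit that touches the file. That is the source of the diff noise and merge conflicts raised in the comments on the previous answer.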