I am trying to get data from git using a single command, as making multiple calls to git seems to be extremely slow. I want basic data from each commit: the hash, the author, the date, the message, etc. The problem is that messages can contain anything, including the symbols I am using as delimiters between the fields. I also want to retain the newlines from the commit messages.

git log --pretty=format:%H,%an,%ae,%aD,%B:

So I would parse the output by splitting on ':' to get the information for each commit, then splitting on ',' to get each field within a commit. The problem is that if a commit message contains a comma or a colon, the split lands in the wrong place and corrupts the result.

Is there any way to sanitise the output of %B, or do I just have to use delimiters that no one will (hopefully) ever use or guess?

razr32

1 Answer

git log writes its results to stdout. If you pipe that output to another command, the receiving command sees one continuous stream, not separate commits. You will have to separate the commits and parse each one yourself, most likely by splitting on a delimiter you specify.

I hate to tell you that you can't do everything that you're asking for. If you were to drop the requirement to preserve newlines in the commit message, we could rely on the structured and consistent format of everything before the message: parse each of the leading fields individually, then treat whatever remains as the commit message. This relies on each comma being in a predictable place (e.g. emails don't contain commas, and your logged date has exactly one comma, after the day of the week).
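As a rough sketch of that fixed-position idea in Python: assuming one commit per line (for example with `%s`, the subject line, in place of `%B`) and assuming author names never contain a comma, the leading fields can be peeled off with a bounded split so that commas in the message are left alone. The function name `parse_line` and the sample line are made up for illustration.

```python
def parse_line(line: str) -> dict:
    """Parse one line produced by --pretty=format:%H,%an,%ae,%aD,%s.

    Assumption: the author name contains no comma. %aD contributes exactly
    one comma (after the weekday), so five splits isolate the message.
    """
    commit_hash, name, email, weekday, rest_of_date, message = line.split(",", 5)
    return {
        "hash": commit_hash,
        "author": name,
        "email": email,
        # Re-join the two halves of the date that the split pulled apart.
        "date": f"{weekday},{rest_of_date}",
        "message": message,
    }


# Hypothetical sample line; the message itself contains commas and a colon.
sample = (
    "abc123,Jane Doe,jane@example.com,"
    "Thu, 7 Apr 2005 15:13:13 -0700,Fix parser: handle a, b and c"
)
print(parse_line(sample)["message"])
```

Because the split is capped at five, anything after the fifth comma is kept intact as the message. The sketch breaks as soon as an author name contains a comma, which is why it is only a fallback.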

You could also use an extremely unlikely delimiter, such as an uncommon emoji like ♡ or the flag of Burundi (sorry, Burundians). I can use these in my terminal with no problem, and they work in the git log format string.
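In the same spirit, here is a sketch in Python that uses ASCII control characters instead of an emoji. git's pretty-format strings accept `%xNN` hex escapes, so `%x1f` (unit separator) can sit between fields and `%x1e` (record separator) after each commit. This assumes no author field or commit message contains those bytes, which is likely but not guaranteed. The helper names `parse_log` and `read_log` and the sample data are hypothetical.

```python
import subprocess

FIELD_SEP = "\x1f"   # produced by %x1f in the format string
RECORD_SEP = "\x1e"  # produced by %x1e after each commit
LOG_FORMAT = "%H%x1f%an%x1f%ae%x1f%aD%x1f%B%x1e"


def parse_log(raw: str) -> list[dict]:
    """Split raw git log output into one dict per commit."""
    commits = []
    for record in raw.split(RECORD_SEP):
        # Drop the newline git prints between commits and any trailing
        # newline of the message; newlines inside the message are kept.
        record = record.strip("\n")
        if not record:
            continue
        commit_hash, name, email, date, message = record.split(FIELD_SEP, 4)
        commits.append({
            "hash": commit_hash,
            "author": name,
            "email": email,
            "date": date,
            "message": message,
        })
    return commits


def read_log(repo_path: str = ".") -> list[dict]:
    """Run git log once in repo_path and parse its output."""
    result = subprocess.run(
        ["git", "log", f"--pretty=format:{LOG_FORMAT}"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)


# Hypothetical raw output with commas, colons, and newlines in a message.
sample = (
    "abc\x1fJane\x1fj@x.com\x1fThu, 7 Apr 2005 15:13:13 -0700"
    "\x1fSubject: a, b\n\nBody line\n\x1e\n"
    "def\x1fBob\x1fb@x.com\x1fFri, 8 Apr 2005 10:00:00 +0000\x1fSecond\n\x1e"
)
for commit in parse_log(sample):
    print(commit["hash"], repr(commit["message"]))
```

Calling `read_log()` inside a repository makes a single git call. Control characters are much less likely to appear in a message than commas, colons, or emoji, but a check for stray separators would still be sensible before trusting the result.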

If you're open to using other tools, this post suggests gitlogg for more advanced git logging.

Jacob