
I'm setting up Git to track changes to a web site. The web site uses a MySQL database, and I need a dump of that database to be included in every commit. Sometimes there are changes to the database only, but I still want to be able to commit that change.

I've set up a git pre-commit hook that dumps the database and adds the dump file to git. The hook looks as follows:

#!/bin/bash

systemname="mysystem"
dumppathname=".dbexports/${systemname}_dbdump_$(date +%Y%m%d_%H%M%S).sql"

if ! ./.scripts/dump_database.sh "$dumppathname"
then
  echo "Dumping database failed. Aborting the commit..."
  exit 1
else
  git add "$dumppathname"
  exit 0
fi

The dump script runs a `mysqldump` command to dump the database with the desired options. This command works as designed and writes the dump to the file designated by the `$dumppathname` variable.

The problem is with the `git add`: the dump file only makes it into every second commit.

  • Git is in a clean state, i.e. all changes have been committed and there are no untracked files.
  • I apply some changes to the web site which reflect in some database changes only. I want to commit this state, so I do:
me@mysystem /website> git commit -m "Track database changes, only."

The response is (the first line is written by the dump script):

Dumping current database 'mywebsite' to '.dbexports/mysystem_dbdump_20211024_132309.sql'
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        .dbexports/mysystem_dbdump_20211024_132309.sql

nothing added to commit but untracked files present (use "git add" to track)

It seems the git add from the git hook did not work.

  • But then, I immediately do another commit:
me@mysystem /website> git commit -m "Track database changes, only. Second time."

The response is (again, the first line is written by the dump script):

Dumping current database 'mywebsite' to '.dbexports/mysystem_dbdump_20211024_132323.sql'
[master 87cfdb9] Track database changes, only. Second time., 1 uncommited files.
 2 files changed, 22630 insertions(+)
 create mode 100644 .dbexports/mysystem_dbdump_20211024_132309.sql
 create mode 100644 .dbexports/mysystem_dbdump_20211024_132323.sql

So this time both the dump file from the first attempt and the dump file from the current commit seem to have made it into the commit. What am I missing?
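A minimal reproduction that doesn't involve MySQL at all behaves the same way for me. The sketch below (all file names are illustrative) sets up a fresh repo with a pre-commit hook that just creates a file with `echo` and `git add`s it:

```shell
#!/bin/bash
# Minimal reproduction without MySQL: a fresh repo plus a pre-commit hook
# that creates a file with `echo` and stages it with `git add`. All file
# names here are illustrative.
set -u
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "me@example.com"
git config user.name "me"

cat > .git/hooks/pre-commit <<'EOF'
#!/bin/bash
dumpfile="dump_$(date +%Y%m%d_%H%M%S)_$RANDOM.txt"
echo "fake dump" > "$dumpfile"
git add "$dumpfile"
EOF
chmod +x .git/hooks/pre-commit

# Two commits in a row; on my Git version the file generated by the hook
# is only picked up by the *second* commit.
git commit -m "first" || true
git commit -m "second" || true
git log --all --pretty=format: --name-only
```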

phunsoft
  • I can't say for sure but my *bet* would be that the dump takes a lot of time, and that your script is committing the partial dump as it exists so far at that point. When you try a second time, the DB software queues up the request and continues writing the first file but hasn't started the second yet. When you tried a third time, the software had finished with the second and got started on the third right away, etc. – torek Oct 24 '21 at 21:12
  • All that said, database software generally wants to do *its own* backups in its own way: dumps exported like this aren't usually meant for everyday backup purposes, and Git tends to be the wrong place to store DB backups. The issue here is that Git itself *is* a database of sorts, and storing databases inside databases, Inception-style, is usually a recipe for poor performance at best. – torek Oct 24 '21 at 21:14
  • @torek I doubt this is a timing issue. The DB export is quick; it's only 1.7 MB. In the shell script, each command runs only after the previous one has returned. And in the meantime I have two `mysqldump`s run one after the other in the dump script. Same behaviour, and repeatable: if there are no changes to the working directory at all *before* the `git commit`, git responds with "nothing added to commit but untracked files present", and the two dump files are listed as untracked. The next `git commit` immediately thereafter commits all 4 dump files, the two from the previous commit and the two from this one. – phunsoft Oct 25 '21 at 09:02
  • @torek I don't use git to back up the database. But as a matter of fact, web frameworks such as Joomla, and Piwigo in this case, store part of the configuration in some tables in the database, along with files in the filesystem. My workflow is: I add a component, say a plug-in, on my dev site, then configure and test. When all is good I want to deploy to the prod site. For this I need to apply all changes to the filesystem (component files, configuration files, etc.), as well as all the changes in the database that belong to this change. – phunsoft Oct 25 '21 at 09:12
  • Huh, OK, well, you're well outside my experience with this database at this point. I will also say that doing a `git add` *inside* a pre-commit hook *can* work, but it is usually a bad idea because it creates potential issues with `git commit --only` and `git commit --include`. It's usually best to use a script that does all the prep work, then runs its own `git commit`. – torek Oct 25 '21 at 16:35
  • I have now created a minimal example which creates a file with the help of the `echo` command. Same behaviour. I think git should work consistently, which it doesn't, so I have sent a bug report. Curious what the reply will be. – phunsoft Oct 25 '21 at 20:48
  • @torek See my answer for a different theory on why this is failing. @phunsoft, can you share the link to your bug report? – Inigo Dec 16 '22 at 11:24

1 Answer


The behavior you observe seems to be an artifact of how Git pre-commit hooks work in the version of Git you are using.

If you read the top answers, and the comments beneath them, for *Can a Git hook automatically add files to the commit?*, you will see that for many people, and for some versions of Git, a `git add` inside the pre-commit hook script only takes effect after the commit. That would mean that instead of getting committed, your database dump file is merely staged; it then gets committed by your next commit, which is why you see it land in every other commit.

That's my theory, at least.
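If that is indeed what is happening, torek's suggestion in the comments is the robust way around it: do the prep work outside the hook and let a wrapper script run its own `git commit`. A minimal sketch (the function name `commit_with_dump` is mine; `.scripts/dump_database.sh` and the path layout come from your question):

```shell
#!/bin/bash
# Sketch of a wrapper that replaces the pre-commit hook: dump first, stage
# explicitly, then commit. Because the `git add` runs before `git commit`
# even starts, nothing depends on when a particular Git version applies
# index changes made inside a hook.
commit_with_dump() {
  local msg="$1"
  local systemname="mysystem"
  local dumppathname=".dbexports/${systemname}_dbdump_$(date +%Y%m%d_%H%M%S).sql"

  # Abort if the dump fails, exactly as the original hook does.
  if ! ./.scripts/dump_database.sh "$dumppathname"
  then
    echo "Dumping database failed. Aborting the commit..." >&2
    return 1
  fi

  git add "$dumppathname"
  git commit -m "$msg"
}
```

You would then run `commit_with_dump "Track database changes, only."` instead of `git commit`, and drop the pre-commit hook entirely.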

Inigo