Make a file both versioned and locally ignored

Question

Let's say I have a configuration file config.cfg which contains some extremely important stuff.

When you clone the repository, you have to get config.cfg, or else nothing will work; therefore the config.cfg file must be in the repository.

config.cfg is also meant to be locally customizable by individual developers to suit their needs, but these customizations must remain local and never be committed. Therefore, developers must be prevented from accidentally committing config.cfg as part of their normal workflow.

In the exceptional situation where someone does actually need to commit config.cfg, they should have to invoke some magical incantation to indicate that they really mean to do so. (For example, if this situation was, hypothetically, solvable by something as simple as git-ignoring config.cfg, then the magical incantation could, also hypothetically, consist of temporarily un-listing config.cfg from .gitignore in order to commit it, and re-listing it afterwards. But it isn't. This was just an example.)

When The Guru Developer modifies config.cfg and commits it (invoking the aforementioned magical incantation), all developers should get these changes the next time they pull, and they should get merge conflicts, if necessary, against any local changes that they may have.

Developers should not be burdened with extra bureaucracy; in other words, they should be able to just clone the repository and run the software, they should not have to first follow lists of things that need to be done.

I am trying to achieve this scenario.

First I tried committing config.cfg and then adding it to .gitignore; it does not work, because .gitignore only applies to untracked files: git silently keeps tracking entries that have already been committed.
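The behavior described above can be reproduced in a throwaway repository. This is only a sketch: the temporary directory, the demo identity, and the file contents are assumptions made for illustration.

```shell
#!/bin/sh
# Demo: .gitignore has no effect on a file that is already tracked.
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

# Commit the file first ...
echo "debug=0" > config.cfg
git add config.cfg
git commit --quiet -m "Add config.cfg"

# ... and only then list it in .gitignore.
echo "config.cfg" > .gitignore
git add .gitignore
git commit --quiet -m "Ignore config.cfg"

# A local edit is still reported as a modification of a tracked file.
echo "debug=verbose" > config.cfg
status_output=$(git status --porcelain)
echo "$status_output"
```

The status line still lists config.cfg as modified, so a plain `git commit -a` would pick it up.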

Then I tried git update-index --skip-worktree config.cfg; it does achieve the part where .gitignore failed, but it does not work either, because it does not prevent developers from accidentally committing config.cfg as part of their normal workflow. For this to work, every developer would have to execute this command locally, which is an untenable proposition.
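A minimal sketch of what the skip-worktree bit does, again in a throwaway repository with made-up contents. The flag is stored in the local index only, which is why it does not travel with a clone.

```shell
#!/bin/sh
# Demo: skip-worktree hides local edits, but only in this clone's index.
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo "debug=0" > config.cfg
git add config.cfg
git commit --quiet -m "Add config.cfg"

# Mark the file, then customize it locally.
git update-index --skip-worktree config.cfg
echo "debug=verbose" > config.cfg

hidden_status=$(git status --porcelain)    # git pretends nothing changed
index_flag=$(git ls-files -v config.cfg)   # "S" marks skip-worktree entries

# The "magical incantation" for a deliberate commit: clear the bit again.
git update-index --no-skip-worktree config.cfg
visible_status=$(git status --porcelain)

echo "hidden:  '$hidden_status'"
echo "flag:    '$index_flag'"
echo "visible: '$visible_status'"
```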

Someone suggested that I should have a config.cfg with the extremely important stuff, which should remain largely untouched, and an optional git-ignored config.user.cfg for developers to go wild with their customizations. Actually, I am already doing this: the config.cfg that I was telling you about was a lie; in actuality, I was talking about my config.user.cfg, I just called it config.cfg to keep things simple. Still, nothing changes: config.user.cfg must exist or else bad things will happen, but if it is git-ignored then developers will not receive it when cloning, so it will not initially exist. Also, config.user.cfg needs to have some default boilerplate content for developers to modify; I do not want to tell developers that they have to create it from scratch. So, we are back to square one, how to have config.user.cfg both versioned and locally ignored.

How can I achieve this scenario?

The closest other question and answer that I have managed to find is How to make GIT ignore tracked file locally? where someone has to have an executable in their repository, and wants to prevent developers from building it and committing it. According to the answers in that question, it cannot be done.

The difference in reasons doesn't really matter: git can't do what you want. Try to find another way. One thing to check is if you can make your *build system* create the file if it's missing (and leave it untouched if it isn't). Git still can't do "both ignore and track the file at the same time". — Joachim Sauer, Aug 22 '23 at 13:14
The situation is indeed very common. I usually take the approach of creating and versioning a template file like `config_template.cfg` and adding the `config.cfg` to the `.gitignore`. Each developer copies the template and modifies his local `config.cfg` according to his setup. This has the benefit of forcing people to think (at least a bit) about what settings are right in their setup instead of just using the versioned settings. — SebDieBln, Aug 22 '23 at 13:16
A commonly used practice is to ignore the `config.user.cfg` in your `.gitignore`, but provide a `config.user.cfg.example` for others to copy (`cp config.user.cfg.example config.user.cfg`) and use. — Peter Krebs, Aug 22 '23 at 13:19
Does it absolutely have to be a file? How about: keep config in environment variables, with `.env` naturally gitignored, and have instructions how to build correct `.env` file in readme (could be even just a piece of text to copy&paste, if no secrets are involved). — mbojko, Aug 22 '23 at 13:21
You might want to look into the [`git update-index --assume-unchanged`](https://git-scm.com/docs/git-update-index#Documentation/git-update-index.txt---no-assume-unchanged). This should achieve the desired behaviour but it has to be done by each developer because unlike with `.gitignore` that setting can not be saved within the versioned data IIRC. — SebDieBln, Aug 22 '23 at 13:23
@SebDieBln skip-worktree: https://stackoverflow.com/a/13631525/717372 — Philippe, Aug 22 '23 at 16:51
Would a git-submodule solve your issue? It would require a little overhead for the devs because they need to clone recursively, and use `git pull --recurse-submodules` but it would cleanly and clearly separate the config from the rest — lucidbrot, Aug 22 '23 at 17:10
@Philippe Thanks for pointing to `--skip-worktree`. The documentation talks a lot about missing files but it surely can be used to work the problem in this question, too. — SebDieBln, Aug 22 '23 at 21:52
In the server-side hook, reject a push that touches `config.cfg` and echo what the problem is and how to fix it in detail and with a friendly solution. Leave the rest to the committers. — ElpieKay, Aug 23 '23 at 10:15
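
Several of the comments above suggest versioning a template and creating the ignored file from it when it is missing. A minimal sketch of such a bootstrap step follows; the function name `ensure_user_config` is hypothetical, the file names are taken from the comments, and in a real project the call would live in the build or start-up script so developers do not have to run it by hand.

```shell
#!/bin/sh
# Sketch: create config.user.cfg from a versioned template if it is missing.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the versioned template file.
printf 'database.ip=127.0.0.1\n' > config.user.cfg.example

ensure_user_config() {
    # Copy the template only when no local file exists yet,
    # so existing customizations are never overwritten.
    if [ ! -f config.user.cfg ]; then
        cp config.user.cfg.example config.user.cfg
    fi
}

ensure_user_config                                   # first run: creates the file
created_content=$(cat config.user.cfg)

echo 'database.ip=192.168.0.5' > config.user.cfg     # developer customizes it
ensure_user_config                                   # later runs: leaves it alone
final_content=$(cat config.user.cfg)

echo "$created_content"
echo "$final_content"
```

With config.user.cfg listed in .gitignore and only the template tracked, the customized file cannot be committed by accident, while template changes still arrive through normal pulls.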

CodeWizard · Answer 1 · 2023-08-22T13:54:20.117

Here is an example of how you can resolve it.

You have several options.

Smudge/clean (Demo script below)
assume-unchanged - https://git-scm.com/docs/git-update-index#Documentation/git-update-index.txt---no-assume-unchanged
git hooks

I recommend smudge/clean, since in my opinion it is the cleanest of these options.

The clean/smudge filter is quite simple to use.
Create a filter driver with two commands, clean and smudge, and then apply the filter per pattern in the .gitattributes file.
The filter stands between the working directory and the staging area for the specific .gitattributes pattern (which you will have to define as well; in your case, it can be config.cfg).
When adding content from the working directory to the staging area with git add, files that match the .gitattributes pattern go through the clean filter.
When content is checked out from the repository into the working directory (for example as part of git pull or git checkout), those same files go through the smudge filter.
In the attached demo you can see that git "does not care" what you have locally and sets the content to whatever you configured.
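
Before the full demo, the mechanism described above can be reduced to a minimal local sketch. The filter name `demoConfig`, the IP addresses, and the throwaway repository are assumptions for illustration. Note that the filter commands are stored in each clone's git config, so every developer (or a setup script) has to define them; only the .gitattributes entry is versioned.

```shell
#!/bin/sh
# Sketch: a clean/smudge filter pair applied to config.cfg.
set -e
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

# clean runs on "git add", smudge runs on checkout.
git config filter.demoConfig.clean  "sed -e 's/^database.ip=.*/database.ip=10.10.10.10/'"
git config filter.demoConfig.smudge "sed -e 's/^database.ip=.*/database.ip=127.0.0.1/'"
echo 'config.cfg filter=demoConfig' > .gitattributes

# The developer's local value never reaches the repository.
echo 'database.ip=192.168.1.50' > config.cfg
git add .gitattributes config.cfg
git commit --quiet -m "Add config.cfg through the clean filter"

committed=$(git show HEAD:config.cfg)   # what the clean filter stored
local_copy=$(cat config.cfg)            # working copy is left untouched by "git add"

# Checking the file out again runs the smudge filter.
rm config.cfg
git checkout -- config.cfg
checked_out=$(cat config.cfg)

echo "committed:   $committed"
echo "local copy:  $local_copy"
echo "checked out: $checked_out"
```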

A working demo

Copy the script, set REMOTE_REPO to a repository you can push to, and run it.

#!/bin/bash

# SET your remote repo (replace the placeholder with your repository URL)
REMOTE_REPO="<git....>"

clear
# Color codes used for the script output
Color_Off='\033[0m'       # Text Reset

# Regular Colors
Black='\033[0;30m'        # Black
Red='\033[0;31m'          # Red
Green='\033[0;32m'        # Green
Yellow='\033[0;33m'       # Yellow
Blue='\033[0;34m'         # Blue
Purple='\033[0;35m'       # Purple
Cyan='\033[0;36m'         # Cyan
White='\033[0;37m'        # White

# Bold
BBlack='\033[1;30m'       # Black
BRed='\033[1;31m'         # Red
BGreen='\033[1;32m'       # Green
BYellow='\033[1;33m'      # Yellow
BBlue='\033[1;34m'        # Blue
BPurple='\033[1;35m'      # Purple
BCyan='\033[1;36m'        # Cyan
BWhite='\033[1;37m'       # White

### Define the desired filters.
### For the simplicity of the demo we use it inline
### In real life it can be any path to the actual script

### The ip which we wish to use
### In real life it can be a password, an IP or any other value
DB_IP_LOCAL=127.0.0.1
DB_IP_PROD=10.10.10.10

echo -e ""
echo -e "Pre-defined values:"
echo -e "------------------------------------------------"
echo -e "${Yellow}DB_IP_LOCAL:\t${Green}${DB_IP_LOCAL}${Color_Off}"
echo -e "${Yellow}DB_IP_PROD:\t${Green}${DB_IP_PROD}${Color_Off}"
echo -e ""

echo -e "${Cyan}* Creating\t demo repository${Color_Off}"
### Create the demo repository
rm -rf      /tmp/demo_smudge
mkdir -p    /tmp/demo_smudge
cd          /tmp/demo_smudge

# Generate the .env file
echo -e "${Cyan}* Initializing\t .env file${Color_Off}"
cat << EOF >> .env
## Database
##  * Local:      <Any Value>
##  * Production: 10.10.10.10
database.ip=0.0.0.0

## Feature1
feature1.env=DEV
feature1.key=f1-key
feature1.name=feature1
EOF

echo -e ""
echo -e "Current .env file content"
echo -e "------------------------------------------------"
echo -e "${Green}"
cat .env
echo -e "${Color_Off}"
echo -e "------------------------------------------------"
echo -e ""

### Init the empty repository
echo -e "${Cyan}* Initializing demo repository${Color_Off}"

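## Init git repo and name the initial branch main,
## because the pushes below assume that name
git init --quiet
git symbolic-ref HEAD refs/heads/main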
# Add the demo remote repository
git remote add origin ${REMOTE_REPO}

# Add all files
echo -e "${Cyan}* Adding content to demo repository${Color_Off}"
git add .

# Commit changes
echo -e "${Cyan}* Committing content to demo repository${Color_Off}"
git commit -m"Initial Commit without smudge-clean" --quiet

echo -e "${Cyan}* Pushing content to demo repository${Color_Off}"
echo -e "${Yellow}${REMOTE_REPO}${Color_Off}"

git push --set-upstream origin main -f --quiet

echo -e "${Red}>>> Press any key to continue${Color_Off}"

# Wait for user input to continue (max timeout 600 seconds)
read -t 600 -n 1

### The filters below call gsed (GNU sed, as installed on macOS);
### on Linux, replace gsed with sed

# Clean is applied when we add file to stage
echo -e "${Cyan}* Define clean filter${Color_Off}"
git config --local filter.cleanLocalhost.clean  "gsed -e 's/database.ip=.*/database.ip=${DB_IP_PROD}/g'"

# Smudge is applied when we checkout file
echo -e "${Cyan}* Define smudge filter${Color_Off}"
git config --local filter.cleanLocalhost.smudge "gsed -e 's/database.ip=.*/database.ip=${DB_IP_LOCAL}/g'"

###  Define the filters 
echo -e "${Cyan}* Adding filters (smudge-clean) to demo repository${Color_Off}"
echo '.env text eol=lf filter=cleanLocalhost' > .gitattributes

### Commit the file again after we set up the filter
echo -e "${Cyan}* Adding second commit${Color_Off}"
echo 'Second Commit' >> README.md

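echo -e "${Cyan}* Adding the same file (.env)${Color_Off}"
git add .
# Force .env through the newly configured clean filter even though
# its content on disk did not change
git add --renormalize .env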

echo -e "${Cyan}* View the diff (.env)${Color_Off}"
echo -e "------------------------------------------------"
git --no-pager diff --cached .env
echo -e "------------------------------------------------"

echo -e "${Cyan}* Commit changes${Color_Off}"

git commit -m"Second commit with smudge-clean" --quiet
echo -e "${Cyan}* Pushing second commit to git${Color_Off}"
git push --set-upstream origin main --quiet

It's funny how nearly every time in this case, `assume-unchanged` (for performance on big files that are never changed) is advised, whereas it is in fact `skip-worktree` (for tracked files with changes that you don't want to commit) that should be used: https://stackoverflow.com/a/13631525/717372 — Philippe, Aug 22 '23 at 16:49

score 0 · Answer 2 · answered Aug 22 '23 at 23:41

This is a straightforward application of temporary commits¹. No special tools are required, just plain interactive rebase plus a naming scheme for temporary commits.

Imagine some developer's history for config.cfg looks like this:

$ git log --oneline -- config.cfg
1002004 ==== increase default debug level ====
1002003 Update database driver version
1002002 ==== use local database ====
1002001 initial commit
$

They should be able to push the 1002003 commit but not 1002002 or 1002004. This should not be difficult to enforce with a pre-push hook.
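
One possible shape for such a pre-push hook is sketched below, assuming the naming scheme from the log above (temporary commit subjects start with `====`). The bare repository, the demo identity, and the branch name are stand-ins for illustration, and like any client-side hook it has to be installed in each clone.

```shell
#!/bin/sh
# Sketch: a pre-push hook that refuses to push temporary "====" commits.
set -e
base=$(mktemp -d)
git init --quiet --bare "$base/remote.git"   # stand-in for the shared remote
git init --quiet "$base/work"
cd "$base/work"
git config user.email "demo@example.com"     # hypothetical identity for the demo
git config user.name "Demo"
git remote add origin "$base/remote.git"

cat > .git/hooks/pre-push <<'HOOK'
#!/bin/sh
# stdin: <local ref> <local oid> <remote ref> <remote oid>, one line per ref
zero=$(git hash-object --stdin </dev/null | tr '0-9a-f' '0')
while read -r local_ref local_oid remote_ref remote_oid; do
    [ "$local_oid" = "$zero" ] && continue       # branch deletion, nothing to check
    if [ "$remote_oid" = "$zero" ]; then
        range="$local_oid"                       # new branch: check its whole history
    else
        range="$remote_oid..$local_oid"          # existing branch: only new commits
    fi
    if git log --format=%s "$range" | grep -q '^===='; then
        echo "pre-push: temporary commit in $local_ref, push rejected" >&2
        exit 1
    fi
done
exit 0
HOOK
chmod +x .git/hooks/pre-push

echo 'driver=1' > config.cfg
git add config.cfg
git commit --quiet -m "initial commit"
if git push --quiet origin HEAD:refs/heads/main; then
    first_push=accepted
else
    first_push=rejected
fi

echo 'database=local' >> config.cfg
git commit --quiet -am "==== use local database ===="
if git push --quiet origin HEAD:refs/heads/main 2>/dev/null; then
    second_push=accepted
else
    second_push=rejected
fi

echo "first push:  $first_push"
echo "second push: $second_push"
```

To publish a regular commit that sits on top of temporary ones, the developer would first reorder the history with interactive rebase and push only up to the last regular commit.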

¹ Which is not limited to just a single file like config.cfg but applies to any local change in any file, for instance debug logging etc.
