- I can't pull remote because that will overwrite the existing local files
Right. So don't do that.
- I can't commit all local changes and push because they don't share the same commit history
You never commit changes in the first place. This statement starts with an incorrect assumption! That's where we'll fix things.
- I can't force push also, because that will delete the commit history in remote
This part is right, so we won't do that.
Now, before we go about fixing #2 above, let's get into this part:
Both remotely and locally I have the ~/.mozilla/firefox also backed up, so it contains binary (or similar) files which are neither readable nor writable, but have been changed while using firefox, including .db files. So for those files, I just wanna keep the local ones, and ignore the remote ones.
This is ... difficult, and also illustrative: it shows the difference between a backup system (which should save and restore these databases) and a version control system (which should not). Git is a version control system, not a backup system; as such, you really don't want to store these databases in it. But you already did, so you're stuck with that. There is no good Git solution to this issue. Consider redoing everything so that these databases are excluded from your version control. (Whether you want to include or exclude browser bookmarks is a separate question, but note that these are typically imported and exported as XML or modified HTML or some such, and Git's merge algorithms perform poorly with these file formats.)
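If you do redo things, one way to keep the databases out is an ignore file. This is only a sketch: the patterns, the example.profile directory, and the file names are my assumptions about what such a tree might contain, not something taken from your repository.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"
git init -q

# Hypothetical ignore rules: keep binary databases under the Firefox
# directory out of Git. Adjust the patterns to your own layout.
cat > .gitignore <<'EOF'
.mozilla/firefox/**/*.db
.mozilla/firefox/**/*.sqlite
EOF

# Made-up profile contents, purely for illustration.
mkdir -p .mozilla/firefox/example.profile
: > .mozilla/firefox/example.profile/cert9.db
: > .mozilla/firefox/example.profile/places.sqlite
: > .mozilla/firefox/example.profile/user.js

git add .
# Lists only what was actually staged; the ignored databases are absent.
git ls-files
```

Ignore rules only affect untracked files. Anything already committed stays tracked until it is removed from the index, for instance with git rm --cached.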
With that caveat (these version-controlled databases are going to be a problem and there is very little you can do about it), let's go on to items 1 and 2 above. You have been led astray by learning the git pull command. It's not inherently bad, but git pull is composed of two more-basic commands:
- git fetch, which you do need to use; followed by
- a second Git command (usually git merge, the default), which you must not use here.
Knowing that git pull = git fetch + second Git command, and what each of these two commands does, would have gotten you a lot closer to your answer. All you need to do is:
- Run git fetch to obtain all the commits from the remote named origin (i.e., run git fetch or git fetch origin).
- Set things up so that you are "on" the desired branch, with the desired branch-tip as the current commit.
- Add and commit your files. You will get a full snapshot of every file you added (and only those files), in the same way that every commit holds a full snapshot of every file. The parent of the new commit you just made will be the commit you were "on" when you ran git commit, so the difference between the parent of the new commit and the new commit itself will be the differences between the files in those two commits.
(In other words, that's where "changes" come from. Git does not store changes. Git stores snapshots. But the snapshot-diff duality [1] [2] means we can work with changes whenever we like.)
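To see the snapshot-versus-changes point in miniature, here is a small sketch in a throwaway repository. The file names and the demo identity are made up for the illustration.

```shell
#!/bin/sh
set -eu
# Throwaway identity so that git commit works without any user config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

printf 'one\n' > a.txt
printf 'unchanged\n' > b.txt
git add a.txt b.txt
git commit -q -m 'first snapshot'

printf 'one\ntwo\n' > a.txt
git add a.txt
git commit -q -m 'second snapshot'

# The second commit still lists every file, including the untouched b.txt.
git ls-tree -r --name-only HEAD

# The "changes" are computed on demand by comparing parent and child.
git diff --name-only HEAD~1 HEAD
```

The second commit holds both files, yet the diff against its parent names only a.txt: the change is derived from two snapshots, not stored.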
Step 2 is the hard part. The "normal" way to "get on a branch" is to use git switch (or, for Git versions predating 2.23, git checkout), but this asks Git to overwrite your working tree files. As these files are not (yet) in any commit(s), you definitely do not want to do this. You:
- are in a repository that has no commits until you run git fetch;
- are on an unborn or orphan branch;
- have no branches (see above bullet point);
- have an empty index unless you have already run git add;
- have a non-empty working tree with various dot-files.
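Each item in that list can be checked directly. The following is a sketch in a scratch directory; the .profile file is a stand-in I made up for your real dot-files.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
cd "$work"
printf 'set -o vi\n' > .profile   # hypothetical dot-file, for illustration
git init -q

# No commits: HEAD cannot be resolved to a hash ID.
if git rev-parse -q --verify HEAD >/dev/null; then
    echo "HEAD resolves to a commit"
else
    echo "no commits yet"
fi

# HEAD still names a branch, but that branch is unborn.
git symbolic-ref HEAD

# No branches exist, so this prints nothing.
git branch

# Empty index: nothing is tracked, so this prints nothing either.
git ls-files

# Non-empty working tree: the dot-file shows up as untracked.
git status --short
```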
I've reproduced this condition:
$ mkdir tt && cd tt && git init
Initialized empty Git repository in ...
$ git remote add origin ../t
(The ../t bare repository here is a repository full of dinky test files and other random stuff from old Stack Overflow answers.)
$ git status
On branch master
No commits yet
nothing to commit (create/copy files and use "git add" to track)
$ git branch
$
So, no branches, no commits; let's run git fetch and populate with commits:
$ git fetch
remote: Enumerating objects: 125, done.
...
* [new branch] branch -> origin/branch
* [new branch] fix-signal -> origin/fix-signal
* [new branch] foobranch -> origin/foobranch
* [new branch] master -> origin/master
* [new tag] tag-foo -> tag-foo
$ git status
On branch master
No commits yet
nothing to commit (create/copy files and use "git add" to track)
Note that while I'm still on my own master branch, my master branch does not exist. I can change the name of this unborn branch with git checkout -b or git switch -c or, if my Git is new enough, git branch -m (move, i.e., rename, branch). It's up to you what branch name you want to use here. For illustration I'll switch mine to main. Then I will create it based on the upstream master, which I now have in my tt repository as origin/master:
$ git switch -c main
Switched to a new branch 'main'
$ git branch main origin/master
Branch 'main' set up to track remote branch 'master' from 'origin'.
$ git status
On branch main
Your branch is up to date with 'origin/master'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
deleted: .gitignore
deleted: a.py
deleted: ast_ex.py
deleted: bar
deleted: clob.c
deleted: closure.py
...
$ git rev-parse HEAD
11ae6ca18f6325c858f1e3ea2b7e6a045666336d
Note how it now looks as though I've deleted every file. This is because the index (aka staging area) remains empty. I have to git add files to populate it, or run commands such as git restore -S to copy files from the current or HEAD commit into Git's index, or both.
The current commit is now 11ae6ca18f6325c858f1e3ea2b7e6a045666336d. That's the commit I specified when I ran git branch main to create the name main. I did that by writing origin/master, but note:
$ git rev-parse origin/master
11ae6ca18f6325c858f1e3ea2b7e6a045666336d
There's that same hash ID: origin/master means commit 11ae6ca18f6325c858f1e3ea2b7e6a045666336d. Branch names like main and remote-tracking names like origin/master are just ways we have Git remember hash IDs for us.
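Here is a self-contained sketch of the "names are just remembered hash IDs" idea. The two scratch repositories and the branch name topic are hypothetical stand-ins, not the ../t and tt repositories from my transcript.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
top=$(mktemp -d)

# Stand-in for the existing remote: one commit on master.
git init -q "$top/remote"
cd "$top/remote"
git symbolic-ref HEAD refs/heads/master
printf 'remote\n' > file.txt
git add file.txt
git commit -q -m 'remote commit'

# A fresh local repository that fetches from it.
git init -q "$top/local"
cd "$top/local"
git remote add origin "$top/remote"
git fetch -q origin

# A remote-tracking name is a ref that holds one hash ID.
git for-each-ref --format='%(refname) %(objectname)' refs/remotes

# Creating a branch name copies that hash ID into a new ref.
git branch --no-track topic origin/master
git rev-parse topic origin/master
```

Both lines printed by the last command are the same hash ID, because both names point at the same commit.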
If I want, I can now run git reset, which does a --mixed reset by default. A mixed reset first moves the current branch name to the commit I specify. I'll use the default, which is HEAD: the current commit, as found through the name main, which now holds the same commit hash ID as origin/master. That is the commit I'd like to "append to" after all. (It's the commit I chose with git branch main origin/master!) Then, having "moved" the current branch main from 11ae6ca18f6325c858f1e3ea2b7e6a045666336d to 11ae6ca18f6325c858f1e3ea2b7e6a045666336d (i.e., not having moved it at all), git reset --mixed will read into Git's index all of the files from the commit I moved to. So now no files will be staged for deletion. Instead, the index / staging-area now matches the current commit, and git status reports on the difference between the index and my working tree (mine is empty, yours won't be):
$ git reset
Unstaged changes after reset:
D .gitignore
D a.py
...
$ git status
On branch main
Your branch is up to date with 'origin/master'.
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
deleted: .gitignore
deleted: a.py
...
(The git status list is the same as the git reset list here, and again consists of every file in Git's index, which is now every file in the current or HEAD commit.)
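My working tree was empty; yours is not. The sketch below tries the same sequence with a file present locally, to show that the mixed reset fills the index but leaves the working tree file alone. The repositories and the .profile contents are invented for the test, and I use git checkout -b rather than git switch -c only so that it also runs on older Git versions.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
top=$(mktemp -d)

# Stand-in remote holding an older copy of a dot-file.
git init -q "$top/remote"
cd "$top/remote"
git symbolic-ref HEAD refs/heads/master
printf 'remote version\n' > .profile
git add .profile
git commit -q -m 'remote commit'

# Local directory with its own, never-committed copy of the same file.
mkdir "$top/local"
cd "$top/local"
printf 'local version\n' > .profile
git init -q
git remote add origin "$top/remote"
git fetch -q origin

git checkout -q -b main          # rename the unborn branch
git branch main origin/master    # create it at the fetched commit

echo "before reset:"
git status --short               # staged deletion plus an untracked file

git reset -q                     # --mixed: index := HEAD, files untouched

echo "after reset:"
git status --short               # the local copy now shows as modified
cat .profile
```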
If I don't want to fill Git's index like this, I can run git rm -r --cached . or (a bit of a special case hack) git read-tree --empty now. But in general this is what you want to do:
- You want to populate the Git objects database with commits, using git fetch.
- You want to set up the correct orphan/unborn branch name if necessary.
- You want to create the branch name at the correct commit (as found by origin/whatever), so that your next commit will use this commit as its parent.
- Then you want to build a new commit as usual.
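Put together, the four bullet points might look like the following sketch. Everything here is a scratch stand-in (paths, file names, commit messages), so treat it as an outline to adapt rather than a recipe to paste into your home directory.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
top=$(mktemp -d)

# Stand-in remote with existing history on master.
git init -q "$top/remote"
cd "$top/remote"
git symbolic-ref HEAD refs/heads/master
printf 'old settings\n' > .profile
git add .profile
git commit -q -m 'remote history'

# Local directory: the files exist, but nothing is committed.
mkdir "$top/local"
cd "$top/local"
printf 'new settings\n' > .profile
printf 'extra\n' > .extra
git init -q
git remote add origin "$top/remote"

git fetch -q origin              # 1. get the commits
git checkout -q -b main          # 2. name the unborn branch
git branch main origin/master    # 3. create it at the remote's tip
git reset -q                     #    fill the index from that commit
git add .                        # 4. stage the working tree as it is now
git commit -q -m 'local state'   #    and commit on top of the remote history

# The new commit's parent is the fetched commit, so a later push
# would be a fast-forward and would not need --force.
git log --format='%s'
git merge-base --is-ancestor origin/master HEAD && echo "fast-forward possible"
```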
You can, if you like, set up your new branch as a different branch (not main or master or whatever), and you can set it up without an upstream using git branch --no-track newbr origin/master for instance, or you can remove the upstream later with git branch --unset-upstream. You can git restore -S (but not -W) and git reset --mixed (but not --hard) if you like. These are all just fiddling around the edges: the fundamentals you want are those in the bullet points above this paragraph.
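For the upstream fiddling, a quick sketch in scratch repositories: newbr is created without an upstream, while a second branch (the name tracked is my invention) gets one explicitly via --track and then loses it again.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
top=$(mktemp -d)

# Stand-in remote with one commit on master.
git init -q "$top/remote"
cd "$top/remote"
git symbolic-ref HEAD refs/heads/master
printf 'remote\n' > file.txt
git add file.txt
git commit -q -m 'remote commit'

git init -q "$top/local"
cd "$top/local"
git remote add origin "$top/remote"
git fetch -q origin

# Same starting commit, with and without an upstream.
git branch --no-track newbr origin/master
git branch --track tracked origin/master

# Show each branch and its upstream (empty when there is none).
git for-each-ref --format='%(refname:short) -> %(upstream:short)' refs/heads

# Remove the upstream after the fact.
git branch --unset-upstream tracked
git for-each-ref --format='%(refname:short) -> %(upstream:short)' refs/heads
```

Neither option changes which commit the branch name points to; the upstream setting only affects things like the "up to date with" line in git status.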
On a completely different note: dotfiles repositories
I like the idea of storing (some / many / most of) my "dot-files" in a repository. What I don't like is having a .git repository in my home directory, where those dot-files live. So what I did was write an overly fancy program: I put my committed dot-files into a repository and then have the program install them into place, mostly with symlinks wherever that works. This lets me pick and choose which dot files actually get saved and hence work around problems like the Firefox binary databases.
Mine is messy and highly imperfect and I have not fussed with it for a few years at this point. It's probably not a great starting point for anyone else. But I think the general idea is sound enough: don't store the dot-files in a Git repository, store prototype dot-files that get copied or symlinked or whatever. Maintain the prototypes, not the actual files, so that you can accommodate quirks as needed. In other words, add a level of indirection.
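To make the indirection idea concrete, here is a deliberately tiny sketch. It is not my program: the directory layout, the file names, and the rule "prototype foo installs as .foo" are all assumptions chosen for the example, and it runs against a scratch directory instead of a real home directory.

```shell
#!/bin/sh
set -eu
top=$(mktemp -d)
protos="$top/dotfiles"   # hypothetical prototype repository (no leading dots)
home="$top/home"         # stand-in for $HOME, so nothing real is touched
mkdir -p "$protos" "$home"

# Made-up prototype files.
printf 'set -o vi\n' > "$protos/profile"
printf 'set nocompatible\n' > "$protos/vimrc"

# Install each prototype as a symlink named .<prototype> in the fake home.
for src in "$protos"/*; do
    name=$(basename "$src")
    dest="$home/.$name"
    # Never clobber a real file that is already in place.
    if [ -e "$dest" ] && [ ! -L "$dest" ]; then
        echo "skipping $dest: a real file is already there" >&2
        continue
    fi
    ln -sf "$src" "$dest"
done

ls -la "$home"
```

Because only the prototypes directory would be under version control, things like the Firefox databases never need to be mentioned at all: they simply have no prototype.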