
This is a question that has caught my attention.

Say I have a folder base_folder, and in this folder I have some files. I run git init on it, add a .gitignore file, and commit. No problem.

Later, for no particular reason, I make a directory inside it: project_folder

In this folder I put several files that are of a different nature. I build my code, everything is going great.

Then I realize that I would like to manage the project in project_folder with Git separately, perhaps even put it on GitHub.

But this folder is already being managed by the repo in base_folder.

How do I manage project_folder as its own Git repo?

What I tried

I put a .gitignore file inside project_folder, but it is being completely ignored by Git. Why is that, and how does .gitignore work in subdirectories?

KansaiRobot

1 Answer


You need to un-do some pictures you have in your head. They're leading you in the wrong direction.

Git doesn't manage folders or files. Git manages (and stores, and uses, and transfers, etc) commits. Git is thus all about commits.

Each commit does have files, but Git doesn't really treat them as "folders": files just have names, which may well contain embedded (forward) slashes. This is the case even on Windows, where the OS uses backslashes. That is, "foo/bar" is not a file named "bar" in a folder named "foo", in Git; it's just a file named "foo/bar".

Your OS, of course, requires that these committed files be extracted to a folder named "foo". Git will therefore first create foo if necessary, so as to create foo\bar or whatever your OS insists on calling this thing. But to Git, that's just a file named foo/bar. That is, the working tree files are organized into folders. But the stored-in-Git, compressed, de-duplicated, read-only, Git-ized files in the commits are—via Git's index / staging-area—just names with slashes in them.
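
To see the "names with slashes" idea for yourself, here is a minimal sketch. It assumes a throwaway repository created with mktemp, and it reuses the foo/bar name from above; empty_dir is a made-up name for illustration.

```shell
# Throwaway repository (the path comes from mktemp, nothing here touches real work).
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir foo
echo hello > foo/bar
git add foo/bar

# The index holds one entry whose name contains a slash.
# There is no separate entry for a "folder" named foo.
index_names=$(git ls-files)
echo "$index_names"

# An empty directory contributes nothing: there is no file name to record.
mkdir empty_dir
git add .
index_names_after=$(git ls-files)
echo "$index_names_after"
```

Both listings show only foo/bar, which is why Git cannot store an empty folder on its own.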

What this means is that if you have commits that contain files named project_folder/file1, project_folder/file2, and so on, your Git will create project_folder and inside that, create file1 and file2 and so on. If you have commits that don't have any files whose name starts with project_folder/, Git won't create a folder of that name (but won't necessarily delete one, if you create one yourself).

When you check out some commit (with git checkout or the newfangled git switch), Git will remove all the working tree files it created earlier from the current commit (that you're switching away from), and extract all the files in the target commit (that you're switching to). Once done, the target commit is the current commit, so that these files will be removed and replaced with other files if you switch yet again.
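
The remove-and-extract step can be watched in a scratch repository. This is only a sketch: the branch names without_project and with_project and the demo identity are invented for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity for the demo commits
git config user.email "demo@example.com"

echo base > file1
git add file1
git commit -q -m "base files only"
git branch without_project                # this commit has no project_folder/ names

mkdir project_folder
echo code > project_folder/file1
git add project_folder/file1
git commit -q -m "add project_folder/file1"
git branch with_project                   # this commit has project_folder/file1

# Switching to the commit without those names removes the file, and the
# now-empty folder goes with it.
git checkout -q without_project
if [ -e project_folder ]; then after_switch_away=present; else after_switch_away=absent; fi

# Switching back extracts the file again, recreating the folder as needed.
git checkout -q with_project
if [ -e project_folder/file1 ]; then after_switch_back=present; else after_switch_back=absent; fi

echo "$after_switch_away $after_switch_back"   # prints "absent present"
```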

What .gitignore is about has to do with creating new commits—but it's rather indirect. When Git creates a new commit, it packages up all the files that are in Git's index / staging area right at that time. These are the same files Git extracted, in the extraction step. Git extracts them to both Git's index and your working tree, so now those files exist in both places. There are, in effect, three copies of each file, at all times:

  • There's one in the HEAD (current) commit, that Git initially extracted.
  • There's one in Git's index right now. If you haven't changed this copy, it's the one Git extracted initially.
  • Last, there's one in your working tree. If you haven't changed this copy, it's also the one Git extracted—but it's not in Git's special, read-only, compressed and de-duplicated form.
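
You can make the three copies disagree on purpose and then look at each one. A small sketch, assuming a scratch repository and a placeholder identity; git show HEAD:file1 reads the committed copy and git show :file1 reads the index copy.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity for the demo commit
git config user.email "demo@example.com"

echo "version 1" > file1
git add file1
git commit -q -m "first"                  # HEAD copy: version 1

echo "version 2" > file1
git add file1                             # index copy: version 2

echo "version 3" > file1                  # working tree copy: version 3

head_copy=$(git show HEAD:file1)
index_copy=$(git show :file1)
work_copy=$(cat file1)
printf '%s\n' "HEAD: $head_copy" "index: $index_copy" "working tree: $work_copy"
```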

When you modify some working tree files, you presumably would like the updated copies to go into your next commit. So you run:

git add file1 file3

for instance. This tells Git: make the index copy match the working-tree copy. You changed the working tree copy of file1, so Git compresses, Git-ifies, and de-duplicates the new file1 contents. If they're a duplicate, Git finds what they're a duplicate of (in any earlier commit, anywhere) and they're now de-duplicated and ready to go. If they're all-new contents, Git prepares the all-new, compressed, ready-to-go contents.

Git now repeats this for file3, since your git add listed both files. If there's no index copy of file3 yet, because it's all-new, Git compresses the contents, checks to see if this duplicates any previous file ever committed, and prepares the index copy either way, just as it did for file1. Now there's an index copy of file3, ready to go in the new commit.
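
The de-duplication is easy to observe: two files with identical contents end up pointing at the same stored object. This sketch assumes a scratch repository; the file contents are made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

printf 'same contents\n' > file1
printf 'same contents\n' > file3
git add file1 file3

# git ls-files -s prints: mode, object hash, stage number, name.
hash1=$(git ls-files -s file1 | awk '{print $2}')
hash3=$(git ls-files -s file3 | awk '{print $2}')

# git hash-object computes the hash Git would use for the file's contents.
computed=$(git hash-object file1)

echo "$hash1"
echo "$hash3"
```

The two printed hashes are identical, and they match what git hash-object computes from the working tree file, because the hash depends only on the contents, not on the name.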

When you run git commit, Git simply packages up everything that is in the index right now. So Git's index, which we also call the staging area, is your proposed next commit. What you do with git add is update your proposed next commit, using the regular (non-Git) files you are manipulating in your working tree.

If you accidentally copy, into Git's index / staging area, some file such as config/local.config that you didn't mean to propose including in a future commit, you will need to remove this Git-ified file from Git's index / staging area. You can do this with git rm --cached config/local.config, for instance.
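
Here is that un-staging step as a sketch, in a scratch repository. The file main.txt and the file contents are invented; config/local.config is the example name from above.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir config
echo "secret=1" > config/local.config
echo code > main.txt

git add .                                  # oops: this staged config/local.config too

# Remove only the index copy; the working tree file is left alone.
git rm -q --cached config/local.config

staged=$(git ls-files)
if [ -f config/local.config ]; then still_on_disk=yes; else still_on_disk=no; fi

echo "$staged"                             # prints "main.txt"
echo "$still_on_disk"                      # prints "yes"
```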

The .gitignore file is about keeping certain working tree files out of the staging area. It only affects files that aren't in the staging area right now. It tells Git: When I use some git add command that would otherwise add this file as a new file, don't add it. This lets you use git add . or git add * without having to worry about these particular files.
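
A quick sketch of that behavior, assuming a scratch repository and a made-up *.log rule:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "*.log" > .gitignore
echo code > main.txt
echo noise > debug.log

# An en-masse add skips the untracked file that matches the ignore rule.
git add .
staged=$(git ls-files)
echo "$staged"

# git check-ignore exits with status 0 when the path matches an ignore rule.
if git check-ignore -q debug.log; then ignored=yes; else ignored=no; fi
echo "$ignored"
```

The listing contains .gitignore and main.txt but not debug.log.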

In the end, then, these are your building blocks:

  • Git stores commits.
  • Each commit gets a unique (across all Git clones everywhere) hash ID.
  • You make new commits by adjusting what's in Git's index, aka the staging area. git status tells you about the staging area indirectly, by running two comparisons: first the current commit against the staging area, then the staging area against your working tree. It doesn't bother mentioning the things that match: it only tells you what's different.
  • You adjust what's in Git's index using git add, or—to remove something entirely—git rm.
  • You work on (regular, ordinary, non-Git) files, using non-Git editors and all other commands your computer has, in your working tree.
  • Some files that are in your working tree aren't necessarily in Git at all, or at least, aren't in the current commit, and hence maybe are not in Git's index yet either.
  • A file that is in your working tree, but is not in Git's index, is an untracked file. It doesn't matter how the file got this way, just that it is this way.
  • The .gitignore file just tells git add that particular untracked files should stay untracked, and that when git status is about to tell you about some untracked file, it should shut its yap if that file is listed in .gitignore. This lets you keep these files untracked, and not see complaints about them. It won't help you with a tracked file: you need to use git rm --cached to make that file become untracked before .gitignore will take hold. (Note that git rm without --cached removes both the index copy and the working tree copy.)
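
The last point is probably what makes a .gitignore placed inside project_folder look like it is being ignored: the files it lists are already tracked. A sketch under assumptions: the file name build.log, the rule, and the demo identity are all invented, and the .gitignore sits in the subdirectory, where its patterns apply relative to that directory.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity for the demo commits
git config user.email "demo@example.com"

mkdir project_folder
echo v1 > project_folder/build.log
git add project_folder/build.log
git commit -q -m "track build.log"

# Add an ignore rule after the fact, then change the file.
echo "build.log" > project_folder/.gitignore
echo v2 > project_folder/build.log

# Still reported as modified: the rule has no effect on a tracked file.
status_tracked=$(git status --porcelain -- project_folder/build.log)
echo "while tracked: [$status_tracked]"

# Drop the index copy (the working tree file stays) and commit that.
git rm -q --cached project_folder/build.log
git commit -q -m "stop tracking build.log"
echo v3 > project_folder/build.log

# Now the file is untracked and ignored, so git status says nothing about it.
status_untracked=$(git status --porcelain -- project_folder/build.log)
echo "after rm --cached: [$status_untracked]"
```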

All of this is with respect to one Git repository. A Git submodule is the way you get one Git repository to refer to another Git repository. Git refuses to let one Git repository contain another Git repository, but you're allowed to have entries (in Git's index and in commits) that say use commit _____ (fill in the blank with a hash ID) from some other Git repository. For this to work, the other Git repository must exist and must contain that commit.
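
What such an entry looks like can be sketched with two scratch repositories, one nested inside the other. Treat this as an illustration of the stored entry only, not as a recipe: a plain git add of a nested repository records the "use commit _____" entry but writes no .gitmodules file, whereas a real submodule is normally set up with git submodule add and the URL of the other repository. The identity and file names are placeholders.

```shell
outer=$(mktemp -d)
cd "$outer"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

# A second, independent repository living inside the first one's working tree.
mkdir project_folder
cd project_folder
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo code > file1
git add file1
git commit -q -m "inner commit"
inner_commit=$(git rev-parse HEAD)
cd ..

# The outer repository does not store the inner files. It records one entry
# with the special mode 160000 and the inner repository's commit hash.
git add project_folder 2>/dev/null        # Git prints an "embedded repository" hint here
entry=$(git ls-files -s project_folder)
mode=${entry%% *}
recorded=$(echo "$entry" | awk '{print $2}')

echo "$entry"
```

The recorded hash equals the inner repository's current commit, which is the "fill in the blank" part.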

torek