I have a folder on the Desktop of my Mac. I want to upload it to a particular branch of a repository on my friend's GitHub account. Listed below is what I do not know how to do, even after searching the internet.

  1. How do I access my friend's account and the branch name?
  2. How do I push my folder to that branch, rather than to master (or is it origin)?
  3. I am using Terminal to stage the folder that I want to upload to the branch on GitHub.

1 Answer

Git is not a file transfer program. Git is a version control system.1 You may or may not be able to accomplish your actual goal, but you'll have to change how you think about this goal.

To use Git, you'll need a version built for your computer. You mentioned "mac" so we might assume you already have one. You can run git --version in a Terminal window to see which version of Git is installed.

Git operates on a repository, one repository at a time. A repository is primarily a collection of commits. Commits contain files, but there are no folders involved. The files simply have long names that include slashes, such as README.md, lib/xyz.py, and so on. Your computer will demand that the file named lib/xyz.py be stored as a folder named lib that has a sub-file named xyz.py within it, and Git will do that, but as far as Git is concerned, what you have is a file named lib/xyz.py. The upshot of this particular quirk is that you cannot store an empty folder: Git won't store a file named, say, joey/, so if you want Git to create the OS-level folder named joey, you must make a commit that has inside it a file named joey/anything. A common trick is to create the file joey/.gitignore, which can be empty: if there is a commit that has this file in it, and someone does a git checkout of this commit, Git will create joey/ if needed so that it can create .gitignore inside joey/ so as to satisfy the requirement of creating a file named joey/.gitignore.
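The empty-folder quirk is easy to try out. Here is a small sketch, assuming a scratch repository in a temporary directory; the repository name demo, the copy directory, and the throwaway identity are made up for illustration.

```shell
# Throwaway identity for this shell session only; no global settings are changed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)                        # scratch area
git init -q "$work/demo"
cd "$work/demo"

mkdir joey                               # an empty folder
empty_status=$(git status --porcelain)   # prints nothing: there is no file to track

: > joey/.gitignore                      # an empty placeholder file
git add joey/.gitignore
git commit -q -m "Keep the joey folder"
tracked=$(git ls-files)                  # the file's full name, slash included
echo "$tracked"

# Extracting the commit somewhere else recreates the folder, because the file needs it.
git clone -q "$work/demo" "$work/copy"
ls -a "$work/copy/joey"
```

The only thing Git recorded is the file named joey/.gitignore. The folder in the copy exists purely as a side effect of extracting that file.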

Files inside a commit inside a Git repository are extracted relative to wherever the repository itself resides. You mentioned that the existing folder is in your ~/Desktop. For concreteness let's say these files are named j/1 and j/2 (i.e., ~/Desktop/j/1, ~/Desktop/j/2). You could—note that this is inadvisable in general—create a Git repository in your own home directory, such that you can just git add Desktop/j. But now suppose your friend copies this repository to his file system underneath, say, ~/src/github.com/joey. Extracting the commit will, on his system, create ~/src/github.com/joey/Desktop/j/1 and ~/src/github.com/joey/Desktop/j/2. In other words, you control only a relative name, not an absolute path.

Every Git repository is independent of every other Git repository. If and when you get access to your friend's repository on GitHub, you can use that access to transfer commits (there's that word again) from that Git repository to your Git repository, or from your Git repository to that Git repository. Meanwhile, your friend will have a third Git repository on his computer. He will use his Git to call up the Git on GitHub, and obtain commits from that Git, or send commits to that Git. So the Git repository on GitHub is really a rendezvous, in this case.

You may or may not have permission to access your friend's Git repository on GitHub. These permissions are controlled by whoever created the repository (presumably, him). However, GitHub is built for sharing. The way GitHub encourages sharing is to use what GitHub calls a fork. A fork is very much like a clone, with some additional GitHub-provided glue. (There are other Git providers, such as Bitbucket and GitLab, all of which have adopted this same idea.) If you fork your friend's GitHub repository, this gives you a GitHub repository of your own—a fourth copy—that you can use for this rendezvous operation. It's likely that you will have to do this.2

Once you have access to GitHub (for which there are many tutorials, including ones on GitHub help pages), you will need to clone your friend's repository, or your fork of your friend's repository, so as to create a local repository on your laptop. This is where you really start getting into using Git. It simplifies the picture a lot if there are just two repositories here: yours on GitHub, and the clone you are about to make. So we will make this simplifying assumption: that you have in fact made a fork of your friend's repository, so that you are accessing your repository on GitHub.


1Some would argue that Git is merely a file system pretending to be a version control system, and there's a lot of merit to that argument, but let's stick with "distributed version control system" or DVCS, as the tag info suggests.

2Having done this, you may be able to dispense with the idea of using Git on your own laptop at all: GitHub offer several ways to manipulate commits in your fork without having to clone your own fork to your own computer. If you plan to use Git a lot, though, it's far better to copy the entire repository onto your laptop, so that you can manipulate commits there easily.


A Git repository is mostly just a collection of commits

Early in the text above, we mentioned that a repository is primarily a collection of commits. To use Git, you need to understand what a commit is, what it does for you, and how it does that.

Every commit, in Git, holds a complete snapshot of some set of files. These are the files that are contained in that commit. These files are read-only, frozen in time forever. You cannot change them; even Git itself cannot change them. They are just copies of files: they have a name, such as README.md or lib/xyz.py, and some content (the contents of the README.md and the lib/xyz.py), and one bit of mode information that tells Git: this file should be executable or this file should be not-executable. For plain text files, rather than commands you can run, the file should be not-executable.
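To see what a snapshot records, we can list the files in a commit. This is a sketch in a scratch repository, with made-up file contents and a throwaway identity; git ls-tree -r prints one line per file, giving its mode, its internal object hash, and its full name.

```shell
# Throwaway identity for this shell session only; no global settings are changed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"

mkdir lib
echo "# demo" > README.md
echo "x = 1" > lib/xyz.py
printf '#!/bin/sh\necho hi\n' > run.sh
chmod +x run.sh                      # a command you can run, so it is executable

git add README.md lib/xyz.py run.sh
git commit -q -m "first snapshot"

# Mode 100644 means not-executable, 100755 means executable.
git ls-tree -r HEAD
```

Note that the listing shows lib/xyz.py as one name with a slash in it. There is no separate entry for the lib folder when listing files this way.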

A Git commit contains this snapshot as its data, but also contains some metadata. The metadata includes two human names, their email addresses, two date-and-time stamps, and some more stuff we'll get into in a moment. The two names are the author of the commit and the committer of the commit, and usually these are the same person. The email addresses are those of the author and committer, and the two date-and-time stamps are for the author and committer. You need to tell Git about yourself:

git config --global user.name '...'
git config --global user.email '...'

These are the strings that your Git will put into new commits, to hold the author and committer user names and email addresses. You can use just about anything you like here—Git does not care who or what you call yourself—but if you want people to know about you, you should put real stuff here.

Every Git commit gets its own unique ID: a hash ID. That hash ID is created at the time you create the commit. Git uses a new hash ID, one that has never been used for any other commit anywhere, and will never use it ever again for any other commit.3 Moreover, every Git in the universe will now agree, from this point on, that this hash ID is for this commit, and only this commit.

This is also why you can't change anything in any commit, ever: the hash ID is constructed from all of the commit's contents, including your name and email address, and the two date-and-time stamps, along with all of the files saved in the snapshot, and so on. This lets two Gits call each other up and merely exchange hash IDs: if their Git has hash 083378cc35c4dbcc607e4cdd24a5fca440163d17, and your Git has hash 083378cc35c4dbcc607e4cdd24a5fca440163d17, why, you both have that commit. If one of you doesn't, the other one can give it to whichever one doesn't.
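A quick experiment illustrates this. In the sketch below, make_commit is a hypothetical helper (not a Git command) that builds a one-commit repository; the identity and the fixed timestamp are assumptions chosen so that two separate repositories receive identical inputs.

```shell
# Throwaway identity and a fixed timestamp, so every input to the hash is pinned down.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
export GIT_AUTHOR_DATE="1577001600 +0000" GIT_COMMITTER_DATE="1577001600 +0000"

# Hypothetical helper: create a one-commit repository in directory $1 using
# log message $2, then print the new commit's hash ID.
make_commit() {
    git init -q "$1" &&
    ( cd "$1" &&
      echo "hello" > README.md &&
      git add README.md &&
      git commit -q -m "$2" &&
      git rev-parse HEAD )
}

work=$(mktemp -d)
first=$(make_commit "$work/one" "initial")
second=$(make_commit "$work/two" "initial")            # same inputs in a different repository
third=$(make_commit "$work/three" "initial, reworded") # only the log message differs

echo "$first"
echo "$second"
echo "$third"
```

The first two hash IDs should match, because every input was the same, while the third differs because one piece of the metadata changed. In normal use the time stamps alone make each new commit different.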

But hash IDs are big and ugly and look totally random. Humans can't remember hash IDs. So Git sets things up so that we don't even have to try to do that. We have a computer, after all. It can remember these big ugly hash IDs for us. And here, there's something pretty magic.

Every commit stores some set of hash IDs inside it. Usually there is exactly one of these. This is part of the metadata, along with your name and email address and so on. When you make a new commit, Git stores, in that new commit, the hash ID of the current commit. So the new commit you just made remembers the unique hash ID of the commit that comes before it.

This is also where branch names come in. A branch name, in Git, merely holds one hash ID. The hash ID stored inside a branch name is the hash ID of the last commit in the branch.

Whenever something holds the hash ID of a commit, in Git, we say that this something points to the commit. So the latest commit points to its earlier or parent commit, and that commit points to its parent—the grandparent of the latest commit—and so on. Meanwhile, the branch name points to the latest commit.
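We can look at these pointers directly. The following sketch makes two commits in a scratch repository (file names and identity are made up) and then prints the raw contents of the newest commit with git cat-file -p, which shows the parent line along with the author and committer.

```shell
# Throwaway identity for this shell session only; no global settings are changed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"

echo one > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)          # hash ID of the first commit

echo two > file.txt
git add file.txt
git commit -q -m "second"

git cat-file -p HEAD                 # tree, parent, author, committer, log message

# The parent stored in the second commit's metadata is the first commit.
stored_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')

# The branch name holds the hash ID of the last commit.
branch=$(git symbolic-ref --short HEAD)
branch_tip=$(git rev-parse "$branch")
echo "$branch -> $branch_tip"
```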


3There's a theoretical possibility that the new hash ID for a commit won't be unique. In practice, this is not a problem. See How does the newly found SHA-1 collision affect Git? for details.


Visualizing branches and the commit process

If we use single uppercase letters to stand in for the actual big ugly hash IDs, we can draw this kind of chain of commits like this:

... <-F <-G <-H   <-- master

Here, we have one branch name—master—that points to the last commit, H, in the repository. Commit H (whatever its real hash ID is) contains the raw hash ID of earlier commit G, so that H points to G. Commit G in turn contains the raw hash ID of its earlier commit F, which contains another raw hash ID, and so on.

This whole process only ends when we get to the very first commit we—or whoever—made, in this repository. That first commit doesn't point back to any earlier commit, because it can't: there wasn't any earlier commit.

If we get lazy and draw the internal backwards-pointing arrows as connecting lines, this lets us draw out the entire eight-commit repository like this:

A--B--C--D--E--F--G--H   <-- master

This lets us see how branch names work in Git. A branch name, as we already know, points to one commit—the last one on the branch. Let's make a new branch name that also points to commit H:

A--B--C--D--E--F--G--H   <-- master, feature

Now we need a way for us (and Git) to remember which branch name we're using: is it master, or is it feature? Let's attach the special name HEAD, in all uppercase,4 to one of these names, for this purpose:

...--F--G--H   <-- master (HEAD), feature

If we run git checkout feature, this moves the attached HEAD to feature:

...--F--G--H   <-- master, feature (HEAD)

Either way, we're still using commit H. But we're on a different branch. This matters because of what happens when we make a new commit.

To make a new commit, we need to understand how the index and work-tree work. This answer is already very big, so I will just say: look elsewhere for that. Assuming we know how to use them, though, we'll change some files and git add and then run git commit. Git will:

  • use the current commit H as the parent of the new commit;
  • use our name and email address, and the current date-and-time from the computer, for the author and committer;
  • collect from us a log message in which we remind everyone else, and our future selves, why we are making this new commit now; and
  • actually make the new commit, saving a frozen snapshot of all of our files forever, or at least, for as long as this new commit continues to exist.

The parent of our new snapshot is H, so let's draw the new commit and call it I to stand in for whatever unique hash ID it gets:

...--F--G--H   <-- master, feature (HEAD)
            \
             I

Now comes the really tricky bit: having just made this new commit, Git writes its hash ID into the branch name to which HEAD is attached. This changes the name feature so that feature now points to new commit I:

...--F--G--H   <-- master
            \
             I   <-- feature (HEAD)

This is how branches grow, one commit at a time, as we work in Git. Each new commit we make points back to the commit we were using, at the time, before we made the new one. The current branch name—the one HEAD is attached to—moves to point to the newest commit, which just got a new and unique hash ID.
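The drawings above can be reproduced in a scratch repository. This is only a sketch: the commit messages H and I stand in for the letters in the drawings, and the file name and identity are made up.

```shell
# Throwaway identity for this shell session only; no global settings are changed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git symbolic-ref HEAD refs/heads/master   # call the first branch master, whatever this Git's default is

echo "base" > file.txt
git add file.txt
git commit -q -m "H"
h=$(git rev-parse master)

git branch feature                   # a second name pointing to the same commit
git checkout -q feature              # attach HEAD to feature

echo "more" >> file.txt
git add file.txt
git commit -q -m "I"                 # the new commit's parent is H

# master still points to H; feature, the name HEAD is attached to, moved to I.
git log --oneline --graph --decorate --all
```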


4On the Mac, you can often get away with typing in head in lowercase. Try not to get into the habit of doing this! It doesn't work right once you start using Git's git worktree add feature, and it does not work at all on most Linux systems. If you don't like typing HEAD in all caps, consider using @, which Git treats as a synonym for HEAD.


Making a clone

Before we can get to this point, we have to start out with a copy of the repository that we forked on GitHub. To make this copy, we must:

  • tell our Git to make a new, empty repository;
  • tell our Git to connect to the Git over on GitHub; and
  • tell our Git that we'd like every commit from that Git repository copied into this new empty repository.

To do that, we can use git clone. The git clone command needs the URL by which our computer will call up some other computer over the Internet, and have that computer access a Git repository. In the case of some GitHub repository, the URL will be ssh://git@github.com/user/repo.git or https://github.com/user/repo.git. (We'll need to set up an SSH key, or provide a password, or otherwise let the GitHub machines know that we are in fact ourselves: it's up to GitHub to tell us how to authenticate. If we're using some other hosting provider, they will tell us how to authenticate. Again, there are plenty of places to find help about this, including GitHub's own help pages.)

Hence we might run:

git clone ssh://git@github.com/user/repo.git

(but with user and repo.git adjusted as appropriate).

What our Git will do with this command is:

  1. Create a new, empty directory in the current directory. The name of this new empty directory is derived from repo.git. If you want some other name, add it to the git clone command, e.g., git clone url new-directory. (You can use the name of an existing empty directory, although there was a bug in Git for a long time that made this slightly inadvisable.5)
  2. Initialize this new empty directory as a Git repository.
  3. Save the URL under a remote name origin. Each Git repository can store any number of remotes, which are just short names for another Git repository. It's just convention to use the name origin as the remote for the repository you cloned. You can choose another name, but I will assume you don't, here.
  4. Do any other special configuration we ask for. (We didn't call for anything special, but if we had, this is the step where Git would do it.)
  5. Run git fetch origin: this calls up their Git at that URL, and gets from it, most or all of its commits.6
  6. Run git checkout to create a new local branch name, usually master, based on commits obtained in step 5.
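To make the steps concrete, here they are done by hand, roughly the way git clone does them. This is a sketch, not exactly what git clone runs internally: a local bare repository at a made-up path stands in for the one on GitHub, and step 4 is skipped because we asked for nothing special.

```shell
# Throwaway identity for this shell session only; no global settings are changed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)

# Stand-in for the repository on GitHub: a bare repository holding one commit.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
( cd "$work/seed" && echo "hello" > README.md && git add README.md && git commit -q -m "initial" )
git clone -q --bare "$work/seed" "$work/repo.git"

cd "$work"
mkdir repo                               # step 1: a new, empty directory
git init -q repo                         # step 2: make it a Git repository
cd repo
git remote add origin "$work/repo.git"   # step 3: save the URL under the name origin
git fetch -q origin                      # step 5: get their commits
git checkout -q master                   # step 6: create our master from origin/master
```

With a real GitHub URL in place of the local path, the last five commands do approximately what a single git clone does.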

At this point you will need to use your local computer's command—whatever that is; on the Mac, cd or chdir—to move into the Git repository that was created in step 2, in the previously-empty directory created in step 1.

You now have a clone of some other, existing Git repository. This clone has its own branch names. The branch names that were in the other repository exist in this clone as remote-tracking names.7 That is, if they had both master and feature, you now have the names origin/master and origin/feature.

A remote-tracking name is just your Git's way of remembering their Git's branch names. Remember, a branch name holds the hash ID of one commit: the last commit on the branch, from which Git works backwards to find earlier commits. It's very useful, at times, to know which commit hash IDs their Git has stored in their branch names. But your master is yours, not theirs. So your Git stores the hash ID from their master under your own name origin/master.

These remote-tracking names get updated when you run git fetch. Other than git fetch and git push—which we'll get into in just a moment—your Git runs on your computer and never talks to the network. All the stuff you need is in your own Git repository, because your Git got every commit from their Git.8
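Here is a sketch of remote-tracking names in action. A local bare repository (theirs.git, a made-up name) plays the part of the repository on GitHub, and a second clone plays the friend who adds a commit there.

```shell
# Throwaway identity for this shell session only; no global settings are changed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q seed
( cd seed && git symbolic-ref HEAD refs/heads/master &&
  echo h > h.txt && git add h.txt && git commit -q -m "H" && git branch feature )
git clone -q --bare seed theirs.git   # stand-in for the repository on GitHub

git clone -q theirs.git mine          # our clone
git clone -q theirs.git friend        # a second clone, playing the friend

cd "$work/mine"
git branch                            # our own branch names: just master
git branch -r                         # their names, remembered as origin/master and origin/feature

# The friend adds commit I and pushes it, so their master now points to I.
( cd "$work/friend" && echo i > i.txt && git add i.txt &&
  git commit -q -m "I" && git push -q origin master )

git fetch -q origin                   # our origin/master catches up; our master stays put
git log --oneline --graph --decorate --all
```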


5If you interrupt the clone, or it fails, git clone will remove the partial, failed clone. The bug was that if you gave git clone an empty directory name, and the clone failed, Git would remove the empty directory. This is pretty harmless, but it's still surprising. It's fixed now, but some systems have really old versions of Git installed, and there is little reason to tempt fate, or Murphy's Law, here.

6Technically, git fetch can only get the commits that the other Git offers. But normally it offers all commits. For GitHub repositories, they also offer pull request commits, and your Git normally doesn't get those.

7Git calls these remote-tracking branch names, but I've decided that the word "branch" in this phrase is distracting and slightly harmful. It's up to you to decide for yourself if you agree. Be aware that people use the phrase "remote-tracking branch names" or even "remote branch", but you can't git checkout a remote-tracking name the way you can a real, local, branch name. What you get is instead what Git calls a detached HEAD.

8Despite the fact that every commit holds a full snapshot of every file, commits are actually very space-efficient. Git has several storage tricks here. The first and most obvious one is that since every commit is frozen for all time, any time any file in any commit has the same content as some other file in any other commit, the two commits can just share the compressed, frozen, Git-only version of that file. So if you have a million commits, but most of them keep re-using the same files every time, you don't have a million copies of the files. Some files might only exist in two or three versions across all million commits.


Transferring commits to another Git

We saw just a moment ago how git fetch has your Git call up some other Git and get from it, any commits they have, that your Git wants, that your Git doesn't already have. To do the same thing in the other direction—to have your Git call up their Git, but then give them commits—we use git push.

Besides the fact that git push sends commits to their Git, there is one other key difference here. Remember that Git finds commits by starting from some name and working backwards. Suppose your Git has this:

...--F--G--H   <-- master

and their Git has this:

...--F--G--H--I   <-- master

When you git fetch from their Git, your Git finds that their Git has one commit (commit I) that you don't have and therefore want. So your Git gets that commit. But your Git now changes their name, master, to your remote-tracking name origin/master, so that you end up with:

...--F--G--H   <-- master
            \
             I   <-- origin/master

This doesn't work with git push.

Let's say we start with the above, then add a new commit J to our own master:

             J   <-- master
            /
...--F--G--H
            \
             I   <-- origin/master

There's nothing wrong with this picture, really—it's normal enough in Git—but suppose we now have our Git call up their Git and send them commit J. What they will have at this point is:

             J   [the commit we sent them]
            /
...--F--G--H
            \
             I   <-- master

How will they find commit J, when there is no name pointing to it? If we had them set a name like joey/master, that would do the job. But our git push is going to ask them to set their name master to point to commit J:

             J   <-- master
            /
...--F--G--H
            \
             I   ???

If they obey this request, they will "lose" commit I. So they simply won't obey this request! They will say: No, I won't change my master the way you asked, because I'd lose track of some of my own commits.
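The refusal is easy to provoke on purpose. In this sketch, a local bare repository (theirs.git, a made-up name) stands in for the one on GitHub, and a second clone plays the other person who added commit I.

```shell
# Throwaway identity for this shell session only; no global settings are changed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q seed
( cd seed && git symbolic-ref HEAD refs/heads/master &&
  echo h > h.txt && git add h.txt && git commit -q -m "H" )
git clone -q --bare seed theirs.git
git clone -q theirs.git mine
git clone -q theirs.git friend

# Their master gains commit I.
( cd friend && echo i > i.txt && git add i.txt &&
  git commit -q -m "I" && git push -q origin master )

cd "$work/mine"
git fetch -q origin                    # origin/master now points to I
echo j > j.txt
git add j.txt
git commit -q -m "J"                   # our master now points to J

# Ask them to set their master to J; they refuse, since that would lose I.
if git push origin master; then
    push_result=accepted
else
    push_result=rejected
fi
echo "$push_result"
```

Git reports the refusal as a rejected, non-fast-forward push. Nothing is damaged on either side: their master still points to I and ours still points to J.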

What we need to do at this point is come up with a better arrangement of commits.

In our Git repository, we can now:

  • copy existing commit J to a new and improved J', or
  • merge existing commits I and J to a new merge commit K.

To do these, we would use either git rebase or git merge. I'm going to skip all the details of how they work, and just show the final result.

If we use git merge, we get:

             J
            / \
...--F--G--H   K   <-- master
            \ /
             I   <-- origin/master

Commit K is slightly special. Instead of pointing back to one parent, it points back to two: both J and I. It still has a snapshot of all of our files as usual, but it remembers two previous commits for us. So now we can have our Git call up their Git and send commit K to them, then politely request that they set their master to point to commit K, just like ours does.

This time, that won't lose access to commit I—from K, Git will go back to both J and I, and then from both of those to H—so they'll accept, and we will end up with:

             J
            / \
...--F--G--H   K   <-- master, origin/master
            \ /
             I

They accepted our polite request to move their master, so our Git moves our origin/master to remember that their master also selects commit K. Our branches are now in sync.
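As a sketch of the merge route, with the same made-up stand-ins as before (a local bare repository called theirs.git in place of GitHub, and a second clone that supplies commit I):

```shell
# Throwaway identity for this shell session only; no global settings are changed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q seed
( cd seed && git symbolic-ref HEAD refs/heads/master &&
  echo h > h.txt && git add h.txt && git commit -q -m "H" )
git clone -q --bare seed theirs.git
git clone -q theirs.git mine
git clone -q theirs.git friend
( cd friend && echo i > i.txt && git add i.txt &&
  git commit -q -m "I" && git push -q origin master )

cd "$work/mine"
git fetch -q origin
echo j > j.txt
git add j.txt
git commit -q -m "J"

git merge -q -m "K" origin/master      # merge commit K, with parents J and I
git push -q origin master              # accepted: commit I is reachable from K

git log --oneline --graph --decorate --all
```

The two commits touch different files here, so the merge needs no help from us. When both sides change the same lines, Git stops and asks us to resolve the conflict before it makes K.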

Or, if we rebase our J by copying it to a new-and-improved commit J', we get this:

             J   [abandoned]
            /
...--F--G--H   J'  <-- master
            \ /
             I   <-- origin/master

We totally give up on our old commit J and just have our master remember J' instead. If we use git log, we never even see commit J. It's as though it vanished entirely. Our Git secretly keeps our original J for at least 30 days by default, in case we decide that this rebase was a bad idea, but it's not visible to most ordinary examination.

Since it's not visible, when we run git push origin master, our Git offers their Git commit J' only, and asks their Git to set their master to point to J'. This time, they have no objection—they won't lose commit I—so they do it, and we get, in our repository, this picture, assuming we don't keep drawing the original J:

...--F--G--H--I--J'  <-- master, origin/master
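The rebase route looks like this in a sketch using the same made-up stand-ins (theirs.git in place of GitHub, and a second clone that supplies commit I):

```shell
# Throwaway identity for this shell session only; no global settings are changed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q seed
( cd seed && git symbolic-ref HEAD refs/heads/master &&
  echo h > h.txt && git add h.txt && git commit -q -m "H" )
git clone -q --bare seed theirs.git
git clone -q theirs.git mine
git clone -q theirs.git friend
( cd friend && echo i > i.txt && git add i.txt &&
  git commit -q -m "I" && git push -q origin master )

cd "$work/mine"
git fetch -q origin
echo j > j.txt
git add j.txt
git commit -q -m "J"
old_j=$(git rev-parse master)          # hash ID of the original J

git rebase -q origin/master            # copy J to J', whose parent is I
git push -q origin master              # accepted: J' comes after I

git log --oneline --graph --decorate --all
```

The copy keeps the log message J but gets a different hash ID than the original, because its parent changed.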

So, this covers (very lightly) when and how to use either git merge or git rebase to adjust things so that you can git push your commits to your own GitHub fork. We now need one last step, though.
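Putting the pieces together for the original goal (a folder from the Desktop, sent to a branch other than master), the whole sequence might look like the sketch below. Everything here is a stand-in: fork.git is a local bare repository playing your GitHub fork, feature is an assumed branch name, and the Desktop/j folder lives in a temporary directory rather than in your real home directory. With a real fork you would clone its GitHub URL instead.

```shell
# Throwaway identity for this shell session only; no global settings are changed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# Stand-in for the fork on GitHub, with branches master and feature.
git init -q seed
( cd seed && git symbolic-ref HEAD refs/heads/master &&
  echo hello > README.md && git add README.md && git commit -q -m "initial" &&
  git branch feature )
git clone -q --bare seed fork.git

# Stand-in for the folder on the Desktop.
mkdir -p Desktop/j
echo one > Desktop/j/1
echo two > Desktop/j/2

git clone -q fork.git repo
cd repo
git checkout -q feature                # our own feature, created from origin/feature
cp -R "$work/Desktop/j" .              # copy the folder into the work-tree
git add j                              # stage the files j/1 and j/2
git commit -q -m "Add folder j"
git push -q origin feature             # send the commit and ask them to update feature
```

The last command names both the remote (origin) and the branch (feature), which is how the commit ends up on that branch rather than on master.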

Pull requests

Hosting providers like GitHub, GitLab, and Bitbucket all offer this idea of a fork, which is a kind of clone that adds another feature that they all call pull requests. Each of them implements this slightly differently internally, in ways that may or may not be visible if you look hard enough, but they all use enough common stuff to make experience with one provider carry over.

In particular, once you have done a git push from your laptop to your GitHub repository, your GitHub repository now has commits in it that are not in the repository you forked. You can now make a pull request to your friend. Using the GitHub web interface, you click on various clicky buttons they provide. They compare the commits in your fork to the commits in the other Git that you told them to fork earlier.

They do this comparing with the same graph-style operations we drew in the git push section above, but, in a misguided attempt to simplify it all away, they tend not to show you the actual graph. Since the actual graph (the connections from commit to commit) is the key to when and whether this stuff works, I think this is a huge disservice. Still, it mostly works, except when it doesn't, and then you have to add their GitHub repository to your laptop clone as a second remote.8
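Adding a second remote is a single command followed by a fetch. In this sketch the remote name upstream is only a common convention, not something Git requires, and the two local bare repositories (friend.git and fork.git, both made-up names) stand in for the two repositories on GitHub.

```shell
# Throwaway identity for this shell session only; no global settings are changed.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q seed
( cd seed && git symbolic-ref HEAD refs/heads/master &&
  echo hello > README.md && git add README.md && git commit -q -m "initial" )
git clone -q --bare seed friend.git        # stand-in for the friend's GitHub repository
git clone -q --bare friend.git fork.git    # stand-in for your fork of it

git clone -q fork.git repo                 # your laptop clone; origin is your fork
cd repo
git remote add upstream "$work/friend.git" # second remote: the friend's repository
git fetch -q upstream                      # creates upstream/master

git remote -v
git branch -r                              # now lists origin/* and upstream/* names
```

After this, git log --graph --all shows both sets of remote-tracking names on the same graph, which is the picture the web interface tends to hide.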

Anyway, with some luck, your pull request will turn out to be a simple case. Your friend can then use his computer to sign in to his GitHub account, accept your pull request, and hence copy your commits—the ones you sent to your GitHub fork from your laptop—into his original GitHub repository. Now he can download those commits into his Git, using git fetch on his laptop. His origin/master or origin/whatever remote-tracking name will adjust to point to the new commit in his GitHub copy that has the same hash ID and is the same commit you sent to your GitHub copy from your laptop.


8You'd probably have to do this anyway, even if they did show you the actual graph, but at least you'd know right away—and they would be able to explain what's needed. But they don't show this, and therefore cannot explain it either.


Conclusion: redundancy

This one commit you made, which stores the files you wanted to transfer, has now gone through four Git repositories: yours on your laptop, your fork of his GitHub repository, his GitHub repository, and his Git repository on his laptop. All of you share this commit, which has as its parent(s) some other shared commit(s), which have more parents, and so on. But except on the GitHub computers, which are physically together and therefore can share the underlying copy of the actual commit,9 there are actually many copies of this commit.

This distributed nature, in which every Git clone has a copy of every commit—well, every commit it has—which in turn has a copy of every file in that snapshot—is how Git provides redundancy and hence reliability. Note that one lone Git repository on one computer is subject to data loss, if that one computer fails. But with a repository that has dozens of clones, some on well-backed-up hosting providers, there's probably always some way to get most of your data back.


9Or, maybe they're not physically shared or adjacent. How and when GitHub share the data across repositories is up to GitHub.

torek
  • Hello Torek, thank you for sending me the detail. I try to read the whole paragraph and challenge it. I think I did not understand the role of git. I will try to command from the top to the bottom sentence you sent. I really appreciate it!!! Thank you so much!!! –  Dec 22 '19 at 08:21