I'm taking Harvard's CS50-Web course and I'm trying to turn in my Capstone project. It is required that we push our project to the following exact branch:

web50/projects/2020/x/capstone

I have been pushing my code to this branch whenever I make changes to it. Recently, I realized I have been pushing my venv folder when I shouldn't be, so I added this to .gitignore, but the folder wasn't removed from the branch. So, I deleted the folder manually via GitHub's web interface. Now, whenever I try to push my project to this branch, it gets rejected because my local files are out of sync with the remote ones.

I thought I would just delete the branch and start anew, but it turns out web50/projects/2020/x/capstone is the default branch (I'm not sure how it got to be that way) and I am only able to delete branches that are not default. I see no way on GitHub to change the default branch. I can't access settings for the repo since it's not my repo.

I could just push all my files to a different branch, but the assignment has to be turned in at that exact branch (web50/projects/2020/x/capstone), otherwise it will not be graded.

I am new to git, if it's not obvious. Is there a way to change the default branch of a repo that isn't mine?

torek
ragmats
    Well, you can do some combination of `push` and `pull` until they get in sync. You might have to restore the files of the venv if it accidentally deletes them. – Tim Roberts Nov 03 '22 at 03:37

2 Answers

First, update all origin/ refs to latest:

git fetch --all

Backup your current branch:

git branch backup-branch

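Move your current branch to the latest commit on origin/web50/projects/2020/x/capstone and check out those files (git reset --hard discards uncommitted changes in tracked files, so commit or save anything you need first):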

git reset --hard origin/web50/projects/2020/x/capstone

From here you should be able to merge your backup branch with your web50/projects/2020/x/capstone branch and push new changes up.

Check out this thread: https://stackoverflow.com/a/8888015/2255327

lexabu
    Will this risk changing/deleting any of my local files? To be clear, the current files are my local ones that I want force pushed to the remote branch, but I cannot do this because I don't have the permissions. [git push --force origin master:web50/projects/2020/x/capstone] gives me: remote: error: Cannot force-push to this protected branch, ! [remote rejected] master -> web50/projects/2020/x/capstone (protected branch hook declined) – ragmats Nov 03 '22 at 06:37

TL;DR

Save any files you can't easily re-generate,[1] somewhere outside your existing working tree. Then git fetch from the GitHub repository and merge or rebase. This may delete some of those files. That's OK because you already saved them where Git can't wreck them, so you can put them back and continue using them.


[1] Except for a requirements.txt and the like, pretty much everything in venv should be trivial to re-generate.


Long

To answer the question in the title and here:

... it turns out web50/projects/2020/x/capstone is the default branch (I'm not sure how it got to be that way) and I am only able to delete branches that are not default. I see no way on GitHub to change the default branch. I can't access settings for the repo since it's not my repo.

The owner of the repository on GitHub—or more precisely, anyone with "admin access"—is the person who can set the default branch. See this GitHub documentation page for details.

Answering the question you probably should have asked is a bit more involved:

Recently, I realized I have been pushing my venv folder when I shouldn't be, so I added this to .gitignore, but the folder wasn't removed from the branch. So, I deleted the folder manually via GitHub's web interface. Now, whenever I try to push my project to this branch, it gets rejected because my local files are out of sync with the remote ones.

Harvard should probably have a mini-course on source control here as there are several things you might need to un-learn at this point. Some of what happens here in Git is necessarily difficult, because Git is in effect a sort of distributed database that chooses Availability and Partition tolerance over Consistency (see Wikipedia article on Brewer's Theorem). Some other things are difficult because Git is just mean, perhaps. And then we add GitHub to the mix, and things get really complicated.

Anyway, you're right that you have introduced here a "consistency" problem, making things out of sync. But it's not files that are inconsistent. Git doesn't work—at least at this level—on individual files, but rather on commits. You don't push a file: you push a commit. You don't push a folder—in fact, Git doesn't store folders in the first place; see below—but rather commits (or a single commit).

The commit is, in effect, the basic unit of storage in a Git repository. It's therefore crucial that you understand exactly what a Git commit is and does for you. Because this isn't what you asked (and the answer would get very long), I won't go into all the details here, but I will say that each commit stores a full snapshot of all files, in a special, Git-ized, compressed and de-duplicated form. Each commit is then numbered with a hash ID that is unique across all commits in every Git repository everywhere. Once you've made a commit, every part of it is frozen forever. That hash ID now means that commit in every repository, even those that don't have it. When two Git databases meet up, commits are handed from one to the other, and both sides have to agree on the numbers, so nothing in a commit can ever change.
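As a small illustration of the "de-duplicated" and "numbered by hash ID" points, here is a sketch that builds a throwaway repository in a temporary directory. The file names and the identity settings are made up for the demo. Two files with identical content end up referring to the same internal object:

```shell
# Sketch only: a scratch repository, not your real project.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo User"

printf 'same content\n' > a.txt
printf 'same content\n' > b.txt
git add a.txt b.txt
git commit -q -m "two identical files"

# The commit itself has a hash ID ...
commit_id=$(git rev-parse HEAD)
# ... and so does each stored file. Identical content gets the identical ID,
# so Git only needs to store that content once.
blob_a=$(git rev-parse HEAD:a.txt)
blob_b=$(git rev-parse HEAD:b.txt)

echo "commit: $commit_id"
echo "a.txt:  $blob_a"
echo "b.txt:  $blob_b"
```

The two file IDs printed at the end should be the same string, while the commit ID will differ from run to run because the commit metadata includes a timestamp.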

Git makes the commit-snapshots from what is in Git's index. This "index" thingy is so important, and/or so badly named, that it has two other names: Git also calls it the staging area, which refers to how you use it, and (rarely now) the cache. You mostly see that last name in things like git rm --cached, which removes a file from Git's index without also removing it from your working tree.

You can't see what's in a commit—at least, not directly or easily. And the stuff in the commit is frozen for all time, which means it's great as an archive, but quite useless for getting any actual new work done. The result of this is that to get any actual work done, Git must extract some commit into an area where you have ordinary, non-Git-ized files. Git calls this work area your working tree or work-tree.

It's important to realize that the files in your working tree are not in Git. They may well just now have come out of Git (out of a commit), but because they're ordinary files in an ordinary folder—and Git doesn't "do" folders, at least not in its index from which it makes commits—they're not Git files. They're just ordinary files.

The presence of the index / staging-area is why Git makes you use git add all the time. The command git add path/to/file has Git open and read the file named file in the folder named to in the folder named path in your working tree. That's an ordinary file. Git then compresses its content down to the internal format that Git will use in the next commit, and makes sure that this content is ready to go, in the form of a Git-ized file whose name is now path/to/file, complete with (forward) slashes (even if your OS uses backwards ones). Git stores (indirectly) the compressed, de-duplicated, Git-ized data in the index, ready to be committed.

When you run git commit, Git simply packages up whatever is in its index at that time. This becomes the snapshot for the new commit you're about to make. (The metadata, which we haven't discussed here, is now constructed on-the-fly.) So the index acts as an area where you can, well, "stage" some files for committing: hence the name "staging area".[2] Files you haven't "staged" anew are still there in the index, though, in the form they had when they last went in.
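To see that git commit packages up the index rather than the working tree, here is a minimal sketch in a scratch repository (the file name notes.txt and the identity settings are invented for the demo). The file is edited again after git add, and the commit still holds the staged version:

```shell
# Sketch only: a scratch repository, not your real project.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo User"

printf 'staged version\n' > notes.txt
git add notes.txt                          # a copy goes into the index

printf 'edited after add\n' > notes.txt    # working tree changes; index does not

git commit -q -m "snapshot of the index"

committed=$(git show HEAD:notes.txt)       # what the commit holds
on_disk=$(cat notes.txt)                   # what the working tree holds

echo "committed: $committed"
echo "on disk:   $on_disk"
```

The commit contains "staged version" even though the working-tree file says "edited after add": the second edit was never copied into the index with another git add.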

This is where things go wrong with just adding file names to .gitignore. A file that's in Git's index is in Git's index, regardless of whether its name, or some prefix like venv/, appears in .gitignore. This means git commit will include it. To remove the file from the next commit, you must literally remove the file from Git's index, and here your fear is justified:

Will this risk changing/deleting any of my local files?

Using git rm to remove a file tells Git: remove this file from both your index and my working tree. Having been removed from both places, it's gone now, and won't be in the next commit.

This is why git rm --cached exists: it tells Git to remove the file from its index but leave the working-tree copy alone. Now the file is gone from the index and won't be in the next commit. From this point on, listing the file (or a prefix such as its folder name) in .gitignore will prevent it from going back into Git's index with a plain git add.
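Here is a sketch of that sequence in a scratch repository. The contents of venv/ and app.py are placeholders invented for the demo; only the Git commands matter. It shows that adding venv/ to .gitignore alone leaves the files tracked, and that git rm --cached untracks them without touching the working-tree copies:

```shell
# Sketch only: a scratch repository, not your real project.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo User"

mkdir venv
printf 'placeholder\n' > venv/pyvenv.cfg   # stand-in for real venv contents
printf 'print("hi")\n' > app.py
git add .
git commit -q -m "initial commit, venv included by mistake"

# Step 1: ignoring the folder does not remove it from the index.
printf 'venv/\n' > .gitignore
git add .gitignore
git commit -q -m "ignore venv"
tracked_before=$(git ls-files venv)        # still lists venv/pyvenv.cfg

# Step 2: remove it from the index only; the working-tree files stay put.
git rm -r -q --cached venv
git commit -q -m "stop tracking venv"
tracked_after=$(git ls-files venv)         # now empty

# Step 3: a plain "git add ." no longer picks the folder up again.
git add .
tracked_after_add=$(git ls-files venv)

echo "tracked before: $tracked_before"
echo "tracked after:  $tracked_after"
ls venv
```

After the last step, venv/pyvenv.cfg is still on disk but no longer part of any new commit. The older commits still contain it, since those can't change.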

Unfortunately, this is not nearly the end of the story. Fortunately, existing commits literally can't be changed, and each of those commits has a full copy of every file. So if you accidentally remove, say, venv/important-data from Git's index and the equivalent file from your working tree and want your important data back, you can get it—or at least, the old frozen version of it—from any older commit that does have it.

To get the old frozen version of some file back, you can:

  • check out an entire old commit (using Git's "detached HEAD" mode): this populates Git's index, and your working tree, from the old commit, and now the file is back; or
  • extract a single file from an old commit using git restore:[3] this lets you control whether a copy of the file goes into Git's index.

Confusingly, once the file is back in Git's index (if you git switch --detach HEAD@{yesterday} for instance), if you then switch back to a later commit that omits the file, Git now removes the file (from Git's index and from your working tree). This is a natural consequence of a commit check-out operation, which means extract the given commit, which in turn means make Git's index and my working tree match the given commit, which requires creating and/or removing various files in your working tree.
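A sketch of the single-file option, assuming Git 2.23 or later so that git restore is available. The repository, the file name data.txt, and the identity settings are invented for the demo:

```shell
# Sketch only: a scratch repository, not your real project.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo User"

printf 'important\n' > data.txt
git add data.txt
git commit -q -m "add data"

git rm -q data.txt                 # gone from the index AND the working tree
git commit -q -m "remove data"

# HEAD~1 is the commit just before the removal; it still has the file.
# Without --staged, only the working-tree copy is written.
git restore --source=HEAD~1 -- data.txt
restored=$(cat data.txt)

# Adding --staged as well puts a copy into the index too.
git restore --source=HEAD~1 --staged --worktree -- data.txt
in_index=$(git ls-files data.txt)

echo "restored content: $restored"
echo "in index:         $in_index"
```

If the file is one you want to keep out of future commits (as with venv), stop after the first git restore so the copy does not go back into the index.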


[2] The index takes on an expanded role during merge operations, especially when conflicts arise. That's the main user-oriented reason to call it "the index" rather than "the staging area": sometimes it has three copies of each file instead of just one. But in ordinary times, the word "index" and the phrase "staging area" are pretty much interchangeable.

[3] In Git versions predating 2.23, you have to use git checkout for this operation. That's a different kind of check-out than a full-commit check-out, and this old-style git checkout always "writes through" the index, in the style of a write-through cache in computer hardware designs. The new restore command is clearly better here.


Merge or rebase

To get where you need to be, you will need to merge or rebase your existing commits to / with / on top of the new commit you made through the GitHub web interface.

The git merge command is fundamentally much simpler, and is the one to describe here. For space reasons I won't go into any actual detail, though: see other Stack Overflow Q&A for that.

After:

git fetch origin

you will have an updated origin/web50/projects/2020/x/capstone name in your own (local) Git repository. This name finds the latest commit in a "branch", where branch means series of commits ending at a particular latest commit. See also What exactly do we mean by "branch"?

Meanwhile, you have your own series of commits, ending at your latest commit on your current branch, whatever its name is (master?). This branch name isn't really important to Git at all: what matters is the commit. The branch name is just a device by which you don't have to memorize big ugly random-looking Git hash IDs.

To combine your own latest commit with the latest one from GitHub, you would now run:

git merge origin/web50/projects/2020/x/capstone

This uses the name origin/web50/projects/2020/x/capstone to locate the latest GitHub commit, which you just got via git fetch. This uses whatever branch name you're "on" right now—as in, git status says on branch master or whatever—to locate your own latest commit. It then uses the historical commits "behind" each of these two latest commits to figure out where, in the past, your repository and the GitHub repository were last in sync. This is a whole series of commits, ending at whatever is the "best" (latest) shared commit, which is easily seen as commit * in this drawing:

             o--o   <-- master (HEAD)
            /
...--o--o--*
            \
             o   <-- origin/web50/projects/2020/x/capstone

(newer commits, here, are towards the right, and each commit is represented as an o or *; these link backwards, through commit metadata, to their parent commits).

Having obtained these three commits' hash IDs, Git begins the process of merging (as a verb) the work done since that shared common starting point. Git will see that on the GitHub side (the one commit on the bottom row), the "work done" involves removing some files. If you did not modify those files in your commits (along the top row), Git will combine "their" (your, really) removal of these files with your own "do nothing to these files" to get "remove these files" as the final result.

So this git merge operation will remove, from your index and working tree, the files that you removed via the web interface. It will be as if you ran git rm on each such file.

If all goes well—and it probably will—as far as Git itself is concerned, Git will now commit the result, producing:

             o--o
            /    \
...--o--o--*   ,--M   <-- master (HEAD)
            \ /
             o   <-- origin/web50/projects/2020/x/capstone

Note how the new merge commit M "points back" to both your previous tip commit (the rightmost top row o) and "their" tip commit (bottom row o). So commit M adds on to the all-commits databases in both repositories.

This means you can now git push origin master:web50/projects/2020/x/capstone to send commit M to GitHub, and the two separate repository databases are once again consistent.
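The whole sequence can be rehearsed safely before touching the real repository. The sketch below is only a simulation: a local bare repository stands in for GitHub, a second clone stands in for the web-interface deletion, and all file contents and identity settings are invented. It assumes your local branch is called master, as in the drawings above.

```shell
# Sketch only: everything lives in a temp directory.
base=$(mktemp -d)
branch=web50/projects/2020/x/capstone

# Stand-in for the GitHub repository.
git init -q --bare "$base/remote.git"

# "Your" repository, with venv committed by mistake.
git init -q "$base/local"
cd "$base/local"
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo User"
git symbolic-ref HEAD refs/heads/master    # make sure the branch is named master
git remote add origin "$base/remote.git"
mkdir venv
printf 'placeholder\n' > venv/pyvenv.cfg
printf 'print("v1")\n' > app.py
git add .
git commit -q -m "project, venv included by mistake"
git push -q origin "master:$branch"

# Stand-in for deleting venv through the web interface:
# a separate clone removes it and pushes a new commit.
git clone -q --branch "$branch" "$base/remote.git" "$base/web"
cd "$base/web"
git config user.email "demo@example.com"
git config user.name "Demo User"
git rm -r -q venv
git commit -q -m "Delete venv"
git push -q origin "$branch"

# Back in your repository: more work, then a push that gets rejected.
cd "$base/local"
printf 'print("v2")\n' > app.py
git commit -q -am "more work"
if git push -q origin "master:$branch" 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected
fi

# Fetch, merge, push: no force needed.
git fetch -q origin
git merge -q --no-edit "origin/$branch"
git push -q origin "master:$branch"

local_tip=$(git rev-parse HEAD)
remote_tip=$(git -C "$base/remote.git" rev-parse "refs/heads/$branch")
parent_count=$(git rev-list --parents -n 1 HEAD | wc -w)   # commit + its parents

echo "first push: $first_push"
echo "local tip:  $local_tip"
echo "remote tip: $remote_tip"
```

Note that after the merge the venv files are gone from the working tree in this simulation, which is exactly why the TL;DR says to save anything you can't re-generate before merging.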

Note: if you rename your local master to web50/projects/2020/x/capstone (which you can do at any time) and set its upstream to origin/web50/projects/2020/x/capstone:

git branch -m web50/projects/2020/x/capstone
git branch --set-upstream-to=origin/web50/projects/2020/x/capstone

you'll gain the ability to run git push with no arguments, and git status will print a bit more information. This doesn't do anything you couldn't do before, it just makes Git more pleasant to use (for most people anyway).
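For completeness, here is the rename-and-set-upstream step tried out in a scratch setup. Again a local bare repository stands in for GitHub, and the file contents and identity settings are invented for the demo:

```shell
# Sketch only: everything lives in a temp directory.
base=$(mktemp -d)
branch=web50/projects/2020/x/capstone

git init -q --bare "$base/remote.git"      # stand-in for GitHub
git init -q "$base/local"
cd "$base/local"
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo User"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$base/remote.git"

printf 'print("v1")\n' > app.py
git add app.py
git commit -q -m "first commit"
git push -q origin "master:$branch"        # the long form, as before

# Rename the current branch and tell Git which remote branch it follows.
git branch -m "$branch"
git branch --set-upstream-to="origin/$branch" >/dev/null

printf 'print("v2")\n' > app.py
git commit -q -am "second commit"
git push -q                                # no arguments needed now

upstream=$(git rev-parse --abbrev-ref '@{upstream}')
local_tip=$(git rev-parse HEAD)
remote_tip=$(git -C "$base/remote.git" rev-parse "refs/heads/$branch")

echo "upstream:   $upstream"
echo "local tip:  $local_tip"
echo "remote tip: $remote_tip"
```

The argument-free git push works here because the local and remote branch names now match, which is what Git's default push behavior expects.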

torek