31

I have two GitHub repositories.

I'd like to automatically (probably using hooks and/or the GitHub API) commit and push files to the second repository when they are pushed to the first one.

The second repository is not a clone of the first one, and their folder layouts are not necessarily the same; they just have a bunch of files in common.

What is the simplest way to do this?

Bonus points if I don't have to install an HTTP server or learn Perl :)

Maytham Fahmi
Drax
  • Should the second repository be a clone of the first, or are you just trying to sync certain files? – ChrisGPT was on strike May 22 '15 at 18:35
  • @Chris Just trying to sync certain files – Drax May 26 '15 at 09:13
  • Normally I would suggest using submodules or subtrees for this, but that assumes that the files to be shared are contained in a dedicated subdirectory (perhaps something like `lib/foo/`). Is your codebase laid out that way, or could it be converted? – ChrisGPT was on strike May 26 '15 at 11:40
  • @Chris The main idea is for the users of the first repository not to have to do anything more than their usual commit/push, while still having the files correctly copied into another repository at another place. In real life the `first` repository is actually multiple repositories from which i want to extract a specific file and regroup them on one deployment repository. – Drax May 26 '15 at 12:47
  • You say "deployment repository". It sounds like you're trying to run a build step (e.g. perhaps minifying source code, concatenating JS files, compiling source files, etc.) using Git? – ChrisGPT was on strike May 26 '15 at 12:52
  • @Chris if you want details: i'm regrouping multiple build results of multiple sub project into one repository which contains the resource files for setups :) – Drax May 26 '15 at 12:55
  • I have a blue bike and a red car. I want to transfer the tachometer values from the bike to the car after I changed to the car and vice versa. Both tachos are not identical. How can I do it? (Bonus points for a solution which does not require to learn how things work) – hek2mgl May 26 '15 at 16:23
  • @hek2mgl kudos for making me discover what a tachometer is. I can find and implement complex solutions if necessary, but i just feel like this is a relatively simple action that should be solvable with a simple solution :) – Drax May 26 '15 at 16:31
  • Probably I failed to get the question right, but if I did get it right, then it is not a simple problem. You obviously need to define which files should get mirrored (as far as I understood, not all of them) and write some code which selects these files and mirrors them at the right moment (most likely directly after a push). – hek2mgl May 26 '15 at 16:37
  • @hek2mgl exactly and i assumed that a lot of people have already wanted to do that so something common might exist to do it (i even hoped it was a native feature of GitHub), but i might have overestimated humanity's features implementation rate :) – Drax May 26 '15 at 16:48
  • Wouldn't a simple local hook work for you? Let's say you create a file `.mirror`, define file names in there and use a simple bash script to upload those to whatever location? I also like the suggestion to put these files into a submodule. This would be more efficient since you would need to store the files in only one place. – hek2mgl May 26 '15 at 17:41
  • @hek2mgl the submodule was my first idea but that would add more complexity to the users of the original repository as they would now have 2 repositories to commit to each (one for shared files and one for the others) with a special folder layout. The main point of this is to avoid any more action for the initial committers in the first place :) – Drax May 27 '15 at 08:31
  • Is it true that the replication is only one-way, albeit many-to-one? Can each source repo only update its "own" files, or can there be collisions? Should the target files in the destination repo be considered read-only? What if there are changes and/or collisions: overwrite, merge, or bail out? – javabrett May 28 '15 at 05:26
  • @javabrett Only one way, yes. Each source repo only updates its "own" files. The target files are read-only indeed. Since the flow is one way it will always overwrite the destination files :) – Drax May 28 '15 at 08:13

6 Answers

17

If you are looking for something robust and easy to maintain, I'd encourage you to develop a solution around GitHub Webhooks. Yes, it will require you to deploy an HTTP server, say a Node.js server, and it will require a small amount of development (your requirements are fairly specific), but I think it will pay off if you need something reliable and low-maintenance. That's if you decide this file-mirroring approach is still the right thing to do, having considered the approaches and the set-up effort.

Let the source repositories (on GitHub) be S1, S2 ... with (non-overlapping) file-sets to mirror F1, F2 ..., to be sent to a target repo T (also on GitHub), where the corresponding files are considered read-only. Your requirements are unusual in that Sn and T sound like they aren't cloned from each other, and they might not even have any common commit, in which case this is not a push/fetch scenario. You also haven't guaranteed that the source file updates occur one per commit, or even grouped but isolated from non-replicating changes, so this is not about cherry-picking commits.

The trigger for replication is on push of certain files to S1, S2 ..., not a commit on any developer-clone of those repos, so client-side hooks won't help (and they might be awkward to maintain). GitHub doesn't allow generic hooks of course, so Webhooks are your best solution. You could consider another, polling clone which is regularly pulling from S1 ..., performing logic and then committing to T, but this sounds awkward compared to Webhooks, which will give you reliable delivery, replay capability, a decent audit-trail etc.

The upside is that there is a lot of already-built infrastructure to support this type of set-up, so the actual code you would have to write could be quite small. Say you go with a Node.js type setup:

  • Deploy github-webhook-handler. This cool little library is a pre-built handler for GitHub Webhooks, and handles the HMAC X-Hub-Signature verification and provides simple event-listener hooks for all the Webhooks events. You could have one end-point per S, or it's probably easier to fan them in.
  • Have some local file (keep it in a Git repo) which maps Sn to Fn.
  • Register a handler for X-GitHub-Event: push, and inspect repository/name and commits[]/modified[] for paths matching your local map.
  • Deploy node-github, an implementation of the GitHub APIv3 for Node.js.
  • For each matching file:

This approach allows you to do everything without needing a local clone of T. You might find it better to use a local clone, but I'd see how easily things go with the API method first.
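
The matching and update steps above could be sketched roughly as follows. This is only an illustration in Python rather than Node.js, not the code of either library mentioned above. It assumes the push payload fields `repository.name` and `commits[].added` / `commits[].modified`, and GitHub's "create or update file contents" REST endpoint (`PUT /repos/{owner}/{repo}/contents/{path}`). `MIRROR_MAP`, the function names, the owner, the token and all paths are hypothetical placeholders.

```python
import base64
import json
import urllib.parse
import urllib.request
from typing import Dict, List, Optional, Tuple

# Hypothetical map: source repo name -> {path in source repo: path in target repo T}.
MIRROR_MAP: Dict[str, Dict[str, str]] = {
    "S1": {"build/app.bin": "resources/s1/app.bin"},
    "S2": {"dist/lib.zip": "resources/s2/lib.zip"},
}


def files_to_mirror(payload: dict,
                    mirror_map: Dict[str, Dict[str, str]] = MIRROR_MAP) -> List[Tuple[str, str]]:
    """Return unique (source_path, target_path) pairs touched by a push payload."""
    repo = payload.get("repository", {}).get("name")
    wanted = mirror_map.get(repo, {})
    pairs: List[Tuple[str, str]] = []
    for commit in payload.get("commits", []):
        for path in commit.get("added", []) + commit.get("modified", []):
            if path in wanted:
                pair = (path, wanted[path])
                if pair not in pairs:
                    pairs.append(pair)
    return pairs


def build_update_request(owner: str, repo: str, path: str, content: bytes,
                         token: str, existing_sha: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a request that creates or updates one file in T."""
    body = {
        "message": f"Mirror {path}",
        "content": base64.b64encode(content).decode("ascii"),
    }
    if existing_sha:
        # The blob SHA of the current file is required when updating an existing file.
        body["sha"] = existing_sha
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/{urllib.parse.quote(path)}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Authorization": f"token {token}",
                 "Accept": "application/vnd.github+json"},
    )
```

A handler would call `files_to_mirror(payload)` for each push event, fetch each source file's content, and pass the request from `build_update_request(...)` to `urllib.request.urlopen`. Since the flow is one-way, overwriting the target file on every matching push is the expected behaviour.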


javabrett
  • Nice answer, if nothing better comes before the bounty ends i'll accept this one. `That's if you decide this file-mirroring approach is still the right thing to do, having considered the approaches and the set-up effort.` This is very relevant :) – Drax May 29 '15 at 07:44
  • @javabrett, I know this is a pretty old thread, but I stumbled on it while trying to solve a similar problem we have. What you say looks good, but the only issue I see is with failed delivery of events. Say for some reason the webhook isn't able to contact the server: how can we ensure redelivery of the event? I know there's an option in GitHub to redeliver manually, but that probably isn't scalable when working with multiple repos. Do you know if there is an API to frequently poll for the events which failed and then call another API to redeliver? – user320550 Jun 12 '19 at 18:40
10

We had a similar problem: we wanted to automatically copy documentation files between a project's repository and a common documentation repository. We built a tool that listens to GitHub's webhooks, parses commits and creates a Pull Request to a chosen destination. We've open sourced it at https://github.com/livechat/copycat, and it can be used on any Node platform server.

Drax
konradk
7

EDIT: I now realize that the question was about GitHub. My answer is about a standard git repository that you have file access to.

I am assuming the second repo is a clone of the first, created something like this:

git clone --bare first.git second.git

Change current directory to inside the first.git repository, and add the second.git as a remote.

cd first.git
git remote add second ../second.git

Then, create a file in the folder first.git/hooks/ named post-receive (you can rename the post-receive.sample file already there).

The content should be this:

#!/bin/sh
git push second

Now, when you push new commits to the first repository a push from first to second will be executed immediately, so that also second receives the commits.

Klas Mellbourn
  • Thanks for the answer, but the second repository is not a clone of the first one, i'll edit the question to add that precision – Drax May 26 '15 at 09:19
6

Two GitHub repos alone (without a third-party server listening for webhook events) cannot mirror each other.

You would need to register a webhook on one GitHub repo in order to detect the push event, and push to the second GitHub repo.

That means having a server which listens for the webhook JSON payload.

A tool like dustin/gitmirror can help (in Go).
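
Whatever server receives the payload would normally check that it really comes from GitHub before acting on it. Here is a minimal sketch of that check in Python, assuming a shared secret was configured on the webhook and that the signature header has the form `sha1=<hex digest>` (the `X-Hub-Signature` header) or `sha256=<hex digest>` (the `X-Hub-Signature-256` header). The function name and the sample values are hypothetical.

```python
import hashlib
import hmac


def signature_is_valid(secret: bytes, body: bytes, header: str) -> bool:
    """Check a webhook signature header against the raw request body."""
    algo, _, sent = header.partition("=")
    if algo not in ("sha1", "sha256") or not sent:
        return False
    # The signature is an HMAC of the raw body, keyed with the webhook secret.
    expected = hmac.new(secret, body, getattr(hashlib, algo)).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), sent.encode("utf-8"))


if __name__ == "__main__":
    secret = b"hypothetical-secret"
    body = b'{"ref": "refs/heads/master"}'
    header = "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()
    print(signature_is_valid(secret, body, header))
```

The check must run on the raw bytes of the request body, before any JSON parsing, because re-serialising the payload can change the bytes and therefore the digest.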

VonC
-1

Since the repositories are different, you can try applying the commits one by one using git-apply/git-am and then pushing.

Assuming you have Repo1.git and Repo2 on a server, where Repo1.git is the bare repository and Repo2 is a local clone of your second repository:

Repo1.git/hooks/post-receive

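#!/bin/sh
# post-receive hook: replay each pushed commit onto a local clone of the second repository.
t=$(mktemp)
repo2_directory=/some/place/you/cloned/repo2
error=
# Each line on stdin is "<old-sha> <new-sha> <refname>".
while read -r old new refname; do
  # rev-list prints bare hashes; --reverse lists the oldest commit first.
  for ref in $(git rev-list --reverse "$old..$new"); do
    # format-patch emits the mbox format that git am expects.
    git format-patch -1 --stdout --binary "$ref" > "$t"
    # Hooks run with GIT_DIR set; unset it so git operates on repo2 in the subshell.
    if ! (unset GIT_DIR; cd "$repo2_directory" && { git am -q < "$t" || { git am --abort; false; }; }); then
      echo "Cannot apply $ref" >&2
      error=1
      break
    fi
  done
  [ -n "$error" ] && break
done
rm -f "$t"
[ -z "$error" ] && (unset GIT_DIR; cd "$repo2_directory" && git push)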
Cyrille Pontvieux
-2

A simple way to do this is to add two (or more) pushurls to origin (or some other remote).

For example:

git remote set-url --add --push origin url1
git remote set-url --add --push origin url2

It doesn't change a whole lot in anyone's workflow, but all pushes are still effectively duplicated to both repos. It is explained in much more detail here.

If you have a lot of people working on the same repo and want their changes reflected, try running a script to assign the new pushurls for each developer. Otherwise, I'm afraid you'll need to use hooks + server.

Community
Sze-Hung Daniel Tsui