6

I have a parent directory called 'projects' that contains nearly 200 sub-directories, each of which is one of my projects.

For now I am executing git pull with the following script.

#!/bin/bash  
find . -type d -name .git -exec sh -c "cd \"{}\"/../ && pwd && git pull && echo -e '-------------------- \n ' " \;

Is there a more efficient way to do this, for example by running the pulls in parallel (multithreading) so the whole process is faster?

2 Answers

3

The sub-directories do not all belong to the same git repository, and they are not submodules either. So for now I am solving this problem with xargs, as shown below.

#!/bin/bash
find . -type d -name '.git' -print0 | xargs -0 -P 40 -I '{}' sh -c 'cd "$1/.." && git pull && pwd && printf "%s\n\n" "--------------------"' sh '{}'
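  • find . - Start searching from the current working directory (recursive by default)
  • -type d -name '.git' - Match every directory named .git, i.e. the metadata directory at the root of each repository.
  • -print0 - Print each match terminated by a NUL byte, so that xargs -0 can safely read paths containing spaces or newlines.
  • -P 40 - Let xargs run up to 40 git pull processes at the same time.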
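
The same find/xargs approach can be wrapped in a small function that captures each repository's output, so that lines from parallel pulls do not interleave. This is only a sketch: the function name pull_all and its two arguments (root directory and job count) are made up for illustration, and it assumes a git new enough to support `git -C` and that a fast-forward-only pull is what you want.

```shell
#!/bin/bash
# Hypothetical helper (not part of git): pull every repository below ROOT,
# running up to JOBS pulls at once. Usage: pull_all [ROOT] [JOBS]
pull_all() {
  local root="${1:-.}" jobs="${2:-8}"
  # -prune stops find from descending into each .git directory it reports.
  find "$root" -type d -name .git -prune -print0 |
    xargs -0 -P "$jobs" -I '{}' sh -c '
      repo=$(dirname "$1")
      # Buffer the output of one pull and print it in a single call.
      if out=$(git -C "$repo" pull --ff-only 2>&1); then
        printf "OK   %s\n%s\n" "$repo" "$out"
      else
        printf "FAIL %s\n%s\n" "$repo" "$out"
      fi
    ' sh '{}'
}
```

Calling `pull_all . 40` from the 'projects' directory would roughly match the one-liner above; a lower job count may be kinder to the remote server.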

I also found some good help at http://coldattic.info/shvedsky/pro/blogs/a-foo-walks-into-a-bar/posts/7

1

Note that if your nested repos were declared as submodules, then a simple git submodule update --remote would be enough.

That is, provided you had your submodules configured to follow a branch.
See also "Git submodule to track remote branch".

Those updates (involving a pull) would not be multithreaded, though, for either the checkout part or the fetch part.

Git's own multi-threading covers only a few internal operations, as mentioned in this thread:

A few selected operations are multi-threaded if you compile with thread support (i.e., do not set NO_PTHREADS when you build).

But object packing (used during fetch/push, and during git-gc) is multi-threaded (at least the delta compression portion of it is).

git may fork to perform certain asynchronous operations.
E.g., during a fetch, one process runs pack-objects to create the output, and the other speaks the git protocol, mostly just passing through the output to the client.
On systems with threads, some of these operations are performed using a thread rather than fork.
This is not about CPU performance, but about keeping the code simple (and cannot be controlled with config).


All that means, as Etan Reisner comments, that you would need to script those git pull updates yourself in order to multithread those commands.

See "Multithreading in Bash" for a scripting solution.
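
As a rough illustration of such a script, the sketch below uses plain bash background jobs instead of xargs. The function name pull_in_background, its argument order, and the per-repository log files are all assumptions made for this example; it also assumes a git that supports `git -C` and a sleep that accepts fractional seconds.

```shell
#!/bin/bash
# Hypothetical helper: pull each repository given as an argument,
# keeping at most MAX_JOBS pulls running at the same time.
# Usage: pull_in_background MAX_JOBS LOG_DIR REPO...
pull_in_background() {
  local max_jobs="$1" log_dir="$2"
  shift 2
  mkdir -p "$log_dir"
  local repo
  for repo in "$@"; do
    # Throttle: pause while the number of running jobs is at the limit.
    while [ "$(jobs -rp | wc -l)" -ge "$max_jobs" ]; do
      sleep 0.1
    done
    # One log file per repository, named after the directory.
    git -C "$repo" pull --ff-only >"$log_dir/$(basename "$repo").log" 2>&1 &
  done
  wait  # block until every background pull has finished
}
```

For example, `pull_in_background 8 /tmp/pull-logs */` run from the parent directory would attempt a pull in each sub-directory and leave one log per project. The parallelism here comes from separate processes, not threads, which is all that independent repositories need.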

Community
VonC
  • While this contains a fair bit of generally useful information it doesn't actually deal with the situation in the OP's question at all or offer anything by way of a solution to the OP's problem. – Etan Reisner Apr 07 '15 at 12:32
  • @EtanReisner because as far as I know there is no git-native solution: multiple git pull are not multithreaded by default. – VonC Apr 07 '15 at 12:33
  • His repositories are unrelated (it would seem). You don't need a git-native solution. You postulated a situation that doesn't exist and which painted yourself into a problem that needn't exist. – Etan Reisner Apr 07 '15 at 12:35