Are there tools or a set of commands designed for this purpose?
I am quite sure the answer here is no because it would require interpreting the meaning/content of the code.
- What if one of the functions you are tracking is renamed?
- What if one of the functions you are tracking is renamed and also changed a little in the same commit?
- What if one of the functions you are tracking is renamed and also changed a lot in the same commit?
- At what point does a large amount of change to the content of a function turn it into something different?
- What if one of the functions you are tracking is merged with another function?
- What if one of the functions you are tracking is split up into multiple functions?
- Etc
This is the same problem as saying that a line has changed, which no (sane) tool will do: they only report lines added and lines deleted, separately.
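To illustrate the point about line-based tools, here is a minimal sketch with two made-up files (the names old.ts and new.ts and their contents are hypothetical). A one-line edit is reported as one deleted line plus one added line, never as "line changed":

```shell
#!/bin/sh
# Hypothetical demo files; the function body is made up for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'function function1() {' '  return 1;' '}' > old.ts
printf '%s\n' 'function function1() {' '  return 2;' '}' > new.ts

# diff exits with status 1 when the files differ, so that is not an error here.
diff_output=$(diff -u old.ts new.ts || true)
printf '%s\n' "$diff_output"

# Count removed and added lines, skipping the "---" and "+++" file headers.
removed=$(printf '%s\n' "$diff_output" | grep -c '^[-][^-]')
added=$(printf '%s\n' "$diff_output" | grep -c '^[+][^+]')
echo "removed=$removed added=$added"
```

The last line prints removed=1 added=1: the tool has no notion that the two lines are "the same line, edited".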
Having said that, a version control system (e.g. git) is the only viable tool to assist in doing this. It might fail in some scenarios, but with a proper structure it will probably be able to do this mostly automatically (how well really depends on the amount of changes in the upstream project).
What I would do is the following:
Step 1
I am assuming that you are working on a clone of the upstream repository and the branch you want to pick from is main. This clone repo could then be a submodule in your other repository (not covered further in this answer).
git clone https://git.example.com/some-repo.git
cd some-repo
git checkout main
Step 2
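Then run git branch main.extract main to make your own main.extract branch, and switch to it with git checkout main.extract so that the commits below land on that branch rather than on main. This is the branch where you remove the stuff you do not want, but not all in one operation.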
Assuming the upstream file contains the following (and you are only interested in function1 and function4):
import { whatever } from './whatever';
import { somethingelse } from './something/else';
export function1(a: string) {
...
}
export function2() {
...
}
export function3(a: SomeClass, b: number) {
...
}
export function4(a: string[]) {
...
}
The first step is just to add some "marker lines" that separate the parts you want from the parts you do not want, and this should be done alone as a separate commit. While adding one line with // Begin save this function before and one line with // End save this function after will accomplish this, you are much better off using more than one line in order to create a stronger barrier against changes just above or below those lines. I am showing only three lines below, but in real life you should use 5-6 lines.
import { whatever } from './whatever';
import { somethingelse } from './something/else';
// Begin save function1
// Begin save function1
// Begin save function1
export function1(a: string) {
...
}
// End save function1
// End save function1
// End save function1
export function2() {
...
}
export function3(a: SomeClass, b: number) {
...
}
// Begin save function4
// Begin save function4
// Begin save function4
export function4(a: string[]) {
...
}
// End save function4
// End save function4
// End save function4
When incorporating upstream changes later on it is not unlikely that you will get conflicts, but resolving those for this commit will be trivial. Even if, say, one of the functions has been moved somewhere else, moving the separation comments correspondingly is easy.
As always, small commits that each do one thing and one thing only are the key to a pleasant version control experience and fewer conflicts.
git add upsteamfile.ts
git commit -m "Added separation comment lines"
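Since the marker lines will be edited by hand (and re-edited when resolving conflicts), a small sanity check might be useful before committing. The following is only a sketch: the helper name check_markers and the demo files are hypothetical, and it assumes the markers are written exactly as "// Begin save NAME" and "// End save NAME" with no spaces in NAME.

```shell
#!/bin/sh
# check_markers FILE: succeed only if every NAME has as many
# "// Begin save NAME" lines as "// End save NAME" lines.
check_markers() {
    file=$1
    rc=0
    # Collect the distinct names used in Begin or End marker lines.
    names=$(sed -n -e 's|^// Begin save ||p' -e 's|^// End save ||p' "$file" | sort -u)
    for name in $names; do
        begins=$(grep -c -x -F "// Begin save $name" "$file")
        ends=$(grep -c -x -F "// End save $name" "$file")
        if [ "$begins" -ne "$ends" ]; then
            echo "unbalanced markers for $name: $begins begin, $ends end" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Hypothetical demo input.
workdir=$(mktemp -d)
cat > "$workdir/good.ts" <<'EOF'
// Begin save function1
// Begin save function1
export function function1(a: string) {
  return a;
}
// End save function1
// End save function1
EOF
# Same file with the last End line lost, as could happen in a sloppy conflict resolution.
sed '$d' "$workdir/good.ts" > "$workdir/bad.ts"

check_markers "$workdir/good.ts" && good_result=ok || good_result=broken
check_markers "$workdir/bad.ts" 2>/dev/null && bad_result=ok || bad_result=broken
echo "good.ts: $good_result, bad.ts: $bad_result"
```

This only counts marker lines; it does not verify that they actually surround the right function, so it complements a manual review rather than replacing it.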
Step 3
Now with this in place the next step is to (only) remove the upstream parts that you are not interested in:
import { whatever } from './whatever';
// Begin save function1
// Begin save function1
// Begin save function1
export function1(a: string) {
...
}
// End save function1
// End save function1
// End save function1
// Begin save function4
// Begin save function4
// Begin save function4
export function4(a: string[]) {
...
}
// End save function4
// End save function4
// End save function4
Check this in as a separate commit. Later on, when updating to later upstream versions, you might get conflicts (say a new function5 is added), but again these are trivial to resolve. Maybe you could use the -s ort -Xours (notice: not -s ours!) merge strategy (for this commit only!), but I have no experience using such merge strategies. I have always used KDiff3 and would strongly recommend doing the same.
git add upsteamfile.ts
git commit -m "Removed unwanted parts"
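With the markers in place, the removal in this step could be partly scripted instead of done by hand. A minimal sketch, assuming the marker format from Step 2 (the helper name keep_marked and the demo file are hypothetical):

```shell
#!/bin/sh
# keep_marked FILE: print only the regions enclosed by "// Begin save" and
# "// End save" marker lines (the marker lines themselves are kept too).
keep_marked() {
    awk '
        /^\/\/ Begin save /          { inside = 1 }
        inside || /^\/\/ End save /  { print }
        /^\/\/ End save /            { inside = 0 }
    ' "$1"
}

# Hypothetical demo input, shaped like the example above.
workdir=$(mktemp -d)
cat > "$workdir/demo.ts" <<'EOF'
import { whatever } from './whatever';
// Begin save function1
// Begin save function1
export function function1(a: string) {
  return a;
}
// End save function1
// End save function1
export function function2() {
  return 2;
}
// Begin save function4
// Begin save function4
export function function4(a: string[]) {
  return a.length;
}
// End save function4
// End save function4
EOF

keep_marked "$workdir/demo.ts" > "$workdir/kept.ts"
cat "$workdir/kept.ts"
```

Note that this filter also drops everything outside the markers, including the import lines, so any imports that the kept functions still need have to be put back by hand (or given markers of their own).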
Step 4
Now with the unwanted parts gone, you can remove the separation comments (trivial edit, I am not showing an example of this).
git add upsteamfile.ts
git commit -m "Removed separation comments"
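If there are many marker lines, the removal could also be done with a one-line filter rather than in an editor. A sketch on a hypothetical demo file, again assuming the exact marker format from Step 2:

```shell
#!/bin/sh
# Hypothetical demo file with the markers still in place.
workdir=$(mktemp -d)
cat > "$workdir/demo.ts" <<'EOF'
// Begin save function1
// Begin save function1
export function function1(a: string) {
  return a;
}
// End save function1
// End save function1
EOF

# Drop every marker line; grep -v keeps the lines that do NOT match.
grep -v -e '^// Begin save ' -e '^// End save ' "$workdir/demo.ts" > "$workdir/demo.tmp" \
    && mv "$workdir/demo.tmp" "$workdir/demo.ts"

cat "$workdir/demo.ts"
remaining=$(grep -c -e '^// Begin save ' -e '^// End save ' "$workdir/demo.ts")
echo "marker lines remaining: $remaining"
```

Writing to a temporary file and moving it into place is used here instead of sed -i, because the -i option behaves differently between GNU and BSD sed.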
Step 5
If you want to make any additional modifications to the functions, apply them here.
${EDITOR:-nano} upsteamfile.ts
git add upsteamfile.ts
git commit -m "My own customization of function1"
${EDITOR:-nano} upsteamfile.ts
git add upsteamfile.ts
git commit -m "My own customization of function4"
Or of course split up in multiple commits if you do multiple modifications.
Future updates
So far all of the above has been initial setup, but the main part of the question is about later changes, so let's consider those. Assume you started with upstream release v1.0.0 and want to update to release v2.0.0. With the steps from above in place there are only one or two things to do.
- Create a branch/tag to keep a reference to the old version (optional).
git tag main.extract-v1.0 main.extract
- Then fetch the new upstream and rebase.
git checkout main
git pull --prune
git rebase main main.extract
That's it. You might get conflicts on the rebase, but except for the (optional) last commits with your extra modifications to the original source, all the other commits should be trivial to resolve.
And conflicts on extra modifications to the original source are inherently unavoidable, so this is as good as it can possibly be.
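To try the update procedure without touching a real project, the whole flow can be simulated in a throwaway local repository. This is only a sketch under several assumptions: the file demo.ts and its contents are made up, a local commit on main stands in for git pull --prune, and Steps 2 to 4 are condensed into a single commit purely to keep the demo short.

```shell
#!/bin/sh
set -e
workdir=$(mktemp -d)
cd "$workdir"

# write_upstream VALUE: (re)create the hypothetical upstream file.
write_upstream() {
    cat > demo.ts <<EOF
export function function1() {
  const base = $1;
  return base;
}

export function function2() {
  return 2;
}

export function function4() {
  return 4;
}
EOF
}

# Stand-in for the upstream repository at "release v1.0.0".
git init -q some-repo
cd some-repo
git symbolic-ref HEAD refs/heads/main      # make sure the branch is called main
git config user.name "Example User"        # hypothetical identity for the demo
git config user.email "user@example.com"
write_upstream 1
git add demo.ts
git commit -q -m "Upstream v1.0.0"

# Your own branch: remove function2 (lines 6-9, including the blank line after it).
git checkout -q -b main.extract
sed '6,9d' demo.ts > demo.tmp && mv demo.tmp demo.ts
git commit -q -am "Removed unwanted parts"

# Upstream moves on to "release v2.0.0" and changes function1.
git checkout -q main
write_upstream 10
git commit -q -am "Upstream v2.0.0"

# The update itself: keep a reference to the old state, then rebase.
git tag main.extract-v1.0 main.extract
git rebase -q main main.extract

cat demo.ts
git log --oneline main..main.extract
```

After the rebase, demo.ts contains the upstream change to function1 while function2 is still gone, and git log main..main.extract lists only your own commit on top of the new upstream.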