
I am considering using Git for version control of a project that is not source code. The project contains an XML file that stores objects with unique IDs. I am afraid that if two branches each add an object of the same type, the new objects will be assigned the same ID, and those IDs will clash when the branches are merged.

Is there a way to avoid this ID clashing in a relatively straightforward way?

Edit: Here is the workflow I am thinking:

  1. External software is used to develop an xml file. A sample of this XML file is below.
  2. The XML file (and complementary files) are in a folder in which git init is run.
  3. This git repo is pushed to a remote repository.
  4. A variety of team members fork/branch off of the remote repository on their local devices. They make edits, including adding objects. The external software automatically assigns object ids which are 1 more than the max ID (of that object type).
  5. Team members commit changes locally, and eventually push back to the remote repo to merge into the master branch.

  6. How do I manage this merge to prevent the clashing of two object IDs within the XML file if two users added the same type of object?

File example:

<net>
<ObjectAs>
<ObjectA id=1 some=thing>
<ObjectA id=2 some=else>
</ObjectAs>
<ObjectBs>
<ObjectB id=1 some=thing>
<ObjectB id=2 some=else>
</ObjectBs>
</net>

Possible commit from editor A:

<net>
<ObjectAs>
<ObjectA id=1 some=thing>
<ObjectA id=2 some=else>
+++<ObjectA id=3 some=1>
</ObjectAs>
<ObjectBs>
<ObjectB id=1 some=thing>
<ObjectB id=2 some=else>
</ObjectBs>
</net>

Possible commit from editor B:

<net>
<ObjectAs>
<ObjectA id=1 some=thing>
<ObjectA id=2 some=else>
+++<ObjectA id=3 some=2>
</ObjectAs>
<ObjectBs>
<ObjectB id=1 some=thing>
<ObjectB id=2 some=else>
</ObjectBs>
</net>

Clearly these edits would conflict by adding two ObjectA objects with the same id. However, if I understand git correctly it would happily merge these together and add both rows.

How do I manage this merge to prevent the clashing of two object IDs within the XML file if two users added the same type of object?

Machavity
user1558604
  • All files (blobs) stored in git have a unique ID (sha1). Multiple branches can reference the same blob with the same id. – Serge Feb 06 '20 at 14:10
  • Thanks @Serge. I am asking about IDs within a file. The XML file has objects with unique ids. – user1558604 Feb 06 '20 at 14:25
  • Where is this xml file coming from? What stores the objects ids in it? Apparently it has nothing to do with git. – Serge Feb 06 '20 at 14:59
  • @Serge: The xml file comes from a software program I do not control. I would like to manage this file (and others) within a git repo, but need a way to prevent the id clashing. – user1558604 Feb 06 '20 at 18:12
  • Well, unfortunately your flow is not clear. It is difficult to say what it has to do with git and how git could affect your flows. You need to provide more details. – Serge Feb 06 '20 at 19:00
  • @Serge: I have added a lot of detail to hopefully explain better what I am asking. Not sure why you don't see where the git connection is, I am asking **how to manage conflicting object ids within a file while completing a git merge.** – user1558604 Feb 06 '20 at 20:14
  • It sounds like the xml file is generated by some third-party software. If it is automatically generated, what is the reason for keeping it in a version control system at all? – Serge Feb 07 '20 at 01:48
  • @Serge a desire to keep track of changes over time. – user1558604 Feb 07 '20 at 18:57

1 Answer

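git merge works on text files. If the changes on both sides combine cleanly according to Git's text-combining rules (see, e.g., my answer to "Repetitive merges in GIT. How does it calculate differences?"), Git will combine them. If not, it stops and reports a merge conflict. In your example both branches insert a different line at the same position, so Git would normally report a conflict there; additions that land in different parts of the file, however, combine without complaint even when their ids clash.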

Git does not understand XML. If you wish to merge XML files intelligently, you must provide your own merge algorithm. Merging XML intelligently is a very hard problem; in fact, without a schema or some such, it is impossible in general.

You can have Git run your own merge algorithm automatically, by assigning a `merge` attribute to the file in .gitattributes and defining the matching `merge.<name>.driver` setting in your Git configuration. Note that Git won't invoke your merge algorithm if only one side changes the XML file; it will use it only when both sides have changed that file. (More specifically, only if both "sides" of the diff, ours and theirs, have low-level changes, will Git invoke your driver. Again, see the linked question.)
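As a rough illustration of such a driver, here is a minimal Python sketch; it is not a general XML merge. The driver name `xmlids` and the script name `merge_xml_ids.py` are made up for this example. It assumes the real file is well-formed XML (quoted attributes, closed tags, unlike the abbreviated sample in the question), that the layout is root > container > object as in that sample, that ids are integers, and that nothing else in the file refers to an object by id. It only handles the case where "theirs" has added objects; if "theirs" changed or deleted anything that existed in the base, it gives up and exits non-zero so that Git flags the file as conflicted.

```python
"""Hypothetical merge driver sketch: append objects added by 'theirs', renumbering clashing ids.

Assumed setup (names are illustrative, adjust the script path as needed):
  .gitattributes:  *.xml merge=xmlids
  git config merge.xmlids.driver "python3 merge_xml_ids.py %O %A %B"
Git substitutes %O = base version, %A = ours (result is written here), %B = theirs.
"""
import sys
import xml.etree.ElementTree as ET


def same(a, b):
    """Compare two elements by tag, attributes, stripped text, and children."""
    return (a.tag == b.tag and a.attrib == b.attrib
            and (a.text or "").strip() == (b.text or "").strip()
            and len(a) == len(b)
            and all(same(x, y) for x, y in zip(a, b)))


def merge_new_objects(base_root, ours_root, theirs_root):
    """Add objects that only 'theirs' introduced to 'ours'. Return True if clean."""
    for theirs_container in theirs_root:
        ours_container = ours_root.find(theirs_container.tag)
        base_container = base_root.find(theirs_container.tag)
        if ours_container is None or base_container is None:
            return False  # container missing on one side: leave it to a human
        base_items = {(el.tag, el.get("id")): el for el in base_container}
        theirs_keys = {(el.tag, el.get("id")) for el in theirs_container}
        if any(key not in theirs_keys for key in base_items):
            return False  # 'theirs' deleted something: not handled by this sketch
        for el in theirs_container:
            key = (el.tag, el.get("id"))
            if key in base_items:
                if not same(el, base_items[key]):
                    return False  # 'theirs' edited an existing object: not handled
                continue
            # Object is new in 'theirs'. Renumber it if 'ours' already uses the id.
            used = {int(o.get("id")) for o in ours_container if o.tag == el.tag}
            if int(el.get("id")) in used:
                el.set("id", str(max(used) + 1))
            ours_container.append(el)
    return True


def main(argv):
    base_path, ours_path, theirs_path = argv[1:4]
    ours_tree = ET.parse(ours_path)
    clean = merge_new_objects(ET.parse(base_path).getroot(),
                              ours_tree.getroot(),
                              ET.parse(theirs_path).getroot())
    if clean:
        # Git expects the merged result to overwrite the %A file.
        ours_tree.write(ours_path, encoding="utf-8", xml_declaration=True)
    return 0 if clean else 1  # non-zero tells Git the merge has conflicts


# Only act when invoked with the three paths Git passes to the driver.
if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(main(sys.argv))
```

When the sketch bails out, the working-tree file is left as the "ours" version and must be resolved by hand. Renumbering is only safe if ids are purely local to the file; if the external software stores references to ids elsewhere, the driver would have to rewrite those as well.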

torek
  • When does the merge conflict check occur, and could I just replace that piece with a custom script? – user1558604 Feb 07 '20 at 00:25
  • The low level conflict occurs when Git runs its low level merge driver. If you set a merge driver, Git runs your merge driver instead of its own. Git's built in low level merge driver is available as a standalone program called `git merge-file`. – torek Feb 07 '20 at 00:33
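
A custom driver can still fall back on that built-in text merge, for instance to try it first and post-process the result. The following is a small sketch under that assumption; the helper names `merge_file_command` and `text_merge` are invented here. Note that `git merge-file` takes its files in the order current, base, other, which differs from the `%O %A %B` order commonly written in driver settings.

```python
import subprocess


def merge_file_command(ours, base, theirs):
    """Build the argv for Git's built-in three-way text merge.

    -p sends the merged text to stdout instead of overwriting `ours`.
    """
    return ["git", "merge-file", "-p", ours, base, theirs]


def text_merge(ours, base, theirs):
    """Run the text merge and return (merged_text, number_of_conflicts)."""
    proc = subprocess.run(merge_file_command(ours, base, theirs),
                          capture_output=True, text=True)
    # git merge-file exits with the number of conflicts (capped at 127);
    # anything outside that range is treated here as an error.
    if not 0 <= proc.returncode <= 127:
        raise RuntimeError(proc.stderr.strip() or "git merge-file failed")
    return proc.stdout, proc.returncode
```

A conflict count of 0 means the text merged cleanly; that says nothing about whether the ids inside the XML are still unique, so an id check would still have to run on the merged text.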