I use a lot of PDFs for my research and studies. I highlight the important statements and claims with a color code, and often add text annotations to the PDF as well. I do all this on my work PC, on my home PC and also on my tablet. I am looking for a way to share new files, as well as to share newly added annotations with my other devices. The versioning, sharing and merging is not intended for the PDF text, but for the annotations. I was only able to find solutions where the text of the PDF was checked [1] (but even then, I am unsure about the merging).

I've got a small NAS, which I can use as a server. I was considering a version control system, such as git, for this task, but I have not found any way for git to check whether anything changed in the PDF and merge it. Even though PDFs are binary, the annotations are stored in plain text, and I was able to see the differences in vimdiff (though diff and tkdiff did not help).

I have used my Dropbox account before, but I've reached the storage limit, and I do not intend to pay 200€/year when I have my own server available. I would also like to be able to work offline on the PDFs, so some kind of merging will be necessary. This is also why I can't just work on one shared network drive.

This is a complex issue, and I can imagine solutions at different levels of abstraction. I would be glad even for solutions that do not involve git and co.

Horror Vacui

1 Answer

You can use something like NextCloud to replace Dropbox.

As for merging PDFs, this is generally not possible due to the - as you already noted - binary nature of PDF files. Even if some parts are visible with a text-based diff, this doesn't mean that merging such changes would create a valid and functioning PDF file.

Your only chance would be a tool that stores the annotations outside of the PDFs, in a merge-friendly text format.
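To illustrate what "merge-friendly" could mean here, below is a minimal sketch of a three-way merge over annotations kept in a sidecar file. Everything in it is an assumption made for illustration: the sidecar layout (a JSON object keyed by annotation ID), the field names `page`, `color` and `note`, and the helper names `merge_annotations` and `dump_sidecar` are hypothetical and not taken from any particular PDF reader.

```python
import json


def merge_annotations(base, ours, theirs):
    """Three-way merge of annotation dicts keyed by annotation ID.

    base   -- the last version both devices had in common
    ours   -- the version edited on this device
    theirs -- the version edited on the other device
    Returns (merged, conflicts), where conflicts lists the IDs that
    were changed differently on both sides.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:        # both sides agree (or both deleted it)
            result = o
        elif o == b:      # only the other device changed it
            result = t
        elif t == b:      # only this device changed it
            result = o
        else:             # changed differently on both sides
            conflicts.append(key)
            result = o    # keep ours, but report the conflict
        if result is not None:
            merged[key] = result
    return merged, conflicts


def dump_sidecar(annotations):
    """Serialize with sorted keys so line-based diffs stay stable."""
    return json.dumps(annotations, indent=2, sort_keys=True) + "\n"


# Hypothetical example data: one shared annotation, one added per device.
base = {"a1": {"page": 3, "color": "yellow", "note": "main claim"}}
ours = dict(base, a2={"page": 5, "color": "green", "note": "method"})
theirs = dict(base, a3={"page": 7, "color": "yellow", "note": ""})

merged, conflicts = merge_annotations(base, ours, theirs)
print(dump_sidecar(merged))
```

With this kind of layout, annotations added on different devices simply end up side by side, and only an annotation edited differently on two devices needs manual attention. Because the file is plain text with a stable key order, it could also be kept in git or synced via a NextCloud folder, assuming the annotation tool can read and write such a file.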

gettalong
  • Even though the file format is binary, it follows the PDF standard, which also defines annotations. Based on that standard, diffing and merging should be possible if the PDF reader follows it. That last condition could be important: even though the standard I've found is from 2008, many PDF readers as recently as a few years ago still saved their custom markings and annotations to a separate file whose format was specific to the given program. – Horror Vacui Nov 15 '20 at 09:58
  • If you use a PDF reader application that saves the annotations in a separate, mergeable file, then yes, merging would be possible. However, if the annotations are saved within the PDF, merging is **not** possible. One reason for this is that the annotations may not be stored as plain text but in a so-called object stream, which is binary and not mergeable. – gettalong Nov 15 '20 at 22:52