9

We use .docx and .odt for our "human-centric" documentation, but these formats are pretty much the worst you can do to a git repository.

Is there some git-friendly format that offers basic word-processor functionality and contains everything in one file?

Robby75
  • You could use Word XML, which is an easy conversion from docx; it's basically the docx structure, but with the wrapping done in XML rather than a zip file. – tolanj Jun 15 '16 at 16:05
  • Txt? Or maybe a pdf? – Tristan Jun 15 '16 at 16:06
  • But I would go with HTML and an HTML editor, to be honest. – tolanj Jun 15 '16 at 16:06
  • @Tristan T => how would pdf be better? – tolanj Jun 15 '16 at 16:06
  • 6
    Markdown is the most common one and very easy to maintain, just like on this site. – CodeWizard Jun 15 '16 at 16:10
  • HTML and Markdown cannot hold pictures, thus do not contain everything in one file. – Robby75 Jun 15 '16 at 17:10
  • Well, a plain text format with support for pictures is [LaTeX](http://tex.stackexchange.com/), which is certainly not a single file, but I don't consider that a problem in itself. Anyway, if you're used to word processors, this might not be an option. – kostix Jun 15 '16 at 18:34
  • 1
    Otherwise I'm with @tolanj for Word XML, but with a caveat: any *machine-generated* format is a bad fit for diffing. A small change in a document might result in swaths of XML gobbledygook changing in the file; that's OK for the parser that loads the file, but not for a human. And if you're not concerned with diffing, just store your word-processor documents "as is": they won't be stored very efficiently, but I doubt that will be a major problem unless your files are several MiB each and change often. – kostix Jun 15 '16 at 18:36
  • Definitely avoid WYSIWYG-only formats. Markdown fan here. AsciiDoc is nice, too. – Raffael Jun 15 '16 at 18:47
  • 1
    You can get git to understand some of docx/odt formats, see my answer here: http://stackoverflow.com/a/17106035/1615903 – 1615903 Jun 16 '16 at 03:57

4 Answers

5

There are many formats that are text friendly. For example:

  • Markdown, HTML, and XML, as already indicated in the comments. These files can't contain images on their own, but you can put in a reference to an image (for example in the same directory or in a resource subdirectory, such as ![GitHub Logo](/images/logo.png) with Markdown or <img src="images/logo.png"> in HTML). It's not as handy as copy/paste in a docx or odt, but it's git friendly, especially if the pictures don't change too often.
  • Rich Text Format (RTF) is supported by many word processing packages. It allows embedded pictures and is stored in a text friendly format (the binary pictures are embedded in a text encoding).
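
To illustrate the RTF point, here is a minimal sketch in Python of how a binary picture ends up as plain hex text inside an RTF file. The function name rtf_with_picture is made up for this example, the text is assumed to be ASCII, and real word processors write many more control words (picture dimensions, fonts, and so on), so treat it as an illustration rather than a complete writer.

```python
import binascii


def rtf_with_picture(text: str, png_bytes: bytes) -> str:
    """Build a minimal RTF document: one paragraph plus one PNG picture.

    Hypothetical helper for illustration; assumes `text` is plain ASCII.
    """
    # Escape the three characters that have special meaning in RTF.
    escaped = text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")

    # The picture bytes are stored as hexadecimal text, wrapped into short
    # lines so the file stays readable and diffable line by line.
    hex_data = binascii.hexlify(png_bytes).decode("ascii")
    hex_lines = [hex_data[i:i + 64] for i in range(0, len(hex_data), 64)]

    return (
        "{\\rtf1\\ansi\n"
        + escaped + "\\par\n"
        + "{\\pict\\pngblip\n"
        + "\n".join(hex_lines)
        + "\n}\n}\n"
    )
```

The result is pure text, so git can store and diff it like any other text file, although a changed picture still shows up as a large block of changed hex lines.
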
Christophe
1

Short: use *.odt (LibreOffice Writer) and ReZipDoc (GPLv3) (disclaimer: I maintain it)


Explanation:

Quite a few binary formats, among them docx (MS Word) and odt (LibreOffice Writer), are just ZIP files that contain text and binary files. Using git filters, you can re-zip these without compression, which makes them much easier for git to compress, because git can then see which parts are unchanged between revisions. This saves a lot of space in the git history. It also leaves the files in a fairly diff-friendly format, without an extra diff workflow. Most editing software has no problem opening these files in place of the compressed ones. The main downside: each person working on the repo has to install the filter(s).

The ReZipDoc (GPLv3) tool is made for this workflow; it contains a git filter.
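
The re-zipping step itself can be sketched in a few lines of Python. This is not ReZipDoc's code, just a rough illustration of the idea; the function name rezip_stored is made up, and wiring it into git as a filter is not shown.

```python
import zipfile


def rezip_stored(src_path: str, dst_path: str) -> None:
    """Copy every entry of a ZIP container (docx, odt, ...) into a new
    archive without compression. Hypothetical helper for illustration."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", compression=zipfile.ZIP_STORED) as dst:
        # Keep the original entry order (ODF expects "mimetype" first).
        for info in src.infolist():
            entry = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            entry.compress_type = zipfile.ZIP_STORED  # store, do not deflate
            entry.external_attr = info.external_attr
            # Reusing the original timestamp keeps the output reproducible,
            # so an unchanged document does not look modified to git.
            dst.writestr(entry, src.read(info))
```

The output is still a valid ZIP container, only larger on disk; the space saving happens later, when git compresses the stored content itself.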

The more and bigger binary files (like images) you use within your documents, and the less often they change, the more space you will save each time the text parts of your documents change, compared to not using a filter like this.

That said, both technically and personally, I would still recommend Markdown over such a solution. Nice GUI editors are already available for it.

hoijui
1

For documentation, AsciiDoc is far better suited than Markdown. It looks similar at first glance and shares the same basic idea, that the source mimics the final look, but it has all the rich-text features that Markdown only achieves through a large number of unofficial, incompatible language extensions. It is also widely supported, e.g. by GitHub and Bitbucket, and via add-ons by editors like Atom and VS Code.

Felix Dombek
1

I would suggest using .fodt (Flat XML OpenDocument Text). These files are saved as uncompressed XML text. Within the editing software, a .fodt file behaves exactly like an .odt file, and it can be opened and read in your favourite office software.
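
If you already have a pile of .odt files, converting them can be scripted. Below is a rough sketch in Python, assuming LibreOffice's soffice executable is on your PATH; the function names fodt_command and convert_to_fodt are made up for this example.

```python
import subprocess
from pathlib import Path
from typing import List


def fodt_command(document: str, outdir: str = ".") -> List[str]:
    """Build the LibreOffice command line for a headless conversion.

    Assumes `soffice` is on PATH (an assumption, adjust for your install).
    """
    return ["soffice", "--headless", "--convert-to", "fodt",
            "--outdir", outdir, document]


def convert_to_fodt(document: str, outdir: str = ".") -> Path:
    """Run the conversion and return the expected path of the .fodt file."""
    subprocess.run(fodt_command(document, outdir), check=True)
    return Path(outdir) / (Path(document).stem + ".fodt")
```

For example, convert_to_fodt("spec.odt", "flat") should leave flat/spec.fodt next to your sources, which you can then commit instead of the .odt.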

Saving as .odt compresses this XML significantly. Ironically, though, depending on the office software you use (I use LibreOffice), the .odt can still end up larger, because a thumbnail is generated and stored within the file. This thumbnail will usually make up the majority of the file's size.
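
You can check this for your own documents, since an .odt is a ZIP file. A small sketch in Python, assuming the thumbnail is stored as Thumbnails/thumbnail.png inside the archive (which is where LibreOffice puts it); the function name entry_sizes is made up for this example.

```python
import zipfile
from typing import List, Tuple


def entry_sizes(path: str) -> List[Tuple[str, int]]:
    """Return (entry name, compressed size in bytes) pairs for a ZIP-based
    document, largest entry first. Hypothetical helper for illustration."""
    with zipfile.ZipFile(path) as archive:
        sizes = [(info.filename, info.compress_size)
                 for info in archive.infolist()]
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)
```

If Thumbnails/thumbnail.png comes out at the top of the list for a short document, the thumbnail is what makes your .odt bigger than the flat XML.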

Diriector_Doc