75

I am in charge of 100+ documents (Word documents, not source code) that need revision by different people in my department. Currently all the documents are in a shared folder, from which people retrieve them, revise them, and save them back.

What I am doing now is looking up the "date modified" in the shared folder, opening the recently modified documents, and using the "Track Changes" function in MS Word to apply the changes. I find this a bit tedious.
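The "date modified" check described above could be scripted. Here is a minimal sketch, assuming the documents are .doc/.docx files and that "recent" means the last seven days; the function name and both of those choices are made up for illustration.

```python
import sys
import time
from pathlib import Path

WORD_EXTENSIONS = {".doc", ".docx"}  # assumed; extend as needed


def recently_modified(folder, days=7, now=None):
    """Return Word documents under `folder` modified in the last `days` days, newest first."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    hits = []
    for path in Path(folder).rglob("*"):
        if path.is_file() and path.suffix.lower() in WORD_EXTENSIONS:
            mtime = path.stat().st_mtime
            if mtime >= cutoff:
                hits.append((mtime, path))
    # Sort on the timestamp only, newest first.
    return [p for _, p in sorted(hits, key=lambda item: item[0], reverse=True)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python recent_docs.py <shared-folder>
    for doc in recently_modified(sys.argv[1]):
        print(doc)
```

This only automates the lookup; reviewing the changes in Word is still a manual step.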

So would it be better and easier if I committed these documents to a version control database?

Basically, I want to keep different versions of a file.


What I have learned from the answers:
  • Use Time Machine to save different versions (or Shadow Copy in Vista)

  • There is a difference between text and binary documents when you use a version control app. (I didn't know that.)

  • Diff won't work on binary files

  • A notification system (e.g. email) for revisions is great

  • Google Docs revision feature.

Update :

I played around with the Google Docs revision feature and feel that it is almost right for me. I am just a bit annoyed by the too-frequent versioning (autosaving).

But what feels right for me may not feel right for my department. Will they be okay with storing all these documents with Google?

qwertyuu
  • 1,055
  • 1
  • 10
  • 14
  • Very good question ... we have hundreds of documents lying in network shares. I want to move my organization to Subversion for document storage. – lb. May 18 '10 at 00:01
  • MagnetSVN is a Subversion client for Microsoft Office 2007-2013 http://magnetsvn.com – Eugenek Nov 10 '13 at 12:54

20 Answers

60

I've worked with Word documents in SVN. With TortoiseSVN, you can easily diff Word documents (between working copy and repository, or between two repository revisions). It's really slick and definitely recommended.

The other thing to do if you're using Word documents in SVN is to add the svn:needs-lock property to the Word documents. This will prevent two people from trying to edit the same document at the same time, since unfortunately there's no good way to merge Word documents.
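For illustration, here is a minimal sketch of setting that property in bulk. It assumes the `svn` command-line client is on the PATH, that the documents are already under version control, and that .doc/.docx are the extensions you care about; the function names are made up for this example.

```python
import subprocess
from pathlib import Path

WORD_EXTENSIONS = {".doc", ".docx"}  # assumed set of extensions


def needs_lock_command(path):
    """Build the svn command that marks `path` as requiring a lock before editing."""
    return ["svn", "propset", "svn:needs-lock", "*", str(path)]


def mark_word_documents(working_copy, run=subprocess.run):
    """Set svn:needs-lock on every Word document found under `working_copy`.

    `run` is injectable so the commands can be inspected without calling svn.
    """
    commands = []
    for path in sorted(Path(working_copy).rglob("*")):
        if ".svn" in path.parts:
            continue  # skip Subversion's own metadata directory
        if path.is_file() and path.suffix.lower() in WORD_EXTENSIONS:
            cmd = needs_lock_command(path)
            run(cmd, check=True)
            commands.append(cmd)
    return commands
```

The property change still has to be committed before other users' working copies pick it up.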

With the above two things, handling revision controlled Word documents is at least tolerable. It certainly beats the alternative of using a shared folder and track-changes.

Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
  • 5
    Can you really diff the *content* of Word documents using TortoiseSVN? Not just "binary files differ" kind of diff. (SVN itself certainly doesn't provide more than that.) – Jonik Jul 29 '09 at 10:55
  • 16
    Yes, Tortoise has some VBScript helper scripts that load both old and new documents into Word and use Word's document diff features to show the differences. It works pretty well, actually. – Greg Hewgill Jul 29 '09 at 10:57
  • 1
    It sounds handy; thanks for clarifying. (This came up on a Super User question: http://superuser.com/questions/14894/diff-software-for-word-files) – Jonik Jul 29 '09 at 12:10
  • This seems to work great for .doc and .docx files, but doesn't work at all for .dot or .dotx files (Word template files). For those files, TortoiseSVN just says they are not valid text files so it can't diff them. I tried manually saving the old & new versions and comparing them using Word (Review->Compare->Compare Documents), and Word does do automatic comparisons of .dot files, so this must just be an oversight in TortoiseSVN (at least in version 1.8.8). Know of any way to add .dot and .dotx to the list of extensions TortoiseSVN will do its VBScript magic on in order to diff them in Word? – phonetagger Jan 31 '15 at 16:31
  • I tried editing the top line of C:\Program Files\TortoiseSVN\Diff-Scripts\diff-doc.js to include .dot and .dotx extensions, and that didn't solve the problem. Thinking I might have to reboot for it to take effect, I tried that, and it still didn't work; same issue: TortoiseSVN tries to do its own diff, and complains that they're not valid text files. – phonetagger Jan 31 '15 at 17:28
  • @GregHewgill does it work on .xls too? – Mikhail_Sam Aug 14 '17 at 07:04
38

What on Earth are you all Word-is-binary-so-no-diff people talking about? TortoiseSVN, for example, integrates right out of the box with Word and enables you to use Word's built-in diff and merge functionality. It works just fine.

I have worked on projects that store documents in version control. It has worked out pretty well, although if people are unfamiliar with version control, they are probably going to have conceptual difficulties with things like "working copy" and "merge" and "conflict". Don't overestimate the users' capabilities when you plan your document management system.

I believe there exist big and powerful commercial solutions for all of this, as well. I'm sure if you have enough kilodollars, you can get something that fits your needs perfectly. Document management systems are a big business for big enterprise.

Sander
  • 25,685
  • 3
  • 53
  • 85
  • 3
    +1 I did not know it was possible, but you're definitely right. TortoiseSVN can diff and merge Word documents using Word's functionality. – Mathieu Pagé Jun 23 '09 at 18:38
19

One thing that nobody seems to have asked is whether you have a legal requirement to store the history of changes to the docs.

Whether you do or don't is going to have an impact on what solutions you can consider.

A notification mechanism for out-of-date copies is also a bundle of fun. If engineer A has a copy of a document and engineer B then edits it and commits the changes, you want engineer A to be notified that his copy is out of date.

Document control can become a real can of worms quite easily.

Maybe keep the docs under CVS or SVN and set it up so that emails are sent to whoever has checked out a copy when updates to the same doc are checked in to the repository?
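As a rough sketch of the notification idea, a Subversion post-commit hook could look something like the following. Subversion does not record who holds a working copy, so this version simply mails a fixed list; the addresses, the local mail relay, and the function names are all assumptions for illustration.

```python
import subprocess
import sys
from email.message import EmailMessage

WATCHERS = ["docs-team@example.com"]  # hypothetical recipient list
SENDER = "svn@example.com"            # hypothetical sender address


def changed_documents(svnlook_output):
    """Extract the paths of Word documents from `svnlook changed` output."""
    paths = []
    for line in svnlook_output.splitlines():
        parts = line.split(None, 1)  # status flags first, then the path
        if len(parts) == 2 and parts[1].lower().endswith((".doc", ".docx")):
            paths.append(parts[1])
    return paths


def build_notice(revision, author, paths):
    """Compose the notification email for one commit."""
    msg = EmailMessage()
    msg["Subject"] = f"r{revision}: {len(paths)} document(s) updated by {author}"
    msg["From"] = SENDER
    msg["To"] = ", ".join(WATCHERS)
    msg.set_content("Your copy may be out of date:\n" + "\n".join(paths))
    return msg


def svnlook(subcommand, repos, revision):
    """Run an svnlook subcommand against a committed revision."""
    result = subprocess.run(["svnlook", subcommand, repos, "-r", revision],
                            check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__" and len(sys.argv) >= 3:
    import smtplib
    # Subversion passes the repository path and the revision number to the hook.
    repos, revision = sys.argv[1], sys.argv[2]
    docs = changed_documents(svnlook("changed", repos, revision))
    if docs:
        author = svnlook("author", repos, revision).strip()
        with smtplib.SMTP("localhost") as smtp:  # assumes a local mail relay
            smtp.send_message(build_notice(revision, author, docs))
```

Installed as an executable `hooks/post-commit` in the repository, it would run after every commit. Narrowing the recipients to people who actually have a copy would need extra bookkeeping that is not shown here.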

Edit: I forgot to add: don't forget to use the binary switch (e.g. -kb for CVS) when adding a new doc. Otherwise, any byte sequence in the file that happens to match a keyword string will have the relevant configuration-management data expanded into it, corrupting your document.

oɔɯǝɹ
  • 7,219
  • 7
  • 58
  • 69
Rob Wells
  • 36,220
  • 13
  • 81
  • 146
  • 2
    SVN doesn't perform keyword expansion by default - you need to set the property to enable it. As a result, you can safely store any document without setting anything special. – gbjbaanb May 24 '09 at 15:25
10

Thinking out of the box, would migrating to a Wiki be out of the question?

Since you consider it feasible to force your users into Subversion (or something similar), a larger change seems acceptable.

Another migration target could be to use some kind of structured XML document format (DocBook comes to mind). This would enable you to indeed use diffs and source control, while getting all sorts of document formats for free.

Henrik Paul
  • 66,919
  • 31
  • 85
  • 96
6

SharePoint also does a good (OK, decent) job of versioning MS-specific documents.

hometoast
  • 11,522
  • 5
  • 41
  • 58
  • 2
    This is quite an old answer (and question). Sharepoint 2010 is actually very good for versioning Word documents. – martin Aug 07 '12 at 09:33
  • As an additional note, SharePoint Foundation 2010 and 2013 are free, but require Windows Server. There are 'tricks' to get it to work on Windows 7 or 8, but I wouldn't trust a hack with my documents. A Windows Server Standard license will cost you about $500 plus the cost of a PC. – VoteCoffee Nov 11 '14 at 15:38
6

How about trying Git? It seems Git can diff Word .doc and OpenDocument .odf files if you configure it in the .gitattributes file.

Here is a reference; scroll down to "Diffing Binary Files".
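One way to give Git something to diff is a small text-extraction script used as a `textconv` filter. The sketch below assumes .docx files (a zip archive containing `word/document.xml`); the older binary .doc format would need a different converter. The driver name `docx` and the script name `docx2text.py` are hypothetical.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements inside word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path_or_file):
    """Return the body text of a .docx file, one paragraph per line."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    lines = []
    for paragraph in root.iter(W + "p"):
        # A paragraph's text is split across runs; join the pieces back together.
        runs = [node.text or "" for node in paragraph.iter(W + "t")]
        lines.append("".join(runs))
    return "\n".join(lines)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Git calls the textconv program with the file path as its argument
    # and diffs whatever is written to standard output.
    print(docx_to_text(sys.argv[1]))
```

To wire it up, add `*.docx diff=docx` to .gitattributes and run `git config diff.docx.textconv "python docx2text.py"`, adjusting the script path to wherever you keep it. Only body text is extracted, so formatting-only changes will not show up in the diff.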

Steven Magana-Zook
  • 2,751
  • 27
  • 41
Gautam
  • 7,868
  • 12
  • 64
  • 105
4

For what it's worth, there is also Google Docs. I guess it's not a perfect fit, but its versioning is very convenient.

grapefrukt
  • 27,016
  • 6
  • 49
  • 73
2

I use Mercurial with the TortoiseHg overlay. I can right-click a changeset, choose "Visual Diff", then choose the "docdiff" tool (it comes bundled), which opens the document in Word with Track Changes.

JohnZaj
  • 3,080
  • 5
  • 37
  • 51
2

ClearCase integrates with Word for revision tracking. I believe Telelogic DOORS does as well.

Paul Nathan
  • 39,638
  • 28
  • 112
  • 212
1

You could use something like the Revisionator, which is like Google Docs but with built-in revision control including diffs, forks, and three-way merges. http://revisionator.com

UPDATE: It also fixes the problem of too frequent autosaving that you mention with Google Docs. It'll still autosave to prevent data loss, but it will only create a new version in the revision history and share with other users when you explicitly "release" your changes.

jpalmucci
  • 31
  • 3
1

You can, but you will always compare the document versions with Word itself.

I haven't heard of a version control database that can track changes in Word documents.

However there are some tools which can compare Word documents, so if you set up your version control client to use these tools for comparison, you can have some fun.

Biri
  • 7,101
  • 7
  • 38
  • 52
1

Not necessarily. It depends on how often the new files are committed to the repo. If the files are edited several times before a commit, then you're precisely where you are now. The biggest benefit is if the file becomes corrupted.

You can version any file; this is how Time Machine in Mac OS X Leopard works, for example, and there is an interesting article by someone who committed his entire computing environment into CVS and then just maintained working copies on his home and work machines.

But "better" and "easier" are specific to your situation, and I'm not sure I completely understand your problem as things stand.

Polsonby
  • 22,825
  • 19
  • 59
  • 74
1

Subversion, CVS and all other source control systems are not good for Word documents and other office files (such as Excel spreadsheets), since the files themselves are stored in a binary format. That means that you can never go back and annotate (or blame, or whatever you want to call it), or do diffs between documents.

There are revision control systems for Word documents out there, but unfortunately I do not know of any good ones. We use such control systems for Excel at my work, and unfortunately they all cost money.

The good thing is that they make life a lot easier, especially if you ever have to do an audit or due diligence.

Mats Fredriksson
  • 19,783
  • 6
  • 37
  • 57
  • On a small scale, I save Office documents as XML and version them with SVN. Diffs work in this case. – mcgyver5 Jul 17 '14 at 14:55
1

WinMerge has added support for merging Word and Excel binary files.

Keith
  • 150,284
  • 78
  • 298
  • 434
  • But WinMerge uses Notepad as an editor, so it can only merge the content, not the formatting. Can I merge two versions of a .docx without using MS Office and still merge the formatting? – Hemant Metalia Jan 10 '12 at 06:52
1

Have a look at SharePoint. If cost is an issue, SharePoint portal services can also work for you. Read this for more info.

Rad
  • 8,336
  • 4
  • 46
  • 45
1

Just wanted to clarify an answer someone gave but I don't have enough points yet.

diff will work on binary files but it is only going to say something not really useful like "toto1 and toto2 binary files differ".
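To illustrate the point, here is a rough sketch of that behaviour. The NUL-byte check is a common heuristic and an assumption here, not a claim about how any particular diff implementation decides; the function names are made up.

```python
import difflib


def looks_binary(data, sample_size=8000):
    """Heuristic: treat data as binary if a NUL byte appears near the start."""
    return b"\x00" in data[:sample_size]


def describe_difference(name_a, data_a, name_b, data_b):
    """Return a line-by-line diff for text, or a one-line notice for binary data."""
    if data_a == data_b:
        return ""
    if looks_binary(data_a) or looks_binary(data_b):
        # Nothing more useful can be said without understanding the format.
        return f"Binary files {name_a} and {name_b} differ"
    lines_a = data_a.decode("utf-8", errors="replace").splitlines()
    lines_b = data_b.decode("utf-8", errors="replace").splitlines()
    return "\n".join(difflib.unified_diff(lines_a, lines_b,
                                          fromfile=name_a, tofile=name_b,
                                          lineterm=""))
```

A Word file read as bytes takes the binary branch, which is why a format-aware comparison (Word's own compare, or text extraction first) is needed to see what actually changed.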

Sagar Jain
  • 7,475
  • 12
  • 47
  • 83
Rob Wells
  • 36,220
  • 13
  • 81
  • 146
0

You could do that, but if the files are binary you should always lock them before editing. That way you won't get a conflict (which would be unresolvable).

rafek
  • 5,464
  • 13
  • 58
  • 72
0

YES, it's applicable! I totally agree that the SVN + TortoiseSVN combo is well suited to tracking MS Office documents. You can lock a document for editing, write-protect all unlocked files to avoid conflicts (i.e. parallel modifications), diff two versions of the same file, see the history of all modifications and, of course, roll back to an older revision.
I tried to describe all of those tips in a dedicated blog post. (Disclaimer: I'm the blog owner.)

All of this could even be accessible from the web with an SVN web client! (This might need some software development.)

But if you're not accustomed to version control systems from another context, this may not be the obvious choice. The work needed for good integration with documents gives dedicated tools an advantage: "electronic document management" systems are made just for that. A VCS like SVN may remain a good alternative for cost reasons :-)

Did you test the online service Simul? It looks promising, and I personally like the GitHub-like orientation. Note that I'm not affiliated with Simul!

0

Many of the new version control projects are better suited to entire directories, and not so much for single files.

Convincing someone that they need to get an entire project, when they only want to update an individual file can be a "fun" way to spend an afternoon.

Brad Bruce
  • 7,638
  • 3
  • 39
  • 60
0

Another option you have is a piece of software and cloud computing magic called Dropbox. Or, you could ditch the Word documents and set up a locally shared MediaWiki instead.

DropBox: getdropbox DOT com

MediaWiki: mediawiki DOT org