
This is not exactly a programming question, but I think it fits here better than in the TeX group.

I want to use version control to keep track of changes to text files (which are used to create LaTeX output). (As I am not a programmer, I don't have much experience with version control systems yet.) I'd like to use Mercurial for that, and I'm working on Mac OS X 10.6.

The files are about job applications, so mostly 3 files for each company:

  • a letter of motivation
  • a CV
  • and one file with the diplomas, certificates, ...

I have several questions concerning practical things:

  1. I already have one directory containing many subdirectories (one for each company). Each subdirectory contains those 2 or 3 *.tex files as well as the auxiliary files and the resulting PDFs (and sometimes some other files with information about the company).
    If I want to add the already existing files to the new repository and create a revision from each one (there are about 15 different versions), how can I do that?
    Sure, the relations of "parent" and "child" will not be visible, but at least I can do a diff and see what changed and each one would have a revision number.
  2. Can I leave those files in the original directories and add them to the version control system, or do they have to be in a special place?
    (I'd like to add other files to those directories, which will not be added to the version control and I wonder
  3. Can I give a "name" to a revision (e.g. the company name) to find it more easily afterwards?
  4. What would be the best workflow for creating new revisions?
    I'd choose an existing revision from the repository, export it to a new folder for the new company, change the tex files and then commit it back to the repo?!
MostlyHarmless

3 Answers


I don't have anything to say about using Hg, but I thought I'd share some annoying issues I had with using Git for my LaTeX files (I presume Hg will behave the same).

Version control systems were probably originally designed for versioning code, where you typically have one statement per line. However, with LaTeX and other text documents, it is natural to write a full paragraph of text without breaking it into one line per sentence. Since diff compares files line by line, any change to a word within the paragraph touches the one long line (or, with hard wrapping, shifts the words on all the following lines), and the diff shows the entire paragraph as having changed. It gets annoying when you have a lot of text, make a revision, and then everything is highlighted! Here's a small example:

\documentclass{article}
\begin{document}
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque blandit lacus aliquet eros tempus non tristique nisl consectetur. Sed orci odio, viverra quis rutrum eu, eleifend eget risus. Nam elementum tempus auctor. Nunc tincidunt dui et mauris varius faucibus ultrices nulla iaculis. 
\end{document}

After making an initial commit and making a small change, here's the output of diff:

[screenshot: diff output with the whole paragraph marked as changed]

I can't tell what change I made! A workaround is to pass the optional --color-words flag to git diff, which highlights only the words that have changed. I usually set my diff to use this option by default. You can perhaps find out whether Mercurial has something similar.

[screenshot: git diff --color-words output with only the changed words highlighted]

Although git records the entire paragraph as being changed, it only highlights the words that have been changed, which is good enough for me.
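
One way to make the word-level view easy to reach, sketched here as an assumption about the setup rather than the exact configuration used for the screenshots above, is a Git alias in ~/.gitconfig. The alias name wdiff is arbitrary.

```ini
# ~/.gitconfig
# "wdiff" is a made-up alias name; pick whatever you like.
[alias]
    wdiff = diff --color-words
```

With this in place, `git wdiff` behaves like `git diff --color-words`.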


An alternative solution requires a small change in how you write your LaTeX files. Consider this example, modified from the one above.

\documentclass{article}
\begin{document}
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Pellentesque blandit lacus aliquet eros tempus non tristique nisl consectetur.
Sed orci odio, viverra quis rutrum eu, eleifend eget risus.
Nam elementum tempus auctor.
Nunc tincidunt dui et mauris varius faucibus ultrices nulla iaculis. 
\end{document}

Here each sentence gets its own line. If you compile both LaTeX examples, there will be no difference in the output. This is because LaTeX treats a single line break within a paragraph as an ordinary space. Now when you make a change within a line and diff it, Git will highlight only that line and not the entire paragraph. This is something I've slowly begun to do, although at first it was annoying not to be able to read a paragraph continuously.
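
The effect can be checked without any version control at all. The following is a minimal sketch using plain diff on two throwaway files (the file names and the replaced word are made up for the illustration): with one sentence per line, a one-word edit shows up as a single changed line.

```shell
#!/bin/sh
# Build two versions of the same paragraph, one sentence per line,
# and count how many lines a plain diff reports as changed.
tmp=$(mktemp -d)

printf '%s\n' \
  'Lorem ipsum dolor sit amet, consectetur adipiscing elit.' \
  'Nam elementum tempus auctor.' \
  'Nunc tincidunt dui et mauris varius faucibus.' > "$tmp/old.tex"

# Same text with a single word replaced in the second sentence.
sed 's/tempus/semper/' "$tmp/old.tex" > "$tmp/new.tex"

# Lines starting with ">" are the new versions of changed lines.
changed=$(diff "$tmp/old.tex" "$tmp/new.tex" | grep -c '^>')
echo "lines reported as changed: $changed"

rm -r "$tmp"
```

If the three sentences were written as one long line instead, the same edit would make diff report the whole paragraph.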

abcd
  • Thanks for sharing your experience! I recently read a blog post http://martinralbrecht.wordpress.com/2010/03/15/mercurial-latex-and-my-thesis/ about the different diff methods and that a wdiff might make more sense than a normal diff for such text documents. (I'll see if I can really make it work.) For long documents like my Ph.D. thesis I also put a line break after each sentence, which makes reading the source code easier IMHO (and helps for the diffs). However, in the relatively short documents described here, I have not done that yet. – MostlyHarmless May 27 '11 at 19:37

First, I would recommend two free online sources about Mercurial: hginit and the book Mercurial: The Definitive Guide.

Now, to your questions.

I'll start with the third. Yes, it is possible to attach names to revisions; they are called tags.

To commit your existing versions into a linear history in one repository, do the following:

mkdir myNewRepo
cd myNewRepo
hg init

Now you have a new repository. For every company directory repeat the following steps in order to copy your files into the repository, make them known to hg, commit them and tag that new revision with a name.

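# copy one company's files into the repository directory
cp ../oldVersionA/* .
# tell Mercurial to track them
hg add letter.tex resume.tex diploma.pdf
hg commit -m "Job application to A Inc."
# attach a name to the revision just committed
hg tag companyA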

Note that you only have to add each file path to a repository once.

jmg
  • Thank you very much - this is indeed helpful. Last night I already downloaded the GUI MacHg and played around a bit, but I am a little confused: From your answer I understand that I could add files from different directories (= companies) to the repository. Would Mercurial know (if the filenames were identical) that it should compare those files with the same name in a diff? Or could I (with that approach) only track the changes in every subdirectory? – MostlyHarmless May 27 '11 at 07:54
  • @Martin: No, if you add identically named files in different directories, then there is no connection between them for Mercurial. Note that I copied the files into one directory above. – jmg May 27 '11 at 08:03
  • ah, ok - so that's what I already thought I'd have to do. I'll create one directory, put the files in, commit, overwrite them with another version (same file names) and commit again... Is there a way to keep the different versions afterwards in different subdirectories? After having added all existing versions, how would I create a new one? How can I work on different versions (companies) at the same time? – MostlyHarmless May 27 '11 at 08:24

My favorite Mercurial tutorial is the one by Joel Spolsky.

You can try using GUI tools for Mac OS, such as Murky or MacHg.

These will help you to get started.

Since all you want is to track a linear history of your modifications, working with Mercurial in the console can be cumbersome for you. GUI tools usually hide arcane options and show only a simpler subset of the available operations.

UPDATE: Mercurial also has a repository and a working directory. The difference is that Mercurial's repository is stored locally, all of it. There is no 'central server' part. All the history, tags and branches are stored in the .hg directory in your project. The working directory is just a snapshot of a given revision. For example, say your repository history has 99 commits. The working directory can be updated to correspond to, say, commit 50 or commit 98. So you 'fast-forward' or 'rewind' the working directory by specifying the commit number (or hash).

Valentin V
  • @Valentin Vasilyev: Thanks for your hint. I already installed MacHG and tried it out a bit. What was confusing for me: From SVN I know that there is a repository and I can create working copies. In my case, I was not sure how to handle the different versions: I'll edit the original question above.... – MostlyHarmless May 27 '11 at 07:51
  • Thanks for your update! What I don't understand yet: Can I have different working copies locally at the same time? (Which I would need to work on several applications in parallel) – MostlyHarmless May 27 '11 at 08:26
  • Can you really, truly work in parallel? If so, I envy you. Usually the workflow is a series of 'switches' between branches or revisions. You do a small chunk of work, fix it by committing, then update to another branch and work there. – Valentin V May 27 '11 at 08:29
  • with "parallel" I mean: for example there are 3 companies I currently want to send job applications to. I'll create a draft for a letter of motivation for each one, maybe let someone else have a look at it, and later create the final version (and pdf) and send it to the company. I would not commit every small change, maybe a draft and a final version. For me it is important to see the final versions and to be able to compare them with each other to find the subtle differences. – MostlyHarmless May 27 '11 at 08:54
  • well, after re-reading your comment, I wonder: should I create branches for the different companies? – MostlyHarmless May 27 '11 at 08:55
  • @Martin: Yes, that would be a good way to do it. Each company gets its own branch (and you fork that branch if you're applying for more than one position at the same company and need slightly differing resumes). That way you can keep the changes for each independent of the others, and merge the "global" changes (such as a new certification/degree) into them from, say, the master. – abcd May 27 '11 at 14:48