Literate programming

Question

Literate programming is a way of developing software where documentation comes first, then the code. One writes the documentation for a code snippet, and then writes its implementation. The source code would look like a plain document, such as a Word document, with code paragraphs in it.

I am trying to convert the dev shop I work at to use only literate programming, as it brings great advantages for code readability and maintenance. However, due to the lack of tools, LP usage in the company is limited. For example, the ideal way to program in a literate style is to write a paragraph using Word-like markup, and then insert a subparagraph with the implementation. But I cannot seem to find any good tools for VS200x to do LP with.

Ideally, such a tool would look just like Word 2007, but integrated into the IDE. When the coder puts the cursor on a code paragraph, that paragraph would offer all the functionality we have now in our IDE.

What are good tools for LP, with .NET and VS200x in particular?

In my experience, too many coders are too illiterate for this approach to work. — MusiGenesis, Oct 20 '08 at 17:47
How much time do you factor in to do LP... the wiki doesn't have a list of projects that actually did LP and came out on time. seems like one more thing from Knuth that most mortals won't come near to. — Gishu, Oct 21 '08 at 05:51
It's really a way of writing a spec - except typically more useful! — kyoryu, Feb 24 '10 at 07:48
have you considered using macros and a standard library of short functions (which can be done in VS) - Also, how did you end up going with the literal programming conversion — acutesoftware, May 07 '13 at 13:22

score 18 · Answer 1 · answered Oct 20 '08 at 17:35

Kudos to you for trying to improve the way your team works. As long as you're trying to do that, you have an advantage over those that do not.

I used Literate Programming for a project once. It was really hard, and the results were really good. Seemed like a reasonable tradeoff.

However, today I'd rather take a different approach: instead of prose for humans and code for machines, I'd rather write code that is so clear that humans don't mind reading it. When I feel the urge to write a comment, I think "I could make this code clearer". That means I'm writing less documentation, not more.

Well, good luck with whatever path you choose.

Jay Bazuzi

I find that one practical and dirty way to do it is to extract things into a separate function. If you want to write `// First, get user input`, replace the whole thing with `GetUserInput()`. If you want to write `// This is how it works...` inside a function body, extract it to `MagicalMathFormula()` with a per-function comment block for explanation (which gets extracted nicely with doxygen) – kizzx2 Aug 21 '10 at 04:31
Yes, most statement- and expression-level comments can be transformed into function names, with Extract Method. Now you have too many functions! Find data and functions that are related, and Extract Class. Pretty soon you'll be doing OOP! – Jay Bazuzi Aug 21 '10 at 16:22
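
The Extract Method idea from the two comments above can be sketched roughly as follows. This is only an illustration in Python rather than the .NET code the question is about, and the names `parse_numbers`, `mean` and `report` are hypothetical, chosen for the example. Each function name replaces what would otherwise be a statement-level comment, and the docstring takes over the role of the per-function comment block.

```python
"""Sketch: replacing statement-level comments with named functions."""


def parse_numbers(raw: str) -> list:
    """Turn a comma-separated string such as "1, 2, 3" into floats."""
    # Empty fields (for example "1, ,2") are skipped rather than rejected.
    return [float(part) for part in raw.split(",") if part.strip()]


def mean(values: list) -> float:
    """Arithmetic mean; the 'this is how it works' note lives here."""
    return sum(values) / len(values)


def report(raw: str) -> str:
    """Summarise the input in one line of text."""
    # Before the refactoring this body would have held two commented blocks,
    # "# First, get user input" and "# This is how it works...".
    numbers = parse_numbers(raw)
    return f"mean of {len(numbers)} values: {mean(numbers):.2f}"


if __name__ == "__main__":
    print(report("1, 2, 3, 6"))  # prints "mean of 4 values: 3.00"
```

The body of `report` now reads close to the comments it replaced, which is the effect both commenters describe.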

score 7 · Answer 2 · edited Dec 12 '18 at 12:32

I can only suggest you mark up your code with doxygen comments, so that you can generate the documentation from your code. I know this is almost the reverse of what you want, but at least you end up with the desired result: code and documentation that come from the same source files. The obvious advantage is that you keep using your existing IDE for coding, with all the usual code-friendly goodies.

If you're trying to convert your dev team, this approach might be easier for them to swallow than a full-blown literate methodology: the coding stays the same, but they have to write better documentation embedded in the code.

That's the best I can suggest; see what your team thinks of the idea.

albert

answered Oct 20 '08 at 17:27

gbjbaanb

Doxygen is fantastic at giving you insight into complex, inter-dependent code. – endian Oct 20 '08 at 18:36
// , It's not so great at removing the *need* for non-intuitive insight into complex, inter-dependent code. That's where Literate Programming comes in. – Nathan Basanese Jun 10 '15 at 07:12

score 4 · Answer 3 · answered Oct 20 '08 at 18:23

+1 for trying to improve your team's process

-1 for going down a dead-end path

with all due respect to Knuth, unit tests are better than documentation

- unit tests cannot become out of date
- polluting the code with prose is a huge distraction when debugging
- if your code really requires that much exposition, it is probably poorly designed and buggy

Steven A. Lowe

Heh. As for "unit tests cannot become out of date", I've just spent all morning trying to get my old unit tests to link against a contract work project that someone else modified without checking against my tests. – Sol Oct 20 '08 at 19:23
@[colomon.livejournal.com]: double heh - if you had documentation instead of unit tests, you could just ignore it...and then later you'd REALLY be screwed ;-) – Steven A. Lowe Oct 20 '08 at 19:55
Seems like an awful lot of discipline is needed.. first of all to do LP. and then to maintain LP snippets with every change to code.. – Gishu Oct 21 '08 at 05:52
-1 for your three wrong bullets: unit tests actually can become out of date; prose helps you understand what's going on, even in a debugging session; exposing the ideas behind some code to humans (not to be confused with code scoping) does tend to make design better. – ngn Nov 05 '08 at 06:15
@[ngn.myopenid.com]: thank you for explaining your downvote; this is how people learn. I would expect that unit tests that are out of date would FAIL and thus draw attention to themselves... The rest of your points may to some degree depend on the writer and reader; this looks like overkill to me – Steven A. Lowe Nov 05 '08 at 17:24
Exactly, those which get out of date fail. Btw, you correctly mentioned the link between docs and unit testing (oh, I just love Python's ``doctest'' module). The rest: that's what literate programming is about. Honestly, I sometimes only read the comments when I inherit a codebase. Docs matter. – ngn Nov 05 '08 at 18:22
@[ngn.myopenid.com]: docs are immensely useful when they are correct. This is unfortunately rare in a living system. I prefer working Unit tests to doc any day. Sadly, most systems have neither! – Steven A. Lowe Nov 18 '08 at 15:59
+1 because you are right; also, good link from the other LP question. – NotMe Nov 18 '08 at 16:08
-1 LP might not be viable, but [unit tests aren't always good](http://stackoverflow.com/questions/856115), and [documentation is often useful](http://stackoverflow.com/questions/400382/how-does-a-good-developer-keep-from-creating-code-with-a-high-bus-hit-factor/400436#400436). – ChrisW Sep 08 '10 at 23:09
@ChrisW LOL - links to your own questions and answers don't negate the arguments, though they do add to the discussion. Thanks for sharing! – Steven A. Lowe Sep 09 '10 at 16:05
Literate programming ≠ excessive documentation, unlike what many seem to think. And I don't know why you're comparing it with unit tests. It doesn't preclude unit tests; you can have both. The point of literate programming ("weave" and "tangle") is being able to write code in an order that is suitable for exposition rather than the order the compiler wants; LP can be useful even if you write very little documentation. It seems that the usual ways of debugging get quite annoying with LP, but most users of LP seem to report good results, so it must be making up in some other way… – ShreevatsaR Mar 22 '11 at 12:15
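
To make the "tangle" half of that comment concrete, here is a minimal sketch in Python. It borrows the noweb-style markers `<<name>>=` (open a named chunk), `@` (close it) and `<<name>>` (refer to a chunk), but the parser itself is a simplified assumption written for this example, not the real noweb, WEB or CWEB tool. The function names `collect_chunks` and `tangle` and the sample `DOCUMENT` are hypothetical, and the sketch assumes chunks do not refer to each other in a cycle.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+)>>=\s*$")      # line that opens a named chunk
CHUNK_REF = re.compile(r"^(\s*)<<(.+)>>\s*$")  # line that refers to a chunk


def collect_chunks(source: str) -> dict:
    """Gather code chunks by name; everything outside a chunk is prose."""
    chunks = {}
    current = None
    for line in source.splitlines():
        opener = CHUNK_DEF.match(line)
        if opener:
            current = opener.group(1)
            chunks.setdefault(current, [])
        elif line.strip() == "@":
            current = None  # "@" closes the chunk and prose resumes
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks: dict, name: str, indent: str = "") -> list:
    """Expand chunk `name` recursively into compiler order."""
    lines = []
    for line in chunks[name]:
        ref = CHUNK_REF.match(line)
        if ref:
            # A reference is replaced by the referenced chunk's lines,
            # keeping the indentation of the referring line.
            lines.extend(tangle(chunks, ref.group(2), indent + ref.group(1)))
        else:
            lines.append(indent + line)
    return lines


DOCUMENT = """\
The program prints a total. The interesting part is the summation,
so it is explained first, although the compiler needs it last.

<<sum the values>>=
total = sum(values)
@

Only now do we show the boring frame around it.

<<main>>=
values = [1, 2, 3]
<<sum the values>>
print(total)
@
"""

if __name__ == "__main__":
    print("\n".join(tangle(collect_chunks(DOCUMENT), "main")))
```

The prose introduces the summation before the frame that uses it, while the tangled output puts the lines back in the order the interpreter needs. A matching "weave" step would typeset the same file, prose included, as a document.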

score 2 · Answer 4 · answered Oct 20 '08 at 17:30

The only non-esoteric language I know of which actually has support for LP is Haskell, and to be honest, I haven't heard much demand for LP in modern programming languages. Most people seem to be satisfied with using inline documentation formats (javadoc, rdoc, etc.).

JesperE

Correction: most programmers aren't very bright programmers. :-) – JesperE Nov 05 '09 at 09:13
More precisely, half of the programmers in the world are below the average – Lie Ryan Feb 23 '11 at 09:15
Sometimes it seems that it is more than that. :) – JesperE Feb 23 '11 at 10:00
Haskell does not allow full LP; it only offers a special syntax that lets the same document be both valid LaTeX and valid Haskell. – Thorbjørn Ravn Andersen Jan 31 '12 at 22:33
@LieRyan: Even _more_ precisely, half of the programmers in the world are below the _average_ only if the distribution of programmer quality is symmetric. However, by definition, half of the programmers are below the _median_ quality which is not to be confused with the mean. Unfortunately mean == median for the Gaussian distribution which most people think of when discussing distributions. :-) – András Aszódi Jan 07 '14 at 14:05
// , Thank you, @user465139. You took the words right out of my `Programmer Dvorak` keyboard. Glad to see they're still teaching logic in schools, these days. – Nathan Basanese Jun 10 '15 at 07:28

score 1 · Answer 5 · answered Oct 20 '08 at 17:32

My apologies. I should have mentioned that we are already using Doxygen with an automated doc build script. We use the .NET XML doc tags where possible, and where they fall short we mix in doxygen tags. This works quite well. The point is that productivity drops considerably when writing documentation: we (humans) are very bad at producing documentation without a WYSIWYG editor, not to mention error-prone.

The team is currently shifting its mindset from coding straight ahead to writing documentation first, then code. This is the most important step, as it lets the coders embrace the LP paradigm.

There is a market here for a VS plugin that does this, I guess.

Also, Doxygen does seem to be a nice tool for actively applying the LP method to this problem, though it is very limited in use.

score 1 · Answer 6 · answered Oct 20 '08 at 17:40

> However, today I'd rather take a different approach: instead of prose for humans and code for machines, I'd rather write code that is so clear that humans don't mind reading it. When I feel the urge to write a comment, I think "I could make this code clearer". That means I'm writing less documentation, not more.

That's what we do as well. But for a lot of the code we produce, writing clear, human-readable code just isn't enough. What if you want to explain an image rendering function? It is better to explain it with an image than to write half a page describing it.

user29688

You should write a technical paper with images and TeX formulas explaining how it works, then put a pointer to it in the comments. Most of the time the pointer may not even be needed if your function is correctly named: for a function called `PeterJohnMaryTransform()` you just need a doc page with that name, and the user would look it up himself. – kizzx2 Aug 21 '10 at 04:43
// , I like that you seem to view programming as the device and implementation of a correct conceptual framework, rather than "things I tell the computer to do". Both are technically correct, but one includes a wider phenomenological horizon, @user29688. Upvote granted. – Nathan Basanese Jun 10 '15 at 07:32

score 1 · Answer 7 · answered Nov 18 '08 at 16:20

I'm not aware of any modern tooling for Literate Programming. I did some WEB programming 15 years ago.

Doxygen is a nice tool, but doesn't help at all with LP. The problem is that LP focuses on writing code for humans to read. There is no good support for successive refinement/disclosure. LP needs a view on the source code that has a different structure than the file-class-attribute/method in VS. NSpec might be somewhat better, but also is too much bottom-up.

user5595141 · Answer 8 · 2015-11-24T10:10:55.560

The main idea of literate programming is to write programs as mathematical texts. One defines every concept needed in the program as clearly as possible, then explains how it is implemented in the language, why it was done this way and not another, and what is going to be changed later.

Changes can also be documented, by commenting out the piece of code to be changed and inserting the new one together with the reason for the change. Some changes are transformations of the code to optimize its performance, for example using one loop instead of two in some C-like language, or replacing an expression with a simpler one. Others are more complex, such as switching to a different data structure to represent the information. Every change is well justified and documented. One can learn about the problem domain of the program, and understand it in depth, just by reading the source code, which avoids mistakes due to ambiguities. The genesis of the program is completely documented, and one can recall everything later, because every thought is in the program.

Strictly speaking, one can write literate programs in plain text as the program is developed, but typesetting them in TeX/LaTeX is the most aesthetic, functional and easiest way, because it is not difficult to place LaTeX markup within most programming languages.

It is natural to write literate programs in Haskell, because a Haskell script contains a set of declarations, not instructions. You can place the declarations in any order. That is different in other languages, where it is important to order the instructions in a particular way.

I have not used WEB, CWEB or similar programs, but they help to typeset a program in an order that is logical for a human, while the program modules can still be generated in the form needed for compilation.

There is a LaTeX package called listings which is easy to use. You can start every piece of code by closing the comment and end it by opening a new comment. As far as I remember, it looks something like this:

% /* begin of literate program
\documentclass{article}
\usepackage{listings}

% listings definitions go here (I do not remember the syntax): declare the
% two code markers used below as keywords to be replaced by spaces.

% more definitions here

\begin{document}
Your explanation including formulas like $s=c\times\sum_{i=0}^{i=N} x_i$ etc.
\begin{lstlisting}
startcode*/

s=0;
for(i=0;i<=N;i++) s=s+x[i];
s=c*s;

etc..

/*endofcode
\end{lstlisting}

More explanation ...
\end{document}
% end of literate program */

In the preamble of the text you can define startcode*/ and /*endofcode as keywords to be replaced by spaces, in the extra definitions for the listings package. See the package documentation.

At the end of the LaTeX source simply type:

% end of literate program */

which is a comment in LaTeX. At the beginning you can place the opposite:

% /* start of program

Remove the % LaTeX comment sign when you want to compile the program, and put it back when compiling with LaTeX.

If you have never used LaTeX before, you can start with plain text first, maybe combining it with Doxygen to index everything. Doxygen is not needed with LaTeX, because LaTeX is a typesetting system in which you can create several indexes and hyperlinks and structure the documentation as a book.

Haskell programs are usually written in literate style. Maybe it is a good idea to browse some book or article to see one.

score 0 · Answer 9 · answered Feb 24 '10 at 07:34

Hello source novel authors,

Someone mentioned Doxygen here. Although it does not allow real Literate Programming (for example, it cannot give a reordered view of the sources), LP advocates themselves seem to recognize it as a valuable tool in this area: it is mentioned right at the top of this reference page about LP tools: Literate Programming Tools

score 0 · Answer 10 · answered Apr 20 '23 at 09:43

You can use Fundoc for it. It was created on top of the idea of Literate Programming.

Daynin
