25

Is there a way to convert all Linux man pages to either plain text, html or markdown?

I need to do this for every man file I have installed on my system.

KJS

9 Answers

36

Yes. To convert a single page, for example the man page for man itself:

zcat /usr/share/man/man1/man.1.gz  | groff -mandoc -Thtml

To convert every page installed on your machine, iterate over them. For a different output format (plain text, for example), choose a different output device with the -T option, such as -Tascii or -Tutf8.

In case the iteration itself was the real problem, you can use:
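As a rough sketch of switching devices, the helpers below wrap the same pipeline so that the device is a parameter. The function names (`cat_page`, `ext_for_device`, `convert_page`) and the device-to-extension mapping are assumptions made for illustration, not part of groff. The sketch also handles uncompressed pages, which some systems ship.

```shell
# Hypothetical helper: print a man page source file, decompressing it if needed.
cat_page() {
    case "$1" in
        *.gz) gzip -dc -- "$1" ;;
        *)    cat -- "$1" ;;
    esac
}

# Hypothetical helper: choose an output file extension for a groff -T device.
ext_for_device() {
    case "$1" in
        html)              echo html ;;
        ascii|utf8|latin1) echo txt ;;
        ps)                echo ps ;;
        pdf)               echo pdf ;;
        *)                 echo out ;;
    esac
}

# Convert one page with the chosen device; writes into the current directory.
convert_page() {
    page=$1
    device=$2
    base=$(basename "$page" .gz)
    cat_page "$page" | groff -mandoc "-T$device" > "$base.$(ext_for_device "$device")"
}
```

For example, `convert_page /usr/share/man/man1/man.1.gz utf8` would write `man.1.txt`, assuming groff is installed with that device.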

OUT_DIR=...

# Run from the top of the man tree (e.g. /usr/share/man); $OUT_DIR mirrors its layout.
find . -name '*.gz' | while IFS= read -r i; do
    mkdir -p "$OUT_DIR/$(dirname "$i")"
    zcat "$i" | groff -mandoc -Thtml > "$OUT_DIR/$i.html"
done
ishi
  • Thanks, I guess I could build a script to do it... I am looking for a sane way of converting all the man files, not just one. – KJS Nov 17 '12 at 19:35
  • 6
    For a given manpage $PAGE, this works, but sadly the HTML produced isn't very nice (inline CSS, no classnames, non-semantic). I'd like to have auto-linking to headlines, etc. Time to study manpage format myself... Quick shortcut to find and output a manpage: ```zcat $(man -w $PAGE) | groff -mandoc -Thtml``` – Luke H Apr 16 '14 at 13:29
  • Great tips; in case OSX users find this: use `gzcat` instead of `zcat` to decompress (most pages aren't actually compressed on OSX). Also, while Ubuntu (as of 14.04) does come with `groff`, the HTML output filter is not preinstalled, and it's not obvious how to install it (neither adding the `groff` nor the `groff-base` `apt-get` packages helps); there's an alternative `man2html` package, but note that its HTML output differs. – mklement0 Jul 19 '15 at 14:23
10

The command man -k '' lists every available man page name, which may be more convenient than running find and zcat over the raw man page files. In addition, man has an option -T, --troff-device[=DEVICE] that can generate HTML for a given section and page name. The following bash script combines the two to convert all man pages available on your Linux system into HTML files:

man -k '' | while IFS= read -r sLine; do
    # Each line looks like "name (section) - description".
    sName=$(echo "$sLine" | cut -d' ' -f1)
    sSection=$(echo "$sLine" | cut -d')' -f1 | cut -d'(' -f2)
    echo "converting ${sName}(${sSection}) to ${sName}.${sSection}.html ..."
    man -Thtml "${sSection}" "${sName}" > "${sName}.${sSection}.html"
done
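The name and section parsing in the loop above can also be done with shell parameter expansion instead of `cut`. The following is only a sketch: the function names are made up for illustration, and it assumes the man-db output layout `name (section) - description`, which other `man` implementations may not follow.

```shell
# Hypothetical helper: page name is the text before the first space.
apropos_name() {
    printf '%s\n' "${1%% *}"
}

# Hypothetical helper: section is the text between the first "(" and the next ")".
apropos_section() {
    rest=${1#*\(}
    printf '%s\n' "${rest%%\)*}"
}

# Convert every page listed by `man -k ''` into NAME.SECTION.html in the current directory.
convert_all() {
    man -k '' | while IFS= read -r line; do
        name=$(apropos_name "$line")
        section=$(apropos_section "$line")
        man -Thtml "$section" "$name" > "$name.$section.html"
    done
}
```

Calling `convert_all` from an empty directory keeps the generated files together. Parentheses inside the description do not affect the result, because only the first pair is used.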

On an intranet without Internet access, where online man page services are unavailable, a good option is to put these files on a static HTTP server such as Nginx with autoindex on, so that you can browse them and search with Ctrl+F.

vbem
  • 1
    There is a project that automates this job: https://github.com/vbem/man-to-github-pages – vbem Aug 08 '16 at 04:40
5

I recommend trying Pandoc:

$ pandoc --from man --to html < input.1 > output.html

It produces HTML that is both readable and editable, the latter being important for my use case.

It can also produce a lot of other formats such as Markdown, which is nice when you're not sure which format you want to commit to yet.

There is a comment on the question that says Pandoc cannot convert from man, but that seems to be out of date. The current version (2.13) does a decent job converting man to html for my example.

Furthermore, while the accepted answer suggests using groff -mandoc -Thtml, that did not do as good a job for me as Pandoc. Specifically, I want to convert the old Flex-2.5.5 man page to html. groff (version 1.22.4) unfortunately mangled all of the code examples (no indentation, no fixed-width font), making them difficult to read, while Pandoc brought them over as pre sections. Additionally, the groff output is full of explicit inline styles, while the Pandoc output uses no CSS at all, making it a better starting point for editing.

(There is an existing answer that also mentions Pandoc, and I considered editing my information into it, but I wanted to say more about my experience using it.)
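As a sketch of applying the Pandoc approach to every installed page and producing Markdown, the functions below walk a man directory and pipe each page through `pandoc --from man --to markdown`. The function names, the default source directory `/usr/share/man`, and the output directory `./man-md` are assumptions for illustration only.

```shell
# Hypothetical helper: derive the Markdown file name for a man page file.
md_name() {
    base=$(basename "$1")
    base=${base%.gz}
    printf '%s.md\n' "$base"
}

# Convert every page under a man directory into Markdown files in one flat directory.
pages_to_markdown() {
    src=${1:-/usr/share/man}   # assumed default location of the man tree
    out=${2:-./man-md}         # assumed output directory
    mkdir -p "$out"
    find "$src" -type f -name '*.[0-9]*' | while IFS= read -r page; do
        # Decompress gzipped pages; pass plain ones through unchanged.
        case "$page" in
            *.gz) gzip -dc -- "$page" ;;
            *)    cat -- "$page" ;;
        esac | pandoc --from man --to markdown > "$out/$(md_name "$page")"
    done
}
```

Because the output directory is flat, pages with the same file name in different subdirectories (for example translated pages) would overwrite each other. Pointing `src` at a single section directory such as `/usr/share/man/man1` avoids that.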

Scott McPeak
2
man -Hfirefox ls

opens the manpage of "ls" directly in firefox

uhelp
  • Hi @crobicha. I do not know the minimum `man` version that supports this option, but my `man-2.7.5` has the option `-H, --html[=BROWSER] use elinks or BROWSER to display HTML output`. Please, uhelp, improve your answer: give the minimum `man` version and an excerpt from the `man` man page. Also note that this does not answer the OP's question of converting **ALL** local man pages to HTML or Markdown. Cheers – oHo Jul 07 '17 at 19:03
  • There is a shorter command line: `man -H ls` but the environment variable `BROWSER` has to be set before: `export BROWSER=firefox` – oHo Jul 07 '17 at 19:12
1

Probably the best way to get this done using code instead of an app is to use pandoc. https://pandoc.org

You can also convert strings between markup formats in code, for example with pypandoc in Python:

import pypandoc

# With an input file: it will infer the input format from the filename
output = pypandoc.convert_file('somefile.md', 'rst')

# ...but you can overwrite the format via the `format` argument:
output = pypandoc.convert_file('somefile.txt', 'rst', format='md')

# alternatively you could just pass some string. In this case you need to
# define the input format:
output = pypandoc.convert_text('#some title', 'rst', format='md')
# output == 'some title\r\n==========\r\n\r\n'
juanse254
1
zcat /usr/man/man1/man.1.gz | man2html > man.1.html
mvanle
0

This does it for me

man --html=cat gcc > gcc.htm
Zombo
0

To convert a man page I use:

zcat "/usr/share/man/man1/${PROGRAM}.1.gz" | manly > "out.html"

To display a man page directly as HTML I use:

oman "${PROGRAM}"


-1

Today is your lucky day. Someone has already done this for you. http://linux.die.net/

cowboydan