Is there a way I can do something like the following using the standard linux toolchain?

Let's say the source at example.com/index.php is:

Hello, &amp; world! &quot;

How can I do something like this...

curl -s http://example.com/index.php | htmlentities

...that would print the following:

Hello, & world! "

Using only the standard linux toolchain?

Cam

2 Answers

Use recode.

$ echo 'Hello, &amp; world! &quot;' | recode HTML_4.0
Hello, & world! "

EDIT: By the way, recode offers several conversions corresponding to different versions of HTML and XML, so you can use e.g. HTML_3.2 instead of HTML_4.0 if you have a really old HTML document. Running recode -l will list all the charsets supported by the program.
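
If recode is not installed, a rough fallback can be put together with sed alone. The sketch below is only an illustration, not a replacement for recode: the function name decode_basic_entities is made up for this example, and it handles just five entities (&lt;, &gt;, &quot;, &#39; and &amp;), nothing else.

```shell
# Hypothetical helper (the name is an assumption, not an existing tool).
# Decodes only five common entities; everything else passes through unchanged.
decode_basic_entities() {
  sed -e 's/&lt;/</g' \
      -e 's/&gt;/>/g' \
      -e 's/&quot;/"/g' \
      -e "s/&#39;/'/g" \
      -e 's/&amp;/\&/g'
  # &amp; is handled last so that "&amp;lt;" becomes "&lt;" rather than "<".
}

printf '%s\n' 'Hello, &amp; world! &quot;' | decode_basic_entities
```

It reads standard input, so it can sit at the end of a curl pipeline in the same way as recode. Numeric references other than &#39; and the rest of the named entities are left untouched, which is where recode remains the better choice.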

David Z
  • `$ man recode` No manual entry for recode `$ type recode` bash: type: recode: not found (not to say it isn't excellent, but is it part of the standard toolchain?) – Stephen P Jul 23 '10 at 22:54
  • @Stephen: You have to install it first. – Cam Jul 23 '10 at 22:55
  • @Stephen P: Evidently it's not installed on your computer. It's debatable (AFAIK) whether or not `recode` is part of the standard toolchain, but it's very common, and if it isn't considered part of the toolchain, I doubt that anything that is could do the job. – David Z Jul 23 '10 at 22:55
  • This indeed doesn't seem to be part of the standard toolchain as I requested, but it's in the spirit of such a tool (i.e. exactly what I wanted) so I've marked it as the correct answer :) – Cam Jul 23 '10 at 22:58
alias decode="php -r 'echo html_entity_decode(fgets( STDIN ));'"

$ echo 'Hello, &amp; world! &quot;' | decode
Hello, & world! "
Maryam
  • This is cool, so +1. It doesn't really answer my question though - I was looking for something along the lines of what David provided. – Cam Jul 23 '10 at 22:56
  • thanks also, I ended up using both answers as php is included on Macs, otherwise it's recode. – Vic Goldfeld Feb 17 '13 at 23:10