I'm using gettext in PHP to internationalise some code, so I'm going through it and changing code like this:

<h1>Hello world</h1>

to this:

<h1><?php echo gettext('Hello world'); ?></h1>

However, the code I've inherited is quite large and has lots of strings that need translating - so I was wondering if there was a way to do this automatically?

Dave Hollingworth
  • What language are the source files in? PHP mixed into HTML, or pure HTML? – Roman Mar 08 '13 at 09:11
  • You can just use `_()` instead of `gettext()` to reduce your work a bit. But sadly I don't think there is a way to have it called automatically. – Rikesh Mar 08 '13 at 09:23

2 Answers

I think this would be extremely difficult. Here are some potential approaches and their problems.

Approach 1: Parsing PHP Files

  1. Use token_get_all() to split a PHP source file into tokens.
  2. Look for all of the T_INLINE_HTML tokens, which represent the portions of the file that are not PHP code.
  3. Find and replace text in those portions of the file.

Problem: the only way to reliably find text to replace is by parsing the HTML. But the non-PHP portions of the file cannot be parsed on their own. They are fragments, and they depend on the inline PHP code to produce a complete, parsable document.
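To make the fragmentation concrete, here is a minimal sketch of steps 1 and 2. The sample source string and the `$cls` variable inside it are made up for illustration; only `token_get_all()` and `T_INLINE_HTML` come from the approach above.

```php
<?php
// Hypothetical sample source: HTML with an inline PHP block inside an attribute.
$source = '<h1>Hello world</h1>' . "\n"
        . '<p class="<?php echo $cls; ?>">Welcome back</p>' . "\n";

// Collect every T_INLINE_HTML token, i.e. the non-PHP portions of the file.
$fragments = array();
foreach (token_get_all($source) as $token) {
    // Multi-character tokens are arrays: [0] = token id, [1] = source text.
    if (is_array($token) && $token[0] === T_INLINE_HTML) {
        $fragments[] = $token[1];
    }
}

print_r($fragments);
```

The first fragment ends in the middle of the `class` attribute and the second one starts with `">`, so neither piece is well-formed HTML on its own.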

Approach 2: Parsing the Output HTML Files

  1. Save your site's output HTML files from your browser. This will give you complete HTML files to parse.
  2. Parse those HTML files, saving the text strings that need replacing.
  3. Go back to the original PHP files, search for those text strings and replace them.

Problem: you are once again faced with the problem of not being able to parse the PHP file. A simple regex or exact-match search would work better in this case, because you are searching for known strings, but it still would not be 100% reliable. And you would not be able to tell which parts of the saved output came from static HTML and which were generated by PHP.
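As a rough sketch of step 3, the exact-match replacement could be limited to the non-PHP portions of the file by borrowing the tokenizer from Approach 1. The helper name `wrapExactStrings()` and the sample input are hypothetical, and this is not a reliable tool: it will still wrap matches that sit inside attribute values, scripts or styles.

```php
<?php
/**
 * Hypothetical helper: wrap exact strings in gettext() calls, but only
 * inside the T_INLINE_HTML (non-PHP) portions of a source file.
 */
function wrapExactStrings($source, array $strings)
{
    // Build the search => replacement map once.
    $map = array();
    foreach ($strings as $string) {
        if ($string === '') {
            continue; // an empty search key makes no sense
        }
        // Escape backslashes and single quotes for a single-quoted PHP string.
        $escaped = addcslashes($string, "\\'");
        $map[$string] = "<?php echo gettext('" . $escaped . "'); ?>";
    }
    if (!$map) {
        return $source; // nothing to replace
    }

    $result = '';
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            $result .= $token; // single-character tokens such as ';'
        } elseif ($token[0] === T_INLINE_HTML) {
            $result .= strtr($token[1], $map); // replace only in static HTML
        } else {
            $result .= $token[1]; // leave PHP code untouched
        }
    }
    return $result;
}

$source = '<h1>Hello world</h1><?php echo "Hello world"; ?>';
echo wrapExactStrings($source, array('Hello world'));
```

In this example only the heading text is wrapped; the identical string inside the existing PHP block is left alone. Reviewing the resulting diff by hand would still be necessary.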

I think you will be best off doing this by hand. Make yourself a good keyboard macro in your editor, so that once you select text, you can convert it to the PHP function with one keystroke.

Community
  • Nice answer, thanks. I think I agree with you - doing it by hand is the most reliable way for now. I wouldn't want a parser to replace some text unless I was sure about it. Having a macro to wrap selected text is an excellent idea, hadn't occurred to me. One vim macro coming up! :-) – Dave Hollingworth Mar 11 '13 at 14:19
Approach 3: Parse the source PHP files as HTML with processing instructions, which is what PHP documents really are

This won't be perfect, but it is a starting point:

$dom = new DOMDocument();

// load source
$dom->loadHTML('
  <html>
   <body>
    <h1>I\'m a title</h1>
    <p>My name is <?php echo $myname; ?></p>
    <style>
       p { margin-bottom: 1em; }
    </style>
    <script>
       alert(\'a really funny script that we don\\\'t want to enclose\');
    </script>
   </body>
  </html>');


//get all text nodes
$xpath = new DOMXPath($dom);
$textnodes = $xpath->evaluate('/html/body//*[not(self::script)][not(self::style)]/text()');

//store a list of translation keys:
$keys = array();

//wrap text nodes into php processing instructions
foreach($textnodes as $node) {
  $content = $node->nodeValue;
  $keys[] = $content;
  $content = trim(addcslashes($content, '\\\''));
  // "echo" is needed: gettext() only returns the translation, it does not print it
  $wrap = $dom->createProcessingInstruction('php', 'echo gettext(\'' . $content . '\'); ?');
  $node->parentNode->replaceChild($wrap, $node);
}

//output or save the result;
echo $dom->saveHTML();

//output or store the keys, a little help for creating the translation files
print_r($keys);

Test it here: http://sandbox.onlinephpfunctions.com/code/559542d98e8ddc60eeb7e156888d9d2fda61b843

The snippet above outputs:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body>
        <h1><?php echo gettext('I\'m a title'); ?></h1>
        <p><?php echo gettext('My name is'); ?><?php echo $myname; ?></p>
        <style>
           p { margin-bottom: 1em; }
        </style><script>
           alert('a really funny script that we don\'t want to enclose');
        </script></body></html>
Array
(
    [0] => I'm a title
    [1] => My name is 
)
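
The collected keys could then be turned into the skeleton of a translation file. This is only a sketch: the helper name `keysToPo()` and the sample keys are made up, it does not handle multi-line strings, and a real .po file also needs a header entry.

```php
<?php
/**
 * Hypothetical helper: turn a list of collected keys into bare
 * msgid/msgstr pairs in PO syntax.
 */
function keysToPo(array $keys)
{
    $po = '';
    // Trim and de-duplicate so that each msgid appears only once.
    foreach (array_unique(array_map('trim', $keys)) as $key) {
        if ($key === '') {
            continue; // skip whitespace-only text nodes
        }
        // Escape backslashes and double quotes for a PO string.
        $msgid = addcslashes($key, "\\\"");
        $po .= 'msgid "' . $msgid . "\"\nmsgstr \"\"\n\n";
    }
    return $po;
}

// Sample keys, similar to what the script above collects.
echo keysToPo(array("I'm a title", 'My name is ', 'My name is'));
```

Trimming here matches the trim() applied to the gettext() argument in the loop above, so the msgids line up with the strings that end up in the source.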
Roman
  • Nice solution, like it. I think I'd want to add some sort of confirmation to each one though, kind of like search and replace in a text editor that asks you to confirm a replace or move onto the next one. I'm going to go with doing it manually for the moment, but if I get time later I'll look into adding that to your code. Thanks! – Dave Hollingworth Mar 11 '13 at 14:17