I am building a web application in PHP, which I have decided (far along in the process) to make available in different languages.

My question is this:

I do not want to wade through all the HTML code in the template files to look for the "words" that I need to replace with dynamically generated lang variables.

Is there a tool that can highlight the "words" used in the HTML to make my task easier, so that when I scroll down the HTML doc, I can easily see where the language "words" are?

Normally when I create an app, I add comments as I code, like below:

 <label><!--lang-->Full Name</label>
 <input type="submit" value="<!--lang-->Save Changes" name="submit">

That way, when I am done, I can run through and easily identify the bits I need to add dynamic variables to. Unfortunately, I am almost through with the app (lots of HTML template files) and I had not done so.

I use a template engine (tinybutstrong) so my HTML is pretty clean (i.e. with no PHP in it)

fredmarks
  • I'm not sure that there is a way of doing that! You could use find and replace – littleswany Aug 29 '14 at 11:13
  • @littleswany: XPath for the win!! getting comment nodes _is_ possible, not sure if comments in attribute values are valid markup, though – Elias Van Ootegem Aug 29 '14 at 11:53
  • @EliasVanOotegem, the comments in the value attribute are temporary. I only use these as markers. When I do the actual translation they are removed. – fredmarks Aug 29 '14 at 14:47
  • @littleswany I normally would use a text editor like editplus for find and replace etc., but in this case I do not have the comments in my HTML... so I am looking for some tool that can highlight the "text" strings in the HTML (so to speak) – fredmarks Aug 29 '14 at 14:49
  • @fredmarks: In that case: `$domDocument->getElementsByTagName('*');` + foreach over all the `DOMElement` instances + `$node->textContent` will give you what you're looking for. If some of the nodes are input tags, just use `$node->tagName` and switch to `$node->getAttributeNode('value')->value` to get the contents of those, just like I show in my answer – Elias Van Ootegem Aug 31 '14 at 13:03

1 Answer

You can do this, relatively easily even, using DOMDocument to parse the markup, DOMXPath to query for all the comment nodes, and then access each node's parent, extract the nodeValue and list those values as "strings to translate":

$dom = new DOMDocument;
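libxml_use_internal_errors(true);//template markup is rarely well-formed XML, so keep parser warnings quiet
$dom->loadHTMLFile($file);//$file: path to a template file; use loadHTML in case you're working with HTML strings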
$xpath = new DOMXPath($dom);//get XPath
$comments = $xpath->query('//comment()');//get all comment nodes
//this array will contain all to-translate texts
$toTranslate = array();
foreach ($comments as $comment)
{
    if (trim($comment->nodeValue) === 'lang')
    {//trim, avoid spaces; use strcasecmp(trim($comment->nodeValue), 'lang') === 0 if you need case-insensitive matching
        $parent = $comment->parentNode;//get parent node
        $toTranslate[] = $parent->textContent;//get parent node's text content
    }
}
var_dump($toTranslate);

Note that this can't handle comments used in tag attributes: inside an attribute value, <!--lang--> is plain text, not a comment node, so //comment() never sees it. Using this simple script, you will be able to extract those strings that need to be translated in the "regular" markup. After that, you can write a script that looks for <!--lang--> in tag attributes... I'll have a look if there isn't a way to do this using XPath, too. For now, this should help you to get started, though.
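For the attribute case, one possible approach is to query the attribute nodes themselves and test their string value. This is only a sketch: the sample markup is the question's own example wrapped in a hypothetical form tag, `$marker` is a name I made up, and it assumes the marker always sits at the start of the attribute value.

```php
<?php
// Hypothetical sample markup, based on the example in the question.
$html = '<form><label><!--lang-->Full Name</label>'
      . '<input type="submit" value="<!--lang-->Save Changes" name="submit"></form>';

$dom = new DOMDocument;
libxml_use_internal_errors(true); // keep parser warnings about sloppy HTML quiet
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$marker = '<!--lang-->'; // assumed to be the first thing in the attribute value

// Select every attribute, on any element, whose value starts with the marker.
$attributes = $xpath->query("//@*[starts-with(., '$marker')]");

$toTranslate = array();
foreach ($attributes as $attribute) {
    // Strip the marker so that only the translatable text remains.
    $toTranslate[] = trim(substr($attribute->value, strlen($marker)));
}
var_dump($toTranslate);
```

For the sample above this should leave just Save Changes in the list. If the marker can appear elsewhere in the value, `contains()` instead of `starts-with()` plus a `str_replace` would be the obvious variation.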

If you have no comments other than <!--lang--> in your markup, then you could simply use an XPath expression that selects the parents of those comment nodes directly, together with any input and option tags that have a value attribute:

$commentsAndInput = $xpath->query('(//input|//option)[@value]|//comment()/..');
foreach ($commentsAndInput as $node)
{
    if ($node->nodeName !== 'input' && $node->nodeName !== 'option')
    {//get the textContent of the node
        $toTranslate[] = $node->textContent;
    }
    else
    {//get value attribute's value:
        $toTranslate[] = $node->getAttributeNode('value')->value;
    }
}

The xpath expression explained:

  • //: tells xpath to search for nodes that match the rest of the criteria anywhere in the DOM
  • input: literal tag name: //input looks for input tags anywhere in the DOM tree
  • [@value]: the mentioned tag only matches if it has a @value attribute
  • |: OR (the union of two node sets). //a|//input[@type="button"] matches links OR button inputs
  • //option[@value]: same as above: options with value attributes are matched
  • (//input|//option): groups both expressions, the [@value] applies to all matches in this selection
  • //comment(): selects comments anywhere in the dom
  • /..: selects the parent of the current node, so //comment()/.. matches the parent, containing the selected comment node.
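To see what each piece of the expression contributes, you could run the parts separately against a small document and count the matches. The markup below is a made-up sample, not taken from your templates, and `$expressions` and `$counts` are just illustrative names.

```php
<?php
// Hypothetical sample markup, for illustration only.
$html = '<form><label><!--lang-->Full Name</label>'
      . '<input type="text" name="full_name">'
      . '<select name="title"><option value="Mr">Mr</option><option>Other</option></select>'
      . '<input type="submit" value="Save Changes" name="submit"></form>';

$dom = new DOMDocument;
libxml_use_internal_errors(true); // keep parser warnings quiet
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// The building blocks of the full expression, from simple to combined.
$expressions = array(
    '//input',                    // every input tag
    '//input[@value]',            // only inputs that have a value attribute
    '(//input|//option)[@value]', // inputs and options that have a value attribute
    '//comment()',                // every comment node
    '//comment()/..',             // the parent of every comment node
);

$counts = array();
foreach ($expressions as $expression) {
    // query() returns a DOMNodeList; length is the number of matched nodes.
    $counts[$expression] = $xpath->query($expression)->length;
}
print_r($counts);
```

Comparing the counts for `//input` and `//input[@value]` shows the effect of the predicate, and the last two entries show that `/..` swaps each comment for the element that contains it.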

Keep working at the XPath expression until it matches all of the content you need to translate.
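Since your existing templates don't contain the <!--lang--> markers yet, a variation that doesn't rely on comments at all may be more useful: list every non-empty text node, plus the attributes that usually carry visible text. Treat this as a rough sketch. The sample markup is invented, and the attribute list (placeholder, title, alt, and value on submit/button inputs) is my assumption about what counts as translatable, so adjust it to your templates.

```php
<?php
// Hypothetical sample template; note there are no <!--lang--> markers in it.
$html = '<html><head><title>Profile</title><style>p { color: red; }</style></head>'
      . '<body><p>Hello <b>world</b></p>'
      . '<input type="text" name="q" placeholder="Search">'
      . '<input type="submit" value="Save Changes" name="submit">'
      . '<script>var x = 1;</script></body></html>';

$dom = new DOMDocument;
libxml_use_internal_errors(true); // keep parser warnings quiet
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Non-empty text nodes outside script/style, plus a few attributes that
// usually hold visible text. The attribute list is an assumption.
$query = '//text()[normalize-space() and not(ancestor::script or ancestor::style)]'
       . '|//@placeholder|//@title|//@alt'
       . '|//input[@type="submit" or @type="button"]/@value';

$candidates = array();
foreach ($xpath->query($query) as $node) {
    // nodeValue works for both text nodes and attribute nodes.
    $candidates[] = trim($node->nodeValue);
}
print_r($candidates);
```

The result is a list of candidate strings rather than a definitive one: things like TinyButStrong placeholders in the text would still need to be filtered out by hand or with an extra check.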

Proof of concept

Elias Van Ootegem