I am writing a JavaScript function which can tidy up HTML code (JavaScript and CSS code tidying is not necessary at the moment).

Here is my code. You can try it at http://jsfiddle.net/2q26K/

function tidyHtml(html) {
    var html = html.trim().replace(/>[^<]+</gm, function ($1) {
        return '>' + $1.substr(1, $1.length - 2).trim() + '<';
    }).replace(/>\s+</gm, '><');
    var containerElement = document.createElement('div');
    containerElement.innerHTML = html;
    var result = containerElement.innerHTML;
    var findLevel = function (child, parent) {
        var level = 0;
        while (child != parent) {
            child = child.parentNode;
            level++;
        }
        return level;
    };
    Array.prototype.slice.call(containerElement.getElementsByTagName('*')).forEach(function (element) {
        var tabs = new Array(findLevel(element, containerElement) - 1).join('   '),
            tabs2 = (element.parentNode.lastChild == element) ? ('\n' + tabs.substring(0, tabs.length - 1)) : '',
            containerElement = document.createElement('div');
        containerElement.appendChild(element.cloneNode(true));
        result = result.replace(containerElement.innerHTML, '\n' + tabs + containerElement.innerHTML + tabs2);
    });
    return result;
}

In the example provided, it works perfectly.

But with some HTML, such as the input in http://jsfiddle.net/2q26K/1/, it fails to change

<div id="hlogo">
    <a href="/">Stack Overflow</a>ABC</div>

to

<div id="hlogo">
    <a href="/">Stack Overflow</a>ABC
</div>

I cannot solve it. It's too complicated. Is there an easier method that can do the same thing?

How can I improve my code? What would be an example that I can learn from?

Peter Mortensen
Tom Chung
  • Have you tried http://jsfiddle.net/2q26K/1/? The result keeps ABC on the same line, and I want it to be ABC\n – Tom Chung Dec 01 '13 at 15:06
  • I don't know; maybe that's because of innerHTML. You can try these websites: http://jsbeautifier.org/ and http://www.dirtymarkup.com/ – Pejman Dec 01 '13 at 15:19
  • You have to parse the string the way a real code editor or IDE does. It is not a good idea to use nodes and elements for this. – Pejman Dec 01 '13 at 15:20
  • That's thousands of lines. Is there any simpler method? – Tom Chung Dec 01 '13 at 15:21
  • Note: the argument to `.join()` is a single TAB character (is shown as three spaces here - copy-paste from [the source](https://stackoverflow.com/revisions/72ecedb3-7c91-4c09-ae10-9d355ba83426/view-source) to see it). – Peter Mortensen Aug 11 '20 at 22:38

1 Answer

If speed is not an issue, you could run a jQuery parser (alternatively Cheerio with Node.js) on your generated HTML string, using the $(String).html() method as explained here.
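
If you would rather avoid a parser library, here is a minimal sketch of a different, string-based approach, along the lines suggested in the comments on the question. It is not the jQuery/Cheerio route. In the question's code, the newline before a closing tag is only added after an element that is its parent's `lastChild`. In the failing example the last child of the `div` is the text node `ABC`, so that newline is never added. The sketch below instead tracks, per open element, whether it contained any child element. The names `tidyHtmlString`, `VOID_TAGS` and `indentUnit` are hypothetical, and the code assumes well-formed markup with no `>` inside attribute values or comments and no whitespace-sensitive content such as `<pre>`.

```javascript
// Hypothetical helper (not a jQuery or Cheerio API): indents HTML by scanning
// the string for tags instead of round-tripping through innerHTML.
// Assumptions: well-formed markup, no '>' inside attribute values or comments,
// and no <pre>, <script> or <style> content whose whitespace must be preserved.
var VOID_TAGS = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
                 'link', 'meta', 'source', 'track', 'wbr'];

function tidyHtmlString(html, indentUnit) {
    indentUnit = indentUnit || '\t';
    // Split the input into tags and the text runs between them.
    var tokens = html.match(/<[^>]+>|[^<]+/g) || [];
    var hasChildElement = []; // one flag per currently open element
    var depth = 0;
    var out = '';

    function startLine(tag) {
        // Put the tag on a new line, indented by the current depth.
        out += (out ? '\n' : '') + new Array(depth + 1).join(indentUnit) + tag;
    }

    tokens.forEach(function (token) {
        if (token.charAt(0) !== '<') {
            // Text stays on the current line, trimmed as in the question.
            out += token.trim();
            return;
        }
        if (token.charAt(1) === '/') {
            // A closing tag gets its own line only if the element
            // contained at least one child element.
            depth = Math.max(0, depth - 1);
            if (hasChildElement.pop()) {
                startLine(token);
            } else {
                out += token;
            }
            return;
        }
        // Any other tag is a child element of the enclosing open element.
        if (hasChildElement.length) {
            hasChildElement[hasChildElement.length - 1] = true;
        }
        startLine(token);
        var name = (/^<([a-zA-Z][^\s\/>]*)/.exec(token) || [])[1] || '';
        // Self-closing tags, <!...> declarations and void elements open nothing.
        var isVoid = /\/>$/.test(token) || token.charAt(1) === '!' ||
            VOID_TAGS.indexOf(name.toLowerCase()) !== -1;
        if (!isVoid) {
            hasChildElement.push(false);
            depth++;
        }
    });
    return out;
}
```

Called on the `hlogo` snippet from the question, this keeps `<a href="/">Stack Overflow</a>ABC` on one indented line and puts `</div>` on its own line. Because it never touches the DOM, it also runs under Node.js, but it is only a sketch: for real-world HTML a proper parser or an existing beautifier is the safer choice.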

Peter Mortensen
Alice Oualouest