
I have a multilevel list, example below:

<ul>       
    <li>Test column 01
        <ul>       
            <li>Test column 02
                <ul>       
                    <li>Test column 03
                        <ul>       
                            <li>Test column 04
                                <ul>       
                                    <li>Test column 05</li>
                                    <li>Test column 05</li>
                                    <li>Test column 05</li>
                                </ul>
                            </li>
                        </ul>
                    </li>
                </ul>
            </li>
        </ul>
    </li>
</ul>

I would like to run some PHP code that outputs the list as a CSV file, formatted as below:

Test column 01
,Test column 02
,,Test column 03
,,,Test column 04
,,,,Test column 05
,,,,Test column 05
,,,,Test column 05

Basically, I want to be able to run an HTML list (with an unlimited number of levels) through some PHP code and output a CSV file that can be opened in Excel, preserving the list levels as columns.

If I could find some way of adding a class to each list item depending on its level (so first-level list items get a class of level1, second-level items a class of level2, and so on), then it should be fairly straightforward to find and replace the rest.

Any ideas/help greatly appreciated.

ss888
  • Adding classes is easier with jQuery. If you want to parse on the server side, you must walk through the list and keep a counter: when you find a <ul>, increment it; when you find a </ul>, decrement it. The counter will be the level (and how many commas you have).
    – BG Adrian Mar 01 '13 at 12:36
  • did you try anything yet? And why is Test Column 5 three separate rows in the CSV when the values are in the same UL? – Gordon Mar 01 '13 at 12:37
  • I think you should iterate through your list and build an array with text & level of depth and then it's easy to make the csv – eric.itzhak Mar 01 '13 at 12:41
  • Find and replace is a bad idea. Use an HTML parser, maybe the DOMDocument class, as I wrote in the answer below. – Kamil Mar 01 '13 at 12:54
  • thanks for the replies. @eric.itzhak how would i iterate through the list and build an array with text & level of depth? – ss888 Mar 01 '13 at 13:33

1 Answer

This would work for your example HTML:

$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html);

foreach ($dom->getElementsByTagName('li') as $li) {   // #1
  printf(
      '%s%s%s', 
      str_repeat(',', get_depth($li)),                // #2
      trim($li->childNodes->item(0)->nodeValue),      // #3
      PHP_EOL
  );
}

function get_depth(DOMElement $element)
{
    $depth = -1;
    while (                                           // #4
        $element->parentNode->tagName === 'li' || 
        $element->parentNode->tagName === 'ul'
    ) {
        if ($element->parentNode->tagName === 'ul') { // #5
            $depth++;
        }
        $element = $element->parentNode;
    }
    return $depth;
}

You can see the demo here.

Explanation of the marks:

  1. We fetch all the LI elements in the markup regardless of their position. If you only want to process a particular UL block, call getElementsByTagName on the DOMElement holding the starting UL element instead. I leave it up to you to figure out how to do that.
  2. We add one comma per level of calculated depth. The depth is the number of UL elements above the current LI element, minus one, so top-level items get no leading comma.
  3. We only fetch the first child node of the LI element, assuming it is the text node you want. If your real markup contains more than just the text node and potential UL elements, you need to adjust this to include only the text content you want. We trim the text to remove the newlines it will have when there are child UL elements in the LI element.
  4. To get the depth, we walk up the DOM tree until the parent is no longer an LI or UL element.
  5. Since we want one comma per UL element above the initial LI, we only add +1 to $depth if the parentNode is a UL element. $depth starts at -1 so that the outermost UL contributes no comma.
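
The snippet above only prints the lines. As a rough sketch of how the same traversal could collect rows and write them to a CSV file, something like the following might work. The function names li_depth, list_to_rows and rows_to_csv are made up for this illustration, and the code assumes PHP 7.1 or later and markup that contains only nested UL/LI lists. Unlike get_depth above, li_depth counts every UL ancestor up to the document root, which gives the same result for plain nested lists.

```php
<?php
// Illustrative sketch only: li_depth, list_to_rows and rows_to_csv are
// hypothetical helper names, not part of the original answer.

/** Number of UL ancestors above the node, minus one (top-level LI => 0). */
function li_depth(DOMNode $node): int
{
    $depth = -1;
    for ($p = $node->parentNode; $p !== null; $p = $p->parentNode) {
        if ($p->nodeName === 'ul') {
            $depth++;
        }
    }
    return $depth;
}

/** Turn nested UL/LI markup into rows: one row per LI, padded by depth. */
function list_to_rows(string $html): array
{
    $dom = new DOMDocument;
    libxml_use_internal_errors(true);   // keep parser warnings quiet
    $dom->loadHTML($html);
    libxml_clear_errors();

    $rows = [];
    foreach ($dom->getElementsByTagName('li') as $li) {
        // Assumption carried over from the answer: the first child node
        // of the LI is the text we want.
        $first = $li->firstChild;
        $text  = $first !== null ? trim((string) $first->nodeValue) : '';
        // One empty leading cell per level of depth.
        $pad    = array_fill(0, max(0, li_depth($li)), '');
        $rows[] = array_merge($pad, [$text]);
    }
    return $rows;
}

/** Write the rows to $path, letting fputcsv handle quoting. */
function rows_to_csv(array $rows, string $path): void
{
    $file = new SplFileObject($path, 'w');
    foreach ($rows as $row) {
        // Separator, enclosure and escape are passed explicitly;
        // these are the long-standing default values.
        $file->fputcsv($row, ',', '"', '\\');
    }
}
```

Calling rows_to_csv(list_to_rows($html), 'list.csv') would then produce the file. Note that fputcsv wraps fields containing spaces in double quotes, so a line comes out as ,,"Test column 03" rather than ,,Test column 03; Excel reads both the same way.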
Gordon
  • Thank you very much for that @Gordon - it's very much appreciated. I can't see where I would export it all out as a csv file though... – ss888 Mar 01 '13 at 13:55
  • @ss888 it doesn't. It only prints them out. If you want that in a file, you can use an SplFileObject or, if you only want to deliver it to the browser, send an appropriate header. Both have been covered on StackOverflow, so I am sure you'll manage to adjust the code to your needs. – Gordon Mar 01 '13 at 15:02
  • apologies but how would I go about outputting the url/link for each list item as well please? – ss888 Mar 27 '13 at 18:21
  • @ss888 there are neither URLs nor links in your markup. Have a look at http://stackoverflow.com/questions/4979836/noob-question-about-domdocument-in-php/4983721#4983721 and http://stackoverflow.com/questions/3820666/grabbing-the-href-attribute-of-an-a-element/3820783#3820783 – Gordon Mar 27 '13 at 18:30
  • Brilliant thanks Gordon - managed to work it out from the links you provided :) – ss888 Apr 04 '13 at 11:49