I've been working on a category system that outputs all categories in the database. Because the nesting depth is (theoretically) unlimited, I sometimes end up with <ul> elements that contain no <li> elements (or anything else, for that matter, except whitespace).

I am currently using jQuery to filter out these <ul> elements, but as you will probably agree, that's not the most efficient way. I've been trying to write a regex that replaces these empty <ul> elements with empty strings, but I haven't had much luck so far.

<ul class="nav navbar-nav catagory" style="display:none;">

</ul>

The HTML above is an example of an empty <ul> I need to filter out. This is the regex I have so far, but it doesn't work as expected.

$str = '<ul class="nav navbar-nav catagory" style="display:none;">             </ul>';
preg_match('!\<ul\>/ +/\<\/ul\>!', $str, $matches);

Can anyone help me out?

Also, the content of the categories is stored in a PHP variable.

EDIT:

I solved this issue thanks to Josh Beam. I edited the function that pulls the categories from the database:

function build_catagory($parent, $row = NULL)
{
    global $db, $template;

    // Initialise array
    $data = array();

    // Next level parent
    $next = $parent + 1;

    // Basic SQL statement
    $sql = "SELECT * FROM Rubriek";

    // Where condition based on $row
    if(is_null($row))
    {
        $where = " WHERE Hoofdrubriek IS NULL";
    }
    else 
    {
        $where = " WHERE Hoofdrubriek = '" . $row['Rubrieknummer'] . "'";
    }

    // Execute query
    $stmt = $db->query($sql . $where);

    if($stmt)
    {
        // Create new instance of template engine (set output to false)
        $catagory = new Template('{', '}', array('content'), FALSE);

        // Load template
        $catagory->load(T_TEMPLATE_PATH . '/rubrieken.html', 'content');

        // Fetch results
        while($row = $db->fetch($stmt))
        {
            $data[] = array(
                'CATAGORY_NAME'     => ucfirst(strtolower($row['Rubrieknaam'])),
                'CATAGORY'          => build_catagory($next, $row),
            );
        }

        // Assign data to the template
        $catagory->assign_vars(array(
            'CATAGORIES'            => $data,
            'CATAGORIES_DISPLAY'    => ($parent == 0 ? '' : 'style="display:none;"'),
        ));

        // Return catagory
        return $catagory->parse();
    }
    else
    {
        return '';
    }
}

The fix was actually quite easy: just return an empty string instead of the parsed template. Thanks for the help!
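For readers without the same database layer and template engine, the idea of returning an empty string when there is nothing to render can be sketched roughly as follows. This is not the original function: the in-memory `$rows` array, its sample data, and the `render_categories()` name are assumptions made for illustration, and plain string concatenation stands in for the template. Only the column names are taken from the code above.

```php
<?php
// Hypothetical stand-in for the Rubriek table (sample data is made up).
// Hoofdrubriek holds the parent's Rubrieknummer, or null for top level.
$rows = [
    ['Rubrieknummer' => 1, 'Rubrieknaam' => 'AUTO',       'Hoofdrubriek' => null],
    ['Rubrieknummer' => 2, 'Rubrieknaam' => 'boten',      'Hoofdrubriek' => null],
    ['Rubrieknummer' => 3, 'Rubrieknaam' => 'onderdelen', 'Hoofdrubriek' => 1],
];

// Hypothetical helper: renders the children of $parentId as a nested list.
function render_categories(array $rows, ?int $parentId = null, int $level = 0): string
{
    $items = '';

    foreach ($rows as $row) {
        // Skip rows that are not direct children of the current parent.
        if ($row['Hoofdrubriek'] !== $parentId) {
            continue;
        }

        $name = htmlspecialchars(ucfirst(strtolower($row['Rubrieknaam'])));

        // Recurse first; a childless category contributes an empty string.
        $items .= '<li>' . $name
            . render_categories($rows, $row['Rubrieknummer'], $level + 1)
            . '</li>';
    }

    // The key step: with no children, return '' so no empty <ul> is emitted.
    if ($items === '') {
        return '';
    }

    // Nested levels are hidden by default, as in the original template.
    $style = $level === 0 ? '' : ' style="display:none;"';

    return '<ul class="nav navbar-nav catagory"' . $style . '>' . $items . '</ul>';
}

echo render_categories($rows);
```

Because the empty case is handled before the wrapper is built, nothing has to be filtered out of the generated HTML afterwards.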

  • Are you sure that `regex` will be faster than JS, which actually understands the DOM? – merlin2011 May 23 '14 at 17:59
  • A regex might not be the best way to do it. Use a DOM parser instead. – Amal Murali May 23 '14 at 18:00
  • Use an [html parser](http://www.php.net/manual/en/book.dom.php) instead. Trying to parse HTML / XML with regex is known to cause [madness](http://stackoverflow.com/a/1732454/1715579). – p.s.w.g May 23 '14 at 18:00
  • Well, I don't want to do it after the page has been sent to the user. Imagine what happens when the user does not have JS enabled. – Sander Koenders May 23 '14 at 18:00
  • PHP can also handle [DOMs](http://www.php.net/manual/en/class.domdocument.php). – merlin2011 May 23 '14 at 18:01
  • @merlin2011 thanks for that info now i have to reschool! – Pogrindis May 23 '14 at 18:02
  • Maybe the problem could be the way that you are pulling data from the database. Is it possible for you to rework your query so that you avoid empty UL elements in the first place? – Josh Beam May 23 '14 at 18:03
  • Also the content of the catagories is stored in a variable inside PHP. – Sander Koenders May 23 '14 at 18:04
  • "but as you probaply agree that's not the most efficient way", I don't agree :) – Sam May 23 '14 at 18:04
  • @JoshBeam, getting the information from the database is kinda complicated, especially because I am using a template system. Check http://pastebin.com/U4s2pSLG – Sander Koenders May 23 '14 at 18:06
  • @Sam some users don't have js enabled, so that would mean the elements won't get removed. – Sander Koenders May 23 '14 at 18:09
  • Okay fair enough, I was taking 'efficient' from a performance standpoint. Carry on. – Sam May 23 '14 at 18:11
  • Regarding your pastebin link, can you throw an `if` statement in there somewhere to check first whether the row you're getting contains data, and if not, then it doesn't put it into your array? Also, kind of off-topic, but "catagory" is spelled "category". Also, I would suggest posting more code, putting a link to your pastebin in your question, and describing your problem further to avoid the downvotes you're receiving. – Josh Beam May 23 '14 at 18:13
  • @JoshBeam well, the problem is that it's using a template system. Therefore I would need to throw an if statement directly into the HTML template. Some template systems support this, but mine doesn't. – Sander Koenders May 23 '14 at 18:16

1 Answer

The important thing to enforce here is that there is nothing but white space between the list open and close tags. Since we aren't parsing any hierarchical data, regex can do it efficiently.

A more-readable solution:

  • \<ul[^\>]*\>\s*\<\/ul\>

A slightly safer version would use more \s* in case of rogue white space:

  • \<\s*ul[^\>]*\>\s*\<\s*\/ul\s*\>
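As a rough sketch of how the second pattern might be applied to the string from the question, the snippet below passes it to `preg_replace()`. The `strip_empty_ul()` name, the `~` delimiters, the `i` flag, and the repeat-until-stable loop are additions for illustration and not part of the patterns above. The backslashes before `<` and `>` are dropped because those characters need no escaping in PCRE.

```php
<?php
// Hypothetical helper: removes <ul> elements that contain only whitespace.
function strip_empty_ul(string $html): string
{
    // The "slightly safer" pattern, with '~' as delimiter so the '/'
    // in the closing tag needs no escaping; 'i' also matches <UL>.
    $pattern = '~<\s*ul[^>]*>\s*<\s*/ul\s*>~i';

    // Repeat until nothing changes, so a <ul> that only wrapped other
    // empty <ul> elements is removed on a later pass.
    do {
        $result = preg_replace($pattern, '', $html, -1, $count);
        if ($result === null) {
            return $html; // regex engine error: leave the input untouched
        }
        $html = $result;
    } while ($count > 0);

    return $html;
}

$str = '<ul class="nav navbar-nav catagory" style="display:none;">             </ul>';
var_dump(strip_empty_ul($str)); // string(0) ""
```

Lists that contain any `<li>` are left alone, since the pattern only allows whitespace between the opening and closing tags.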
Cory