Style properties, as I recall, are "kebab-cased"; in other words, they are expected to be all lowercase with words separated by hyphens. Here's one resource.
You have demonstrated a good choice by using a DOM parser to target your div elements with the item class.
The next step of parsing the style declarations would, again, be most reliable with a CSS parser, but if you are not interested in going to that effort, then a couple of preg_match() calls with discerning patterns should keep you in good stead.
For those who are not aware of the fringe cases that may monkeywrench the outcome of this task, I am adding a title attribute to one of the divs that will fool an inadequate pattern or a DOM-unaware technique, and also adding a line-height declaration for the same reason. These samples will catch many of the possible "shortcut" solutions that fail to parse the DOM.
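To make those two traps concrete, here is a minimal sketch of the kind of shortcut they are meant to expose. The two naive patterns and the variable names are my own illustrations, not taken from any particular answer.

```php
<?php
// Raw markup for one div, including the decoy title attribute
$html = '<div class="item" title="Checking for bad parsing. height:666px; width:666px;" '
      . 'style="width:295px; height:210px; border:1px solid #000;"></div>';

// Shortcut 1: search the raw HTML without isolating the style attribute
preg_match('~height:(\d+)~', $html, $m);
$fromRawHtml = $m[1];
echo $fromRawHtml, "\n"; // 666 (taken from the title attribute, not the style)

// Shortcut 2: search a style value without anchoring the property name
$style = 'line-height:14pt; border:1px solid #000; height :420px; width: 590px;';
preg_match('~height\s*:\s*(\d+)~', $style, $m);
$fromUnanchored = $m[1];
echo $fromUnanchored, "\n"; // 14 (taken from line-height)
```

Both shortcuts return a number, which is what makes them dangerous: nothing signals that the value came from the wrong place.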
The regex pattern must match the whole words height and width. My pattern checks that these words are at the start of the string or are preceded by a semicolon and then zero or more whitespace characters. Once one of the words is found, the next non-whitespace character must be a colon. Then, after allowing zero or more whitespace characters again, I use \K to "forget" all previously matched characters and return ONLY the desired digit characters as the "full string match" (element [0]).
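For readers who prefer to see the pieces labelled, the height pattern can be written in free-spacing mode (the x flag). This is only a sketch for illustration; the sample strings and the $results array are mine, chosen to cover the three situations described above.

```php
<?php
// Same pattern as in the answer, spread out with the x (free-spacing) flag
$pattern = '~
    (?:^|;)   # start of the string, or a semicolon ending the previous declaration
    \s*       # optional whitespace before the property name
    height    # the literal property name
    \s*:\s*   # a colon, with optional whitespace on either side
    \K        # discard everything matched so far from the reported match
    \d+       # one or more digits, returned as element [0]
~x';

$samples = [
    'width:295px; height:210px;',   // preceded by a semicolon and a space
    'height :420px; width: 590px;', // at the start, with a space before the colon
    'line-height:14pt;',            // a different property, so it must not match
];

$results = [];
foreach ($samples as $style) {
    $results[] = preg_match($pattern, $style, $m) ? $m[0] : 'no match';
    echo $style, ' => ', end($results), "\n";
}
```

The third sample is rejected because height is neither at the start of the string nor directly after a semicolon and optional whitespace.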
Code: (Demo)
$html = <<<HTML
<div class="items">
<div class="item" title="Checking for bad parsing. height:666px; width:666px;" style="width:295px; height:210px; border:1px solid #000;"></div>
<div></div>
<div class="item" style="line-height:14pt; border:1px solid #000; height :420px; width: 590px;"></div>
</div>
HTML;
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select only the style attributes of divs whose class is exactly "item"
foreach ($xpath->query('//div[@class="item"]/@style') as $node) {
    $style = $node->nodeValue;
    // \K discards the property name, so $h[0] and $w[0] hold only the digits
    echo "The height integer: " , preg_match('~(?:^|;)\s*height\s*:\s*\K\d+~', $style, $h) ? $h[0] : '';
    echo "\n";
    echo "The width integer: " , preg_match('~(?:^|;)\s*width\s*:\s*\K\d+~', $style, $w) ? $w[0] : '';
    echo "\n---\n";
}
Output:
The height integer: 210
The width integer: 295
---
The height integer: 420
The width integer: 590
---
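If you need the values as integers in more than one place, the two preg_match() calls could be folded into a small function. This is a sketch only: styleInteger() is a hypothetical name of mine, and it assumes you want null when the property is not declared.

```php
<?php
// Hypothetical helper: returns the leading integer of a declared property, or null if absent
function styleInteger(string $style, string $property): ?int
{
    // preg_quote() escapes characters such as the hyphen in names like max-height
    $pattern = '~(?:^|;)\s*' . preg_quote($property, '~') . '\s*:\s*\K\d+~';
    return preg_match($pattern, $style, $m) ? (int) $m[0] : null;
}

$style = 'line-height:14pt; border:1px solid #000; height :420px; width: 590px;';
var_dump(styleInteger($style, 'height'));     // int(420)
var_dump(styleInteger($style, 'width'));      // int(590)
var_dump(styleInteger($style, 'max-height')); // NULL
```

Like the original patterns, this ignores the unit, so it is only suitable when you already know the values are in pixels.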