<select name="states">
    <option value="">--  Select State / Province  --</option>
    <option value="1">Alabama</option><option value="2">Alaska</option>
    <option value="4">Arizona</option><option value="3">Arkansas</option>
    <option value="5">California</option><option value="6">Colorado</option>
    <option value="7">Connecticut</option>
    <option value="8">Delaware</option>
    <option value="9">District Of Columbia</option>
    <option value="10">Florida</option>
    <option value="11">Georgia</option><option value="12">Hawaii</option>
    <option value="13">Idaho</option>
    <option value="14">Illinois</option><option value="16">Indiana</option>
    <option value="15">Iowa</option>
    <option value="17">Kansas</option><option value="18">Kentucky</option>
    <option value="19">Louisiana</option>
    <option value="20">Maine</option>
    <option value="21">Maryland</option>
    <option value="23">Massachusetts</option>
    <option value="22">Michigan</option><option value="25">Minnesota</option>
    <option value="24">Mississippi</option>
    <option value="26">Missouri</option><option value="27">Montana</option>
    <option value="28">Nebraska</option><option value="39">Nevada</option>
    <option value="29">New Hampshire</option>
    <option value="30">New Jersey</option><option value="31">New Mexico</option>
    <option value="32">New York</option>
    <option value="33">North Carolina</option>
    <option value="34">North Dakota</option>
    <option value="35">Ohio</option><option value="36">Oklahoma</option>
    <option value="37">Oregon</option>
    <option value="38">Pennsylvania</option>
    <option value="40">Rhode Island</option>
    <option value="41">South Carolina</option>
    <option value="42">South Dakota</option>
    <option value="43">Tennessee</option>
    <option value="44">Texas</option>
    <option value="45">Utah</option>
    <option value="46">Vermont</option>
    <option value="47">Virginia</option>
    <option value="48">Washington</option>
    <option value="49">West Virginia</option>
    <option value="50">Wisconsin</option><option value="51">Wyoming</option>
</select>

How can we extract the text between > and < of each option tag?

Neeraj
  • Mandatory reading before throwing regular expressions at HTML: http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html – Oswald Mar 16 '13 at 10:26

1 Answer

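Try this. Note that the pattern matches the text inside each pair of angle brackets (the tag itself, such as `option value="1"` or `/option`), not the text between the tags:

preg_match_all('/(?<=<)[^>]+(?=>)/m', $subject, $result, PREG_PATTERN_ORDER);
foreach ($result[0] as $match) {
    // $match holds the text found inside one pair of angle brackets
}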

Explanation

"
(?<=    # Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
   <       # Match the character “<” literally
)
[^>]    # Match any character that is NOT a “>”
   +       # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(?=     # Assert that the regex below can be matched, starting at this position (positive lookahead)
   >       # Match the character “>” literally
)
"
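
If the goal is the text between the tags rather than the tags themselves, a capturing group around the option body is one way to get it. The following is only a minimal sketch: the `$html` sample is a shortened, hypothetical copy of the markup from the question, and the pattern assumes that option elements are never nested and that no attribute value contains a `>` character.

```php
<?php
// Hypothetical sample input: a shortened copy of the markup from the question.
$html = '<select name="states">
    <option value="">--  Select State / Province  --</option>
    <option value="1">Alabama</option><option value="2">Alaska</option>
</select>';

// Capture group 1 holds whatever sits between <option ...> and </option>.
// "i" makes the tag name case-insensitive, "s" lets "." span line breaks.
preg_match_all('~<option\b[^>]*>(.*?)</option>~is', $html, $matches);

// Decode entities such as &amp; and trim surrounding whitespace.
$labels = array_map(
    function ($text) {
        return trim(html_entity_decode($text, ENT_QUOTES, 'UTF-8'));
    },
    $matches[1]
);

print_r($labels);
```

`$matches[0]` still contains the full `<option>...</option>` matches; only `$matches[1]` contains the labels. The lazy `.*?` is what keeps two options on the same line from being merged into one match.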

EDIT

You may be better off using DOM rather than a regular expression:

<?php
$xml = <<< XML
<?xml version="1.0" encoding="utf-8"?>
<select name="states">
    <option value="">--  Select State / Province  --</option>
    <option value="1">Alabama</option>
    <option value="2">Alaska</option>
    <option value="4">Arizona</option>
    <option value="3">Arkansas</option>
    <option value="5">California</option>
    <option value="6">Colorado</option>
    <option value="7">Connecticut</option>
    <option value="8">Delaware</option>
    <option value="9">District Of Columbia</option>
    <option value="10">Florida</option>
    <option value="11">Georgia</option>
    <option value="12">Hawaii</option>
    <option value="13">Idaho</option>
    <option value="14">Illinois</option>
    <option value="16">Indiana</option>
    <option value="15">Iowa</option>
    <option value="17">Kansas</option>
    <option value="18">Kentucky</option>
    <option value="19">Louisiana</option>
    <option value="20">Maine</option>
    <option value="21">Maryland</option>
    <option value="23">Massachusetts</option>
    <option value="22">Michigan</option>
    <option value="25">Minnesota</option>
    <option value="24">Mississippi</option>
    <option value="26">Missouri</option>
    <option value="27">Montana</option>
    <option value="28">Nebraska</option>
    <option value="39">Nevada</option>
    <option value="29">New Hampshire</option>
    <option value="30">New Jersey</option>
    <option value="31">New Mexico</option>
    <option value="32">New York</option>
    <option value="33">North Carolina</option>
    <option value="34">North Dakota</option>
    <option value="35">Ohio</option>
    <option value="36">Oklahoma</option>
    <option value="37">Oregon</option>
    <option value="38">Pennsylvania</option>
    <option value="40">Rhode Island</option>
    <option value="41">South Carolina</option>
    <option value="42">South Dakota</option>
    <option value="43">Tennessee</option>
    <option value="44">Texas</option>
    <option value="45">Utah</option>
    <option value="46">Vermont</option>
    <option value="47">Virginia</option>
    <option value="48">Washington</option>
    <option value="49">West Virginia</option>
    <option value="50">Wisconsin</option>
    <option value="51">Wyoming</option>
</select>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);
$options = $dom->getElementsByTagName('option');
foreach ($options as $option) {
    echo $option->nodeValue, PHP_EOL;
}
?>

See the PHP manual's DOMDocument pages for more details. Hope this helps.
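
When the input is an HTML fragment rather than well-formed XML, `DOMDocument::loadHTML` may be a more forgiving choice than `loadXML`. The sketch below swaps it in and collects the options into an array keyed by their `value` attribute. The `$html` sample is a shortened, hypothetical copy of the markup from the question, and skipping the empty-valued placeholder option is an assumption about what is wanted.

```php
<?php
// Hypothetical sample input: a shortened copy of the markup from the question.
$html = '<select name="states">
    <option value="">--  Select State / Province  --</option>
    <option value="1">Alabama</option><option value="2">Alaska</option>
</select>';

$dom = new DOMDocument;
// Collect libxml warnings about imperfect markup instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$states = [];
foreach ($dom->getElementsByTagName('option') as $option) {
    $value = $option->getAttribute('value');
    if ($value === '') {
        continue; // assumption: the placeholder option is not wanted
    }
    $states[$value] = trim($option->textContent);
}

print_r($states);
```

Because PHP converts numeric string keys to integers, `$states` ends up keyed by the integer option values. Remove the `continue` branch if the placeholder entry should be kept.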

Cylian
  • I just want to fetch the string contained between > and <, so the result should be array[0] = blank, array[1] = -- Select State / Province --, array[2] = Alabama, and so on. – Neeraj Mar 16 '13 at 10:28
  • If the regex returned just the strings between the option tags, that would be great. – Neeraj Mar 16 '13 at 10:29
  • Thanks for your response. It may work for me, but I was looking for a regex method to parse such a string. – Neeraj Mar 16 '13 at 10:39
  • `RegEx` is not well suited to this job, trust me. If you need perfection and happiness, you must go for `DOM`. If you still insist, you may see this. – Cylian Mar 16 '13 at 10:45