While there are many ways to skin this cat, and most people would insist that a regular expression solution is an absolute no-go, it seems to me that you are already there: your code produces the correct result in $value[2], an array holding the matches of the second capturing group. Here is a psysh session executing your code:
>>> $html = '
<tr><td>AD - Andorra<td>CA - Canada
<tr><td>AE - United Arab Emirates<td>PR - Puerto Rico
<tr><td>AF - Afghanistan<td>US - United States of America
<tr><td>AG - Antigua and Barbuda<td>
';
preg_match_all('/<td>(.*?)<td>(.*?)\n/s', $html, $value);
print_r($value);
... ... ... ... ... => """
\n
<tr><td>AD - Andorra<td>CA - Canada\n
<tr><td>AE - United Arab Emirates<td>PR - Puerto Rico\n
<tr><td>AF - Afghanistan<td>US - United States of America\n
<tr><td>AG - Antigua and Barbuda<td>\n
"""
>>> => 4
>>> Array
(
[0] => Array
(
[0] => <td>AD - Andorra<td>CA - Canada
[1] => <td>AE - United Arab Emirates<td>PR - Puerto Rico
[2] => <td>AF - Afghanistan<td>US - United States of America
[3] => <td>AG - Antigua and Barbuda<td>
)
[1] => Array
(
[0] => AD - Andorra
[1] => AE - United Arab Emirates
[2] => AF - Afghanistan
[3] => AG - Antigua and Barbuda
)
[2] => Array
(
[0] => CA - Canada
[1] => PR - Puerto Rico
[2] => US - United States of America
[3] =>
)
)
=> true
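Outside of psysh, picking out that second column could look like the minimal sketch below. The variable names $secondColumn and $nonEmpty are my own, and dropping the empty entry from the last row is an assumption about what you want, not something your question states.

```php
<?php
$html = '
<tr><td>AD - Andorra<td>CA - Canada
<tr><td>AE - United Arab Emirates<td>PR - Puerto Rico
<tr><td>AF - Afghanistan<td>US - United States of America
<tr><td>AG - Antigua and Barbuda<td>
';

// Same pattern as in the question: two lazy capture groups per row.
preg_match_all('/<td>(.*?)<td>(.*?)\n/s', $html, $value);

// $value[2] holds the second capture group of every match.
$secondColumn = $value[2];

// Assumed cleanup step: drop empty strings (the last row has no second cell)
// and reindex the array from 0.
$nonEmpty = array_values(array_filter(
    $secondColumn,
    static function (string $cell): bool {
        return $cell !== '';
    }
));

print_r($nonEmpty);
```

If you need to keep the row positions aligned with the first column, skip the filtering step and use $value[2] as it is.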
You can modify the regular expression to capture only the second column by turning the first group into a non-capturing group: '/<td>(?:.*?)<td>(.*?)\n/s' (notice the ?: added after the first opening parenthesis). Your desired result then sits in $value[1]. The modified code executed:
>>> $html = '
<tr><td>AD - Andorra<td>CA - Canada
<tr><td>AE - United Arab Emirates<td>PR - Puerto Rico
<tr><td>AF - Afghanistan<td>US - United States of America
<tr><td>AG - Antigua and Barbuda<td>
';
preg_match_all('/<td>(?:.*?)<td>(.*?)\n/s', $html, $value);
print_r($value);
... ... ... ... ... => """
\n
<tr><td>AD - Andorra<td>CA - Canada\n
<tr><td>AE - United Arab Emirates<td>PR - Puerto Rico\n
<tr><td>AF - Afghanistan<td>US - United States of America\n
<tr><td>AG - Antigua and Barbuda<td>\n
"""
>>> => 4
>>> Array
(
[0] => Array
(
[0] => <td>AD - Andorra<td>CA - Canada
[1] => <td>AE - United Arab Emirates<td>PR - Puerto Rico
[2] => <td>AF - Afghanistan<td>US - United States of America
[3] => <td>AG - Antigua and Barbuda<td>
)
[1] => Array
(
[0] => CA - Canada
[1] => PR - Puerto Rico
[2] => US - United States of America
[3] =>
)
)
=> true
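As a quick sanity check, the sketch below runs both patterns on the same input and compares the second-column results. The names $capturing, $nonCapturing, $withBoth and $secondOnly are hypothetical, chosen only for this illustration.

```php
<?php
$html = '
<tr><td>AD - Andorra<td>CA - Canada
<tr><td>AE - United Arab Emirates<td>PR - Puerto Rico
<tr><td>AF - Afghanistan<td>US - United States of America
<tr><td>AG - Antigua and Barbuda<td>
';

$capturing    = '/<td>(.*?)<td>(.*?)\n/s';   // both columns captured
$nonCapturing = '/<td>(?:.*?)<td>(.*?)\n/s'; // first column matched but not captured

preg_match_all($capturing, $html, $withBoth);
preg_match_all($nonCapturing, $html, $secondOnly);

// With the first group non-capturing, the second column moves from index 2 to index 1.
var_dump($withBoth[2] === $secondOnly[1]); // bool(true)

// The result now holds only the full matches (index 0) and one group (index 1).
var_dump(count($secondOnly)); // int(2)
```

Both variants match the same text; the only difference is how many groups end up in the result array.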