
How to scrape the Lamp In Box data from the following string using RegExp

I want to scrape 1 Units using a regexp.

I have written the regexp below, but it's not working.

Regexp - Lamp In Box: '(.*)(s)<\/td>

`<td><b>Price:</b></td>         <td>
                                                                                        <br />Free Ground Shipping&nbsp;<span class="show_free_shipping" style="color:red;">[?]</span>
                                <br />Ship From United States
                            </td>
                        </tr>
                                                <tr>
                            <td><b>Availability:</b></td>
                            <td>
                                                        <b style="color:blue;">In Stock</b>
                                                        </td>
                        </tr>
                        <tr>
                            <td><b>Lamp In Box:</b></td>
                            <td>1 Unit(s)</td>
                        </tr>
    
                                            </table>
`
John
  • Why do you want to use a regex for this? There are parsers that will make this easier and more reliable. See: http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php If you were to use a regex, why would you want `([A-Z a-z])\w+`? That matches one capital letter, lowercase letter, or space, followed by one or more letters, digits, or underscores. – chris85 Jul 08 '16 at 10:53
  • I want to extract only this data: `Lamp In Box:1 Unit(s)`. How can I write a regexp for this? – John Jul 08 '16 at 10:57
  • @chris85 I am scraping data from this URL using the file_get_contents method: view-source:http://www.apexlamps.com/index.php?route=product/product&product_id=29 – John Jul 08 '16 at 11:01
  • Your updated regex is not strict and is **highly** likely to get incorrect data or fail unexpectedly. The `()` have special meaning in regex and should be escaped when they are meant to be literal. You will be better off using a parser; look at the link I posted. – chris85 Jul 08 '16 at 11:39
  • @chris85 Can you please write a regexp for my case? I am a newbie at this. – John Jul 08 '16 at 11:51
  • What is your end goal? What do you want to select? In your question you mentioned you want `1 Units`, but in the comments you said you want `Lamp In Box:1 Unit(s)`. What chris is telling you is that regex is most likely not the best tool for your job, so you might want to look at the link he provided instead :) – swlim Jul 08 '16 at 13:01
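
For reference, the parser approach suggested in the comments could look roughly like the sketch below. It uses PHP's built-in DOMDocument and DOMXPath and assumes the label and its value sit in adjacent `<td>` cells of the same row, as in the snippet from the question. The `$html` string is a trimmed copy of that snippet, and the variable names are illustrative only.

```php
<?php
// Trimmed copy of the table rows from the question (assumption: the real
// page keeps the same label cell / value cell layout).
$html = '<table>
<tr><td><b>Availability:</b></td><td><b style="color:blue;">In Stock</b></td></tr>
<tr><td><b>Lamp In Box:</b></td><td>1 Unit(s)</td></tr>
</table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // suppress warnings from sloppy real-world HTML
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Find the <td> whose <b> child reads "Lamp In Box:", then take the next <td>.
$nodes = $xpath->query('//td[b[normalize-space()="Lamp In Box:"]]/following-sibling::td[1]');

// Fall back to null when the label is not present on the page.
$lampInBox = ($nodes !== false && $nodes->length > 0)
    ? trim($nodes->item(0)->textContent)
    : null;

echo $lampInBox, "\n";
```

With this approach the surrounding whitespace and markup can change without breaking the lookup, as long as the label text and the cell order stay the same.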

2 Answers

To capture the number of units listed under Lamp In Box, you can try the following:

$string = '<your input string>';
$pattern = '/Lamp In Box:.*\s*.*?(\d+) Unit\(s\)/i';
preg_match($pattern, $string, $result);

If the pattern matches, $result[0] holds the full match and $result[1] holds the captured number.
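
A self-contained version of the above might look like the following sketch. The `$string` value is a trimmed copy of the relevant rows from the question, and the `$units` variable and the fallback message are illustrative additions, not part of the original answer.

```php
<?php
// Trimmed copy of the relevant rows from the question's HTML.
$string = "<tr>\n    <td><b>Lamp In Box:</b></td>\n    <td>1 Unit(s)</td>\n</tr>";

// The parentheses in "Unit(s)" are escaped so they match literally;
// (\d+) is the only capture group.
$pattern = '/Lamp In Box:.*\s*.*?(\d+) Unit\(s\)/i';

// preg_match returns 1 on a match, 0 on no match, and false on error.
if (preg_match($pattern, $string, $result) === 1) {
    $units = (int) $result[1];   // first capture group: the digits
    echo $units, "\n";
} else {
    $units = null;               // label missing or page layout changed
    echo "Lamp In Box not found\n";
}
```

Checking the return value before reading `$result[1]` avoids an undefined-index notice when the page does not contain the label.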

Yan Pak

If I understand you correctly, you can try the following regular expression:
/(<td>).*Lamp In Box:.*\s*.*?(<\/td>)/i

You can test it like this:

$string = '<your input string>';
$pattern = '/(<td>).*Lamp In Box:.*\s*.*?(<\/td>)/i';
$replacement = '$1$2';
echo preg_replace($pattern, $replacement, $string);

For the input in the question, this replaces everything from the <td> that contains Lamp In Box through the closing </td> of the value cell with an empty <td></td>.
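
To see the effect on a small input, here is a sketch that runs the same pattern over a trimmed copy of the rows from the question. The `$output` variable is an illustrative name. Note that this removes the Lamp In Box cells rather than extracting their value.

```php
<?php
// Trimmed copy of the relevant rows from the question's HTML.
$string = "<tr>\n    <td><b>Lamp In Box:</b></td>\n    <td>1 Unit(s)</td>\n</tr>";

// Group 1 is the opening <td> of the label cell, group 2 the closing </td>
// of the value cell; everything between them is dropped by the replacement.
$pattern = '/(<td>).*Lamp In Box:.*\s*.*?(<\/td>)/i';
$replacement = '$1$2';

$output = preg_replace($pattern, $replacement, $string);
echo $output, "\n";
```

On this input the label cell and the value cell collapse into a single empty `<td></td>` inside the row.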

Yan Pak