-1

If my number is 432987, the method below can be used:

$string = '<table><tr><td>432987</td></tr></table>';

preg_match_all("(\\d{6})", $string, $match);

var_dump($match[0]);

So the code above can be used to get a number of a specific length. If I don't know the length of the number, what could the solution be?

Examples of strings from which the number needs to be extracted/matched are below:

Snippet 1:
<table><tr><td>432987</td></tr></table>
Snippet 2:
<div>164PE
09983 PO#432987</div>
Snippet 3:
Order 432987IRC
Snippet 4:
432987

Let me know if more clarification is required.

The above is the edited part of the original question.

Always_a_learner
  • please see update. @WiktorStribiżew – Always_a_learner Oct 24 '16 at 13:29
  • 3
    Is it valid HTML? Use a DOM parser. – Wiktor Stribiżew Oct 24 '16 at 13:30
  • 2
    Obligatory link to **[don't parse XML with regex](http://stackoverflow.com/a/1732454/1954610)** – Tom Lord Oct 24 '16 at 13:38
  • 1
    What is expected? Snippet 1, 2, 3, and 4 all bring back `432987`. Should #3 have `IRC`? – chris85 Oct 24 '16 at 14:49
  • I think the code I am using returns the correct answer. But if I don't know the length of the number, then my solution (mentioned in the question) will not be useful. @chris85 yes it will contain IRC. – Always_a_learner Oct 25 '16 at 05:32
  • 1
    @simer you really need to **CLARIFY** your question. What exactly do you want to "find" and what are the criteria?.. so you want to find any length of number and any other characters surrounding it between whitespace, Yes? please **EDIT** your question and set these crucial details . Thanks. – Martin Oct 25 '16 at 09:03
  • @Martin I have edited my question and wrote all the necessary details. sorry I was in hurry when I posted the question as I had urgent requirement of a solution. Thanks. – Always_a_learner Oct 25 '16 at 09:23
  • 1
    @Simer thanks for the update, however you tell Chris85 that your results should contain the "IRC" txt but make no mention of this type of requirement in your edited question. If all you need is a number of length `x` digits my answer will suit you perfectly (just update the `{6}-->{x}`. `:-)` – Martin Oct 25 '16 at 10:06
  • @Martin I was mistaken and didn't understand what chris85 meant. I thought he was talking about the string to be passed. According to you, I have to pass the length, and there is no way for it to pick up numbers of different lengths. My requirement is to extract an order no. from a string, and the order no. can be 6 digits or longer. Anyway, thanks for the best answer. – Always_a_learner Oct 25 '16 at 10:28
  • @Simer Glad to help, but if there are any other criteria needed for this question, update the question and I will improve my answer for you `:)` – Martin Oct 25 '16 at 11:06

2 Answers

2

I originally wasn't going to answer this, but reading Tom Lord's link about the mystical Regex parsing of XML made me reconsider.

Regex CAN be used to parse all examples shown because the XHTML is "fluff" and is entirely unimportant for the finding of the number(s). Yes, some instances of XHTML will potentially contain 6 numeric characters in a row, but that's unlikely at best, and for the perceived scale of this application (ie not complex or massive, judging by the snippets given), it's doubtful that will be an issue.

The resultant output is not at all [X]HTML dependent in any form.

Quote:

Snippet 1:

 <table><tr><td>432987</td></tr></table> 

Snippet 2:

  <div>164PE 09983
   PO#432987</div>

Snippet 3:

    Order 432987IRC 

Snippet 4:

     432987

To solve all of these and return your missing number, 432987, you can simply do this:

$string = 'Order 432987IRC'; // or any other snippet from above

preg_match_all("/[0-9]{6}/", $string, $match);

This will match any unbroken run of 6 digits.
Full proof:

    $string1 = "<table><tr><td>432987</td></tr></table>";
    $string2 = "<div>164PE
09983 PO#432987</div>";
    $string3 = "Order 432987IRC";
    $string4 = "432987"; 
    $string5 = "<html><head><title>Some numbers</title></head>
    <body><h2>Oh my word, this is HTML being attacked by Regex!!!</h2>
    <p>This must be Doooom! 123456</p>
    </body>
    </html>";

    preg_match_all("/[0-9]{6}/", $string5, $match);

    print_r($match);

Alternatively you can use regex number identifier \d and so:

    preg_match_all("/\d{6}/", $string5, $match);

Does exactly the same thing.

I have assumed you want a 6-digit number. If you already know what the number is and it will stay static, then it's easier to use PHP string find and replace functions such as str_replace, etc.
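The asker's comments say the order number can be 6 digits or longer, in which case a fixed `{6}` would cut a longer number short. Here is a minimal sketch, assuming the minimum length of 6 still holds: the open-ended quantifier `{6,}` matches a run of six or more digits. The helper name `findOrderNumbers` and its `$minDigits` parameter are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: returns every run of at least $minDigits digits.
function findOrderNumbers(string $text, int $minDigits = 6): array
{
    // {n,} means "n or more". The quantifier is greedy, so a longer
    // number is returned whole rather than cut off at $minDigits digits.
    preg_match_all('/\d{' . $minDigits . ',}/', $text, $matches);
    return $matches[0];
}

// The four snippets from the question.
$snippets = [
    '<table><tr><td>432987</td></tr></table>',
    "<div>164PE\n09983 PO#432987</div>",
    'Order 432987IRC',
    '432987',
];

foreach ($snippets as $snippet) {
    print_r(findOrderNumbers($snippet));
}
```

Each of the four snippets yields the single match 432987, because the shorter runs in snippet 2 (164 and 09983) fall below the minimum. A 7-digit input such as `PO#4329871` would come back as 4329871, where `{6}` would return only 432987.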

Edit: Some Further reading.

Martin
  • Isn't it the same solution that I have mentioned in question itself? Thanks. – Always_a_learner Oct 25 '16 at 05:33
  • 1
    @Simer kind of, your syntax is wrong but your idea is the same. I've just shown that this method works for all given snippets. If you've solved your question in your question why post it on StackOverflow? `:-D` – Martin Oct 25 '16 at 08:38
0
$string = '<table><tr><td>432987</td></tr></table>';

$table = new SimpleXMLElement( $string );

echo $table->tr->td; //432987

You can't parse XML with regex; using SimpleXMLElement in this case will solve your problem. More information in this post.
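Only two of the four snippets in the question are markup, so a parser alone does not cover every case. Here is one possible sketch that combines both approaches, assuming the markup snippets are well-formed XML: read the text content with `DOMDocument` when the input parses, fall back to the raw string when it does not, then pick out the digit runs. The helper name `extractNumbers` and the minimum length of 6 are assumptions for illustration.

```php
<?php
// Hypothetical helper: use the XML text content when the input parses,
// otherwise fall back to the raw string, then pick out the digit runs.
function extractNumbers(string $input, int $minDigits = 6): array
{
    // Collect libxml parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $text = $input;
    if ($doc->loadXML($input) && $doc->documentElement !== null) {
        // textContent concatenates the text of all descendant nodes,
        // so tag names and attribute values are left out.
        $text = $doc->documentElement->textContent;
    }

    // Restore the previous libxml error handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    preg_match_all('/\d{' . $minDigits . ',}/', $text, $matches);
    return $matches[0];
}

print_r(extractNumbers('<table><tr><td>432987</td></tr></table>'));
print_r(extractNumbers('Order 432987IRC'));
```

The design choice here is that digits inside attributes are ignored when the input is valid XML: for `<td id="c1234567">432987</td>` only 432987 is returned, whereas a regex over the raw string would also pick up 1234567.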

Jared Chu
  • 2
    Although this code may help to solve the problem, it doesn't explain _why_ and/or _how_ it answers the question. Providing this additional context would significantly improve its long-term educational value. Please [edit] your answer to add explanation, including what limitations and assumptions apply. – Toby Speight Oct 24 '16 at 16:33
  • 1
    @jared chu Please check the update I have mentioned. I have mentioned 4 different strings to be parsed. – Always_a_learner Oct 25 '16 at 05:34