2

I've got a string $newstring loaded with lines that look like:

<tt>Thu 01-Mar-2012</tt> &nbsp; 7th of Atrex, 3009 <br>

I want to explode $newstring using <tt> and <br> as the delimiters.

How can I use preg_split() or anything else to explode it?

Pankit Kapadia
  • So what have you tried so far? How did you get to these strings? – Ja͢ck Dec 27 '12 at 05:37
  • Taking your directions literally, I would expect the result to have these three pieces: _empty string_, `Thu 01-Mar-2012  7th of Atrex, 3009`, _empty string_. For the sake of clarity, what is your expected result? – Wiseguy Dec 27 '12 at 05:37
  • These strings are from a curl of a web page. I'm trying to clean up the string by exploding it into substrings that are delimited by `<tt>` and `<br>`. I'm new to regular expressions, so I'm trying to find a preg_split expression that will do it.
    – Jordan Fine Dec 27 '12 at 05:42
  • Do you expect the results to have two parts thus: `Thu 01-Mar-2012` and `  7th of Atrex, 3009`? – Wiseguy Dec 27 '12 at 05:44
  • I expect it to have one part... `/<tt>|<br>/` doesn't work ... not sure if it needs to be escaped or something
    – Jordan Fine Dec 27 '12 at 05:46
  • Explain the expected output, let's start with that. – Ja͢ck Dec 27 '12 at 05:47
  • @AndyLester, wrong site copypasta. I think you meant the http://htmlparsing.com/php.html one. – Charles Dec 27 '12 at 05:52
  • You are trying to parse HTML with regular expressions. Don't do that. Use a proper DOM parser. There are examples at http://htmlparsing.com/php.html – Andy Lester Dec 27 '12 at 05:55
  • Why can't I use regular expressions? The expected output is to take a string that's been loaded with HTML, and explode it into an array of strings that looks like the one in the original question... it's all just text in a variable, no? – Jordan Fine Dec 27 '12 at 06:05
  • @JordanFine: [read this](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) if you want to know why you shouldn't parse markup with regex :) – Elias Van Ootegem Dec 27 '12 at 06:26
  • Do note that it's just a "shouldn't" in most cases. The other (less comical) answers in that question will give you more context. Nine times out of ten, you'll want to break out the DOM. – Charles Dec 27 '12 at 06:33
  • @JordanFine why don't you just update the question with expected output ..... – Baba Dec 27 '12 at 07:08

6 Answers

1

Alright, I'm on my Nexus 7, and I've found it isn't too elegant to answer questions on a tablet, but regardless, you can do this with preg_split and the following regex:

<\/?tt>|</?br>

See the regex working here: http://www.regex101.com/r/kX0gE7

PHP code:

$str = '<tt>Thu 01-Mar-2012</tt>  7th of Atrex, 3009<br>';
$split = preg_split('@<\/?tt>|</?br>@', $str);

var_export($split);

The array $split will contain:

array ( 
    0 => '', 
    1 => 'Thu 01-Mar-2012', 
    2 => '  7th of Atrex, 3009', 
    3 => '' 
)

(See http://ideone.com/aiTi5U)
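
If the empty first and last entries are unwanted, a variation along these lines may help. It is only a sketch, not part of the original answer: the `<br\s*/?>` alternative (to tolerate `<br/>` and `<br />`), the case-insensitive `i` modifier, and the trimming step are assumptions about what the cleaned-up result should look like.

```php
<?php
$str = '<tt>Thu 01-Mar-2012</tt> &nbsp; 7th of Atrex, 3009 <br>';

// Split on <tt>, </tt>, and the common spellings of <br>.
// The third argument (-1) means "no limit"; PREG_SPLIT_NO_EMPTY is the
// fourth argument and drops the empty leading/trailing pieces.
$parts = preg_split('@</?tt>|<br\s*/?>@i', $str, -1, PREG_SPLIT_NO_EMPTY);

// Trim surrounding whitespace from each remaining piece.
$parts = array_map('trim', $parts);

var_export($parts);
```

Note that the `&nbsp;` entity is left untouched here; decoding it would be a separate step.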

Jay
  • Somehow it's not displaying the new variable when I try to echo it: `$newtring = preg_split('\<tt\>|\<\/tt\>|\<br\>', $curl_scraped_page);` `$curl_scraped_page` is confirmed to contain data, but `echo $newstring` doesn't display anything.
    – Jordan Fine Dec 27 '12 at 05:52
  • @JordanFine You forgot the delimiters at the beginning and end of the regex. All the PHP preg_XXX functions require them. – Barmar Dec 27 '12 at 06:30
  • The forward slashes `/` are my delimiters, though I think I'll switch them for `@` because the regex itself uses forward slashes. Updating my answer, thanks. – Jay Dec 28 '12 at 15:32
0

Try this code:

<?php

$newstring = "<tt>Thu 01-Mar-2012</tt> &nbsp;7th of Atrex, 3009<br>";

// Split on <tt>. Element [1] now holds
// "Thu 01-Mar-2012</tt> &nbsp;7th of Atrex, 3009<br>", so operate on that.
$newstring = explode("<tt>", $newstring);

// Split on <br>. Element [0] is everything before it.
$newstring = explode("<br>", $newstring[1]);
echo $newstring[0];
?>

Output:

Thu 01-Mar-2012</tt> &nbsp;7th of Atrex, 3009
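
If `$newstring` holds several such lines, the same two-step explode could be run once per record. The following is a rough sketch under the assumption that every record ends with `<br>`; the second input line is invented sample data, and `$records` is just an illustrative variable name.

```php
<?php
// Two records; the second one is made-up sample input.
$newstring = "<tt>Thu 01-Mar-2012</tt> &nbsp;7th of Atrex, 3009<br>\n"
           . "<tt>Fri 02-Mar-2012</tt> &nbsp;8th of Atrex, 3009<br>";

$records = array();

// Each chunk is the text of one record up to (not including) its <br>.
foreach (explode("<br>", $newstring) as $chunk) {
    // Keep only what follows the first <tt>; skip chunks without one.
    $pieces = explode("<tt>", $chunk, 2);
    if (count($pieces) === 2) {
        $records[] = $pieces[1];
    }
}

print_r($records);
```

Each entry still contains the closing `</tt>`, exactly as in the single-line version above.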
Sandy8086
  • This just says "Array" as the output – Jordan Fine Dec 27 '12 at 06:01
  • I am displaying the result stored in $newstring. If you want to use the result in your program, the output value is in $newstring[0]. – Sandy8086 Dec 27 '12 at 06:04
  • Shouldn't newstring now only include all substrings starting with `<tt>` and ending with `<br>`? So if I echo $newstring, shouldn't it display them?
    – Jordan Fine Dec 27 '12 at 06:07
  • To handle all the substrings, pass them one by one: store them in an array, loop over it, and use the code above inside the loop to display each result. – Sandy8086 Dec 27 '12 at 06:16
0

You should try this code:

<?php
$keywords = preg_split("/\<tt\>|\<br\>/", "<tt>Thu 01-Mar-2012</tt> &nbsp; 7th of Atrex, 3009 <br>");
print_r($keywords);
?>

Look at the CodePad example.

If you also want to split on </tt>, use <\/?tt>|<br> instead. See Example.
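
When the list of tags to split on grows, the alternation can also be built from plain strings instead of being written by hand. This is a minimal sketch of that idea; `splitOnAny` is a hypothetical helper name, and `~` is an arbitrary choice of pattern delimiter.

```php
<?php
// Hypothetical helper: split $string on any of the literal strings
// listed in $delimiters (expects at least one delimiter).
function splitOnAny(array $delimiters, $string) {
    $quoted = array();
    foreach ($delimiters as $d) {
        // preg_quote() escapes regex metacharacters; passing '~' also
        // escapes the character used as the pattern delimiter.
        $quoted[] = preg_quote($d, '~');
    }
    return preg_split('~' . implode('|', $quoted) . '~', $string);
}

$keywords = splitOnAny(
    array('<tt>', '</tt>', '<br>'),
    '<tt>Thu 01-Mar-2012</tt> &nbsp; 7th of Atrex, 3009 <br>'
);
print_r($keywords);
```

Because empty pieces are not filtered out here, the result starts and ends with an empty string.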

Ashwini Agarwal
0

If the <tt> and <br/> tags are the only tags in the string, a simple regex like this will do:

$exploded = preg_split('/<[^>]+>/', $newstring, -1, PREG_SPLIT_NO_EMPTY);

The expression:
Delimiters start with < and end with >.
In between these characters, at least one [^>] is expected (that is, any character except the closing >).

PREG_SPLIT_NO_EMPTY
This is a flag constant, passed as the fourth argument to preg_split (the third argument is the limit, where -1 means no limit). It keeps empty strings out of the resulting array:

$newstring = '<tt>Foo<br/><br/>Bar</tt>';
$exploded = preg_split('/<[^>]+>/', $newstring);
// output: array('', 'Foo', '', 'Bar', '')
$exploded = preg_split('/<[^>]+>/', $newstring, -1, PREG_SPLIT_NO_EMPTY);
// output: array('Foo', 'Bar')

If, however, you're dealing with more than these two tags, or with variable input (such as user-supplied markup), you might be better off parsing the markup. Look into PHP's DOMDocument class; see the docs here.
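
As a rough illustration of the DOMDocument route, something like the following could collect the text pieces. Treat it as a sketch under assumptions: the wrapping `<div id="wrap">` and the `$parts` variable are only there for this example, and warnings about imperfect markup are silenced rather than handled.

```php
<?php
$html = '<tt>Thu 01-Mar-2012</tt> &nbsp; 7th of Atrex, 3009 <br>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// Wrap the fragment so there is one known container to query.
$doc->loadHTML('<div id="wrap">' . $html . '</div>');
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$parts = array();

// Walk every text node inside the wrapper and keep the non-empty ones.
foreach ($xpath->query('//div[@id="wrap"]//text()') as $node) {
    $text = trim($node->nodeValue);
    if ($text !== '') {
        $parts[] = $text;
    }
}

var_dump($parts);
```

The parser decodes `&nbsp;` into a real non-breaking space, which `trim()` does not strip, so the second piece may still begin with that character.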

PS: to see the actual output, try echo '<pre>'; var_dump($exploded); echo '</pre>';

Elias Van Ootegem
0
function multiExplode($delimiters, $string) {
    // Map every delimiter after the first onto the first one, then split on it.
    $first = array_shift($delimiters);
    return explode($first, strtr($string, array_fill_keys($delimiters, $first)));
}

Example: $values = multiExplode(array("<tt>", "<br>"), $your_string);

Ionut Panescu
-1

Here is a custom function with example.

http://www.phpdevtips.com/2011/07/exploding-a-string-using-multiple-delimiters-using-php/

Jirilmon