
I need to split a PHP string into an array of text and URLs. For instance, given

$string = "Hello, my name is http://www.audio.com/1234.mp3/. Today is https://radio.org/weather.wav";

The expected output should be something like:

$a[0] = "Hello, my name is";
$a[1] = "http://www.audio.com/1234.mp3/";
$a[2] = ". Today is";
$a[3] = "https://radio.org/weather.wav";

Any suggestions?

Thanks in advance,

Leo

Leo Burd
  • Why do you need it like this? If you want to convert the urls to links, it's easier to use `preg_replace` – Czechnology Feb 26 '11 at 18:31
  • See http://stackoverflow.com/questions/287144/need-a-good-regex-to-convert-urls-to-links-but-leave-existing-links-alone for example. – mellamokb Feb 26 '11 at 18:33
  • My goal is to produce a function that reads text out loud, replacing text with already existing audio files whenever appropriate. For instance, `Say("hello $name!");` might produce different results depending on whether `$name` is text or the URL of a sound file... – Leo Burd Feb 27 '11 at 14:28

1 Answer


You cannot split it easily. But a workaround would be to match it in pairs using something like:

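// Match each run of text (group 1) followed by a URL (group 2).
// The trailing \K resets the match start, so the whole-match entry [0] comes back empty.
preg_match_all('#(.*?)(https?://\S+(?<![,.]))\K#s', $str, $m,
               PREG_SET_ORDER);
// Flatten the per-match arrays into one list.
$list = call_user_func_array("array_merge", $m);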

The call_user_func_array call is another workaround that avoids flattening the array manually. This method does leave empty entries in between, however:

Array
(
    [0] => 
    [1] => Hello, my name is 
    [2] => http://www.audio.com/1234.mp3/
    [3] => 
    [4] =>  Today is 
    [5] => https://radio.org/weather.wav
)

Also note that the simplistic URL regex ate up the period. (To avoid that, use an explicit character class for the URL instead of a lookbehind.)
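A minimal sketch of the `preg_split()` route mentioned in the comments, assuming the same simplistic URL pattern as above. The helper name `split_text_and_urls` is hypothetical, and trimming the pieces and dropping empty ones is an added assumption made to match the output format asked for in the question.

```php
<?php
// Hypothetical helper: splits a string into alternating text and URL pieces.
function split_text_and_urls(string $str): array
{
    // The capturing group makes preg_split() return the URLs (the delimiters)
    // along with the text between them. The lookbehind keeps a trailing
    // comma or period out of the URL.
    $parts = preg_split(
        '#(https?://\S+(?<![,.]))#',
        $str,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );
    if ($parts === false) {
        return []; // the regex engine reported an error
    }
    // Assumption: surrounding whitespace is not wanted, and empty pieces are dropped.
    $parts = array_map('trim', $parts);
    return array_values(array_filter($parts, 'strlen'));
}

$string = "Hello, my name is http://www.audio.com/1234.mp3/. Today is https://radio.org/weather.wav";
print_r(split_text_and_urls($string));
```

For the sample string from the question this yields the four pieces listed in the expected output, with the period kept in the text piece `. Today is`.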

mario
  • There is preg_split(), which has a few options to capture delimiters and whatnot as well. – Marc B Feb 26 '11 at 19:23
  • Thanks so much for your answer and comments, mario and Marc B! How difficult would it be to get the punctuation (!.,?;) after the URL? Is that possible at all? As mentioned above, my goal is to produce a function that reads text out loud, replacing text with already existing audio files whenever appropriate. For instance, `Say("hello $name!");` might produce different results depending on whether `$name` is text or the URL of a sound file. If that's not possible, one alternative would be to require URLs to always have a space after them... Would that make sense? – Leo Burd Feb 27 '11 at 14:32
  • @user362794: You could indeed use `preg_split` as Marc said. It might return the desired result without workarounds: `$list = preg_split('#(https?://\S+(?<![,.]))#', $str, -1, PREG_SPLIT_DELIM_CAPTURE);` – mario Feb 27 '11 at 14:47
  • @mario and @MarcB, Sorry for the additional questions, but what do I need to do in order to prevent punctuation such as `! ? ; :` from being included in the parsed URL? Any ideas? That's the sort of thing that might happen in situations such as `Say("Hi $audio_name_url!");` or `Say("Weather for $location: $weather.");` ... Thanks once again! – Leo Burd Feb 28 '11 at 13:14
  • @elbee: Extend the negative lookbehind assertion `(?<![,?...])`. See http://www.regular-expressions.info/lookaround.html -- Btw, only one @notification per comment works. – mario Feb 28 '11 at 13:16
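
The extended lookbehind from the last comment might look like the sketch below. It relies on `preg_split()` with `PREG_SPLIT_DELIM_CAPTURE` and a single capture group returning text at even indices and URLs at odd indices. The function name `tag_text_and_urls`, the exact punctuation set, and the example.com URLs are assumptions for illustration.

```php
<?php
// Hypothetical helper: labels each piece of the input as 'text' or 'url'.
function tag_text_and_urls(string $str): array
{
    // The lookbehind lists the punctuation that must not end a URL.
    $parts = preg_split(
        '#(https?://\S+(?<![,.!?;:]))#',
        $str,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );
    if ($parts === false) {
        return []; // the regex engine reported an error
    }
    $tagged = [];
    foreach ($parts as $i => $part) {
        if ($part === '') {
            continue; // skip empty text before, between, or after URLs
        }
        // With one capture group, odd indices hold the captured URLs.
        $tagged[] = [$i % 2 === 1 ? 'url' : 'text', $part];
    }
    return $tagged;
}

print_r(tag_text_and_urls("Hi http://example.com/leo.mp3! Weather: http://example.com/rain.wav."));
```

Here the `!` and the final `.` end up in text pieces rather than in the URLs, so a `Say()`-style function could play the `url` pieces and read the `text` pieces aloud.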