
I am using regular expressions with preg_replace() to find and replace a sentence in a piece of text. The $search_string contains plain text, HTML tags, and &nbsp; entities. The problem is that the &nbsp; entities are only sometimes converted to white space at run time, which makes it difficult to find and replace with str_replace(). So I'm trying to build a pattern that is equivalent to the search string and matches any text like it, whether or not that text contains the &nbsp; entities.

For example:

$search_string = 'Two years in,&nbsp;the company has expanded to 35 cities, five of which are outside the U.S. Plus,&nbsp;in&nbsp;April, <a href="site.com">ClassPass</a> acquired its main competitor,&nbsp;Fitmob.';

$pattern = $search_string(BUT IGNORE THE &nbsp; elements in the subject)

$subject = "text text text text text". $search_string . "text text text text text";

Following "A regular expression to exclude a word/string", I've tried:

     $pattern = '`^/(?!\&nbsp;)'.$search_string.'`';
     $output = preg_replace($pattern, $replacement_string,$subject);

The end result should be that if the $subject contains a string that is like my $search_string but without the &nbsp; entities, it will still match and be replaced with $replacement_string.

EDIT:

The actual values:

$subject = file_get_contents("http://venturebeat.com/2015/11/10/sources-classpass-raises-30-million-from-google-ventures-and-others/");

 $search_string = "Two years in,&nbsp;the company has expanded to 35 cities, five of which are outside the U.S. Plus,&nbsp;in&nbsp;April, ClassPass acquired its main competitor,&nbsp;Fitmob."; 

$replacement_string = "<span class='smth'>Two years in,&nbsp;the company has expanded to 35 cities, five of which are outside the U.S. Plus,&nbsp;in&nbsp;April, ClassPass acquired its main competitor,&nbsp;Fitmob.</span>"; 
user3857924
  • IGNORE or convert to space? – Madivad Nov 12 '15 at 12:54
  • I've tried that and it doesn't work. Maybe I'm wrong and str_replace() fails not because of &nbsp; but because of something else. It's worth mentioning that the $subject is actually a whole web page. Any other suggestions? It seems I need a regex pattern that will match any string in the subject that has, say, 90% similarity to the $search_string? – user3857924 Nov 12 '15 at 12:55
  • Please, could you give the link to the source page and the real search and replacement strings? – Casimir et Hippolyte Nov 12 '15 at 12:58
  • $subject = file_get_contents("http://venturebeat.com/2015/11/10/sources-classpass-raises-30-million-from-google-ventures-and-others/"); $search_string = "Two years in, the company has expanded to 35 cities, five of which are outside the U.S. Plus, in April, ClassPass acquired its main competitor, Fitmob."; $replacement_string = "Two years in, the company has expanded to 35 cities, five of which are outside the U.S. Plus, in April, ClassPass acquired its main competitor, Fitmob."; – user3857924 Nov 12 '15 at 13:07
  • that replacement string has the `&nbsp;` encoded into it. look to my answer below – Madivad Nov 12 '15 at 13:17
  • What's determining where you add the "smth" span class? – Madivad Nov 12 '15 at 13:44
  • also, you're using double quotes twice, ie in the replacement string and around the replacement string. – Madivad Nov 12 '15 at 13:45
  • I add the span class around all replacement strings – user3857924 Nov 12 '15 at 13:58
  • It's single quotes inside, I changed it. That's not the mistake though – user3857924 Nov 12 '15 at 14:01
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/94932/discussion-between-madivad-and-user3857924). – Madivad Nov 12 '15 at 14:02
  • Would it make anything easier to first do a `$subject = str_replace( '&nbsp;', ' ', $subject );` and then do the search when it only contains ordinary spaces? – Hasse Björk Nov 12 '15 at 22:46
  • I've tried that but it still doesn't work. For some reason, it's able to find and replace the string just before the &nbsp; and the one after. But not when you combine them. This made me think it is related to &nbsp;. – user3857924 Nov 12 '15 at 22:54
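
The normalization suggested in the comments can be sketched as below. One assumption here is not confirmed anywhere in the thread: the page may carry the no-break space both as the literal entity `&nbsp;` and as the raw U+00A0 character, so the sketch maps both to an ordinary space, in the subject and in the search string alike. The helper name `normalize_spaces` is hypothetical, the shortened strings are stand-ins for the real values, and the `"\u{00A0}"` escape needs PHP 7.0 or later.

```php
<?php
// Hypothetical helper: map "&nbsp;" and the raw U+00A0 character to a plain
// space. Assumes the text is UTF-8.
function normalize_spaces(string $text): string
{
    return str_replace(['&nbsp;', "\u{00A0}"], ' ', $text);
}

// Shortened stand-ins for the real values in the question.
$search_string = 'Two years in,&nbsp;the company has expanded';
$subject = "text Two years in,\u{00A0}the company has expanded text";

// Normalize both sides so a plain str_replace() compares like with like.
$needle = normalize_spaces($search_string);
$output = str_replace(
    $needle,
    "<span class='smth'>" . $needle . '</span>',
    normalize_spaces($subject)
);
```

Note that this rewrites every no-break space in the page, not only those inside the matched sentence, which may or may not be acceptable.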

1 Answer


Not a very efficient way of doing it, but it should work for you:

$output = preg_replace('~Two.*?years.*?in.*?the.*?company.*?has.*?expanded.*?to.*?35.*?cities.*?five.*?of.*?which.*?are.*?outside.*?the.*?U\.S\..*?Plus.*?in.*?April.*?ClassPass.*?acquired.*?its.*?main.*?competitor.*?Fitmob\.~s', '<span class=\'smth\'>$0</span>', $subject);
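
A tighter variant, offered only as a sketch: rather than putting `.*?` between every word, which can also swallow unrelated text, the pattern can be generated from `$search_string` itself so that only the spacing is flexible. The helper name `build_nbsp_tolerant_pattern` is hypothetical. The sketch assumes the subject is valid UTF-8 and that the only differences are in how each space or `&nbsp;` appears (entity, raw U+00A0, or ordinary whitespace). The strings are shortened stand-ins for the real values.

```php
<?php
// Hypothetical helper: turn a literal search string into a regex in which
// every run of spaces / "&nbsp;" / U+00A0 matches any mix of those variants.
function build_nbsp_tolerant_pattern(string $search): string
{
    // Split the search string on each run of space variants.
    $parts = preg_split('~(?:&nbsp;|\x{00A0}|\s)+~u', $search);

    // Escape every literal chunk, including the "~" delimiter.
    $quoted = array_map(function ($part) {
        return preg_quote($part, '~');
    }, $parts);

    // Rejoin the chunks with a sub-pattern that accepts any space variant.
    return '~' . implode('(?:&nbsp;|\x{00A0}|\s)+', $quoted) . '~u';
}

// Shortened stand-ins for the real values in the question.
$search_string = 'Two years in,&nbsp;the company';
$subject = "text Two years in,\u{00A0}the company text";

$pattern = build_nbsp_tolerant_pattern($search_string);
$output = preg_replace($pattern, '<span class=\'smth\'>$0</span>', $subject);
```

Because of the `u` modifier, `preg_replace()` returns `null` if the subject is not valid UTF-8, so the result is worth checking before use. Tags inside the sentence, such as the `<a>` around ClassPass, are still matched literally and would need separate handling.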
ezio4df