
I have a long string and would like to replace every dot "." with a question mark. However, my string includes domain extensions such as .com, which I would like to skip when replacing.

Is there any way I can provide an array such as (".com", ".net", ".org") of phrases to skip when replacing using str_replace() or a similar function?

Input sentence:

$string = "An example of a website would be google.com or similar. However this is not what we are looking for";

The following:

str_replace(".", "?", $string);

Produces:

An example of a website would be google?com or similar? However this is not what we are looking for

Desired output:

An example of a website would be google.com or similar? However this is not what we are looking for

I would like to provide an array of domain extensions to skip when replacing, such as:

$skip = array(".com",".net",".org");

and wherever those appear, don't substitute the dot with a question mark.

EDIT: Looks like I need to use a negative lookahead with preg_replace. However, I'm not sure how to put it all together: "look for a full stop that is NOT followed by COM or NET or ORG".

onlybarca6
  • You've tagged this with regex and preg replace. Have you tried using regex? I don't see it in your question. – M. Eriksson Apr 08 '21 at 13:19
  • I am not sure what regex to use for it, and can't find anything. – onlybarca6 Apr 08 '21 at 13:20
  • See https://stackoverflow.com/questions/2631010/a-regex-to-match-a-substring-that-isnt-followed-by-a-certain-other-substring, namely https://stackoverflow.com/a/2631107/3832970 – Wiktor Stribiżew Apr 08 '21 at 13:21
  • Have you done any real research into regex, how it works and tried anything? [regex101.com](https://regex101.com/) is a very good site for testing (and learning) regex. – M. Eriksson Apr 08 '21 at 13:21
  • @WiktorStribiżew Yes, I've already seen that link, however I'm not sure how to apply it to my needs. Looks like I need to use a negative lookahead, such as (?!.*bar) however not sure how to combine it with: Look for a full stop > that isn't followed by COM or ORG or NET. Can anyone help please? – onlybarca6 Apr 08 '21 at 13:25
  • No need using `.*`, you want to match something right after a char. BTW, [here is how to match a dot](https://stackoverflow.com/questions/13989640/regular-expression-to-match-a-dot). – Wiktor Stribiżew Apr 08 '21 at 13:26
  • I would split the string by the values from the $skip array, then delete everything that is not needed. Then I would restore the string by gluing the array elements back together using the extensions from $skip. – kubarik Apr 08 '21 at 13:36

1 Answer


You need

$result = preg_replace('~\.(?!(?:com|org|net)\b)~', '?', $string);

See the regex demo. Details:

  • \. - a dot
  • (?! - start of a negative lookahead: the dot must not be immediately followed by
    • (?:com|org|net) - com, org, net substrings...
    • \b - as whole words (it is a word boundary)
  • ) - end of the negative lookahead.
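
To see what the word boundary buys you, here is a minimal sketch with two made-up sample strings (they are assumptions for illustration, not from the question). Without \b, a dot before a word that merely starts with com, such as community, would be skipped too:

```php
<?php
// Same pattern as above; the sample sentences are hypothetical.
$pattern = '~\.(?!(?:com|org|net)\b)~';

// ".com" is a whole word here, so that dot is kept;
// the two sentence-ending dots are replaced.
$kept = preg_replace($pattern, '?', 'Visit example.com. Done.');

// "community" only starts with "com": there is no word boundary
// after "com", so the lookahead does not protect this dot.
$replaced = preg_replace($pattern, '?', 'Read the notes.community file.');

echo $kept, PHP_EOL;     // Visit example.com? Done?
echo $replaced, PHP_EOL; // Read the notes?community file?
```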

NOTE: To match the TLDs case-insensitively, add the i flag after the trailing regex delimiter (here, ~i).

See a PHP demo:

$string = "An example of a website would be google.com or similar. However this is not what we are looking for";
$tlds = ['com', 'org', 'net'];
echo preg_replace('~\.(?!(?:' . implode('|', $tlds) . ')\b)~i', '?', $string);
// => An example of a website would be google.com or similar? However this is not what we are looking for
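
If you want to keep the $skip array exactly as written in the question (with the leading dots), one possible way is to strip the dot from each entry and escape it before building the alternation. This is only a sketch: the $alternatives and $pattern variable names are mine, and the use of ltrim() and preg_quote() is a suggested hardening step, not something the question requires.

```php
<?php
$string = "An example of a website would be google.com or similar. However this is not what we are looking for";
$skip = array(".com", ".net", ".org");

// Drop the leading dot from each extension and escape any regex
// metacharacters (and the ~ delimiter) so entries are matched literally.
$alternatives = array_map(
    function ($ext) {
        return preg_quote(ltrim($ext, '.'), '~');
    },
    $skip
);

// Build: a dot NOT followed by one of the extensions as a whole word.
$pattern = '~\.(?!(?:' . implode('|', $alternatives) . ')\b)~i';

$result = preg_replace($pattern, '?', $string);
echo $result, PHP_EOL;
// => An example of a website would be google.com or similar? However this is not what we are looking for
```

Escaping matters once the list contains entries such as ".co.uk", where the inner dot would otherwise match any character.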
Wiktor Stribiżew