I want to search a text for the less-than sign `<` between dollar signs, as in `$x<y$`, and replace it with `$x < y$`.

I am using MathJax, and the less-than sign causes problems when MathJax renders the page (see http://docs.mathjax.org/en/latest/tex.html#tex-and-latex-in-html-documents).

I tried $text = preg_replace("/\$(.*?)(<)(.*?)\$/","/\$$1 < $3\$/",$text) but I am not sure if this is a good solution. I am new to programming:)

Thank you for your help.

mac
  • Any reason you are using backslashes instead of forward slashes? I think this may be a mistake :) – Nathan Robb Jul 28 '16 at 14:17
  • Do you only want to allow for one character between the `$` and the `<` symbol; or are you trying to check for PHP variables? Maybe https://regex101.com/r/hO7cX5/1 – chris85 Jul 28 '16 at 14:19
  • @NathanRobb you're right, edited. – mac Jul 28 '16 at 14:19
  • Have you considered something as simple as `$text = str_replace('<', '&lt;', $text);`? – Brandon Horsley Jul 28 '16 at 14:22
  • @BrandonHorsley How does that solve mac's problem? He's not trying to replace it with an ampersand. – Nathan Robb Jul 28 '16 at 14:23
  • @chris85 I want to find all the `<` signs in the equations like `$first formula – mac Jul 28 '16 at 14:24
  • @mac what about `(?!\h)<(?!\h)` I'm also not clear why a space makes a difference. If this is for browser rendering the `<` is still going to throw off the page generation. – chris85 Jul 28 '16 at 14:27
  • @BrandonHorsley `$text = str_replace('<', '&lt;', $text);` is fine, but I want to replace only those `<` that are between dollar signs. – mac Jul 28 '16 at 14:28
  • If you use this one, `(\$.+?)([<>])(.+?\$)`, you can avoid strings like `$` (if you need that) – Giacomo Garabello Jul 28 '16 at 14:28
  • @mac Will these always be one character or no? – Nathan Robb Jul 28 '16 at 14:33
  • @NathanRobb No. Maybe there are a lot of `<` signs, for example $x – mac Jul 28 '16 at 14:38
  • This problem reminds me of this answer: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Brandon Horsley Jul 28 '16 at 14:53
  • How do you exactly define 'between dollar signs'? Because there we dont have different sign for opening and for closing (unlike the parentheses), everything between first and last dollar sign in document will be between. E.g. sentence1$sentence2$sentence3$sentence4$sentence5$sentence6$sentence7. Sentence2-6 is between two dollar signs, but I think, this is not exactly, what you want. – von Oak Jul 28 '16 at 14:57
  • @BrandonHorsley exactly my thoughts. Again trying regex where a parser may be needed. My advice to OP is to maybe fix the code before it's embedded into a huge page. If that's not possible, you got a real problem ... also see the provided link from Brandon ^^ – Jakumi Jul 28 '16 at 15:04
  • @vonOak each equation begins and ends by a dollar sign `$`. – mac Jul 28 '16 at 15:06
  • @mac I think you didn't catch my point. When you have many equations in a document, how do you tell whether the text between two equations is itself an equation? Another example: something1$equation1$something2$equation2$something3. How can you say what is an equation and what isn't? Something2 begins and ends with a dollar sign, but it isn't an equation. – von Oak Jul 28 '16 at 15:08
  • @vonOak I know and thats the reason why I asked it here. – mac Jul 28 '16 at 15:26
  • OK, but we don't have much information about what the whole document you want to parse looks like: is it big or small, how many equations are there, how complex are they, how many `<` signs can appear between `$` signs, how far apart are the equations, and so on. Nevertheless, I tried one regular expression; you can find it in the answers. – von Oak Jul 28 '16 at 15:40
  • @Jakumi Nearly… it's the genre of regex you _should use_ a parser for. Not _needed_. PCRE can really parse most things. [hint: see my answer below] What's more restricted is the use of substitution patterns which are much less flexible… and also preg_replace_callback is not always perfect. Especially problematic are recursive matches here... – bwoebi Jul 28 '16 at 17:21

3 Answers


I edited my previous answer - now try this:

$text = preg_replace('/\$([^$< ]+)<([^$< ]+)\$/','$$1 < $2$', $text);

DEMO
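
To see what this pattern does and does not cover, here is a small sketch that runs it on three sample strings. The sample sentences and the variable names are made up for illustration; only the pattern and the replacement come from the answer. The character class `[^$< ]` excludes `$`, `<` and spaces, so the pattern only fires when there is exactly one `<` and no space inside the dollar signs.

```php
<?php
// Pattern and replacement copied from the answer above.
$pattern     = '/\$([^$< ]+)<([^$< ]+)\$/';
$replacement = '$$1 < $2$';

// One "<" and no spaces inside the dollar signs: rewritten.
$single = preg_replace($pattern, $replacement, 'Let $x<y$ hold.');

// Two "<" signs: no match, so the text comes back unchanged.
$chained = preg_replace($pattern, $replacement, 'Let $x<y<z$ hold.');

// A space inside the formula also prevents a match.
$spaced = preg_replace($pattern, $replacement, 'Let $a b<c$ hold.');

echo $single, "\n";   // Let $x < y$ hold.
echo $chained, "\n";  // Let $x<y<z$ hold.
echo $spaced, "\n";   // Let $a b<c$ hold.
```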

n-dru

This is far too complicated to be seriously done with regex, I think...

As long as you have a fixed number of < between the $ signs, it's easy (See the answer from n-dru).

But here you are:

$output = preg_replace(<<<'REGEX'
(\$\K\s*((?:[^<$\s]+|(?!\s+[<$])\s+)*)\s*(?=(?:<(*ACCEPT)|\$|$)(*SKIP)(*F))
# \$\K => avoid the leading $ in the match
# ((?:[^<$\s]+|(?!\s+[<$])\s+)*) => up to $ or <, excluding surrounding spaces
# (?=(?:<(*ACCEPT)|\$|$)(*SKIP)(*F)) => accept matches with <, reject these without
|(?!^)<\K\s*((?:[^<$\s]+|(?!\s+[<$])\s+)*)\s*(\$|)
# (?!^) => to ensure we are inside $ ... $
# <\K => avoid the leading < in the match
|[^$]+(*SKIP)(*F)
# skip everything outside $ ... $
)x
REGEX
, " $1$2 $3", $your_input);

See also: https://regex101.com/r/fP9aG5/2

I realize you asked for `$x<y<z$` => `$x < y < z$` (instead of `$ x < y < z $`), but that is not doable with plain replacement patterns. It needs preg_replace_callback:

$output = preg_replace_callback(<<<'REGEX'
(\$\K\s*((?:[^<$\s]+|(?!\s+[<$])\s+)*)\s*(?=(?:<(*ACCEPT)|\$|$)(*SKIP)(*F))
|(?!^)<\K\s*((?:[^<$\s]+|(?!\s+[<$])\s+)*)\s*(\$|)
|[^$]+(*SKIP)(*F))x
REGEX
, function($m) {
    if ($m[1] != "") return "$m[1] ";
    if ($m[3] != "") return " $m[2]$m[3]";
    return " $m[2] ";
}, $your_input);

I've tried $your_input with:

random < test
nope $ foo $ bar < a $ qux < biz $fx<hk$
$foo<bar<baz$ foo  buh < bar < baz $
$ foo $ a < z $ a < b < z $

with this preg_replace_callback, I get, as expected:

random < test
nope $ foo $ bar < a $qux < biz$fx<hk$
$foo<bar<baz$foo  buh < bar < baz$
$ foo $ a < z $a < b < z$
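
For comparison, a simpler two-pass variant is sketched below. It is not the regex from this answer: it first matches each `$...$` span, then normalizes the spacing around every `<` inside that span with a second, nested replacement. The function name `spaceLessThanInMath` is hypothetical. The sketch assumes the dollar signs pair up left to right, so a stray `$` in the text shifts the pairing (the ambiguity raised in the comments under the question), and it does not trim spaces next to the dollar signs.

```php
<?php
/**
 * Hypothetical helper: put exactly one space on each side of every "<"
 * inside $...$ spans. Dollar signs are paired left to right.
 */
function spaceLessThanInMath(string $text): string
{
    return preg_replace_callback(
        '/\$([^$]+)\$/',                  // one $...$ span, no "$" inside
        function (array $m): string {
            // Collapse any spaces/tabs around "<" into a single space each side.
            $inner = preg_replace('/[ \t]*<[ \t]*/', ' < ', $m[1]) ?? $m[1];
            return '$' . $inner . '$';
        },
        $text
    ) ?? $text;                           // fall back to the input on a PCRE error
}

echo spaceLessThanInMath('if a<b then $x<y<z$ and $p <q$'), "\n";
// prints: if a<b then $x < y < z$ and $p < q$
```

The `<` outside the dollar signs is left alone, and any number of `<` signs inside one formula is handled, at the cost of relying on well-formed dollar pairs.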

bwoebi
  • Thank you for your answer. It works very well. Some mathematic formulas are between double dollars like $$Formula$$. Is it possible to make code work for double dollars? – mac Jul 28 '16 at 19:07
  • @mac only if you do not use single dollars in between of double dollars and your code is well-formed. In that case you can just duplicate all occurrences of `\$` to `\$\$` … The issue is that regex has no infinite lookbehind. Please, next time you ask a regex question, specify _all possible_ inputs/expected outputs (or matches). Then we can give you an adequately tailored regex solution. – bwoebi Jul 28 '16 at 19:21
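
A minimal sketch of one way to cover both `$...$` and `$$...$$`, under the same conditions stated in the comment above: the input is well-formed and no single dollar appears inside a double-dollar formula. This is a simplified span-matching approach, not the regex from the answer; the function name `spaceLessThanInBothMathModes` is hypothetical. It captures the opening delimiter and uses the backreference `\1` to require the same closing delimiter.

```php
<?php
/**
 * Hypothetical helper: normalize spacing around "<" inside $...$ and
 * $$...$$ spans. Assumes well-formed, non-nested dollar delimiters.
 */
function spaceLessThanInBothMathModes(string $text): string
{
    return preg_replace_callback(
        '/(\$\$?)([^$]+)\1/',             // group 1: "$" or "$$"; \1: same closer
        function (array $m): string {
            $inner = preg_replace('/[ \t]*<[ \t]*/', ' < ', $m[2]) ?? $m[2];
            return $m[1] . $inner . $m[1];
        },
        $text
    ) ?? $text;                           // fall back to the input on a PCRE error
}

echo spaceLessThanInBothMathModes('$$a<b<c$$ and $d<e$ but f<g'), "\n";
// prints: $$a < b < c$$ and $d < e$ but f<g
```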

I tried to put together this regular expression. Try it and see whether it suits your requirements.

$text = preg_replace("/\\$(.{1,20})<(.{1,20})\\$/", "\$$1 < $2\$", $text);

The `{1,20}` quantifier in this expression acts as a parameter: it sets the maximum length (here 20 characters) of the text on the left and on the right side of `<`.

von Oak
  • Thank you. So this works only for equations that have 20 characters before and 20 characters after `<`? – mac Jul 28 '16 at 15:46
  • Yes, exactly. If you want to change it to, e.g., 50, change `{1,20}` to `{1,50}`. In general, `{min,max}` limits how many characters are matched. The quantifier appears twice in the expression: the left one is for the characters before `<`, the right one for the characters after `<`. – von Oak Jul 28 '16 at 15:50
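
The effect of the `{1,20}` bound can be checked with a short sketch. The sample strings and variable names are made up for illustration; the pattern and replacement are the ones from von Oak's answer. Two things show up: a side longer than 20 characters prevents the match, and because `.` also matches `<`, a chained formula only gets its last `<` spaced.

```php
<?php
// Pattern and replacement from von Oak's answer. In a double-quoted PHP
// string, "\\$" yields the regex \$ (a literal dollar sign) and "\$" in the
// replacement yields a plain "$".
$pattern     = "/\\$(.{1,20})<(.{1,20})\\$/";
$replacement = "\$$1 < $2\$";

// Short formula: both sides fit within 20 characters, so it is rewritten.
$short = preg_replace($pattern, $replacement, 'Let $x<y$ hold.');

// 21 characters before "<": exceeds the bound, so nothing is replaced.
$long = preg_replace($pattern, $replacement, '$' . str_repeat('a', 21) . '<y$');

// Chained comparison: the greedy left group swallows "x<y",
// so only the last "<" gets spaces.
$chained = preg_replace($pattern, $replacement, '$x<y<z$');

echo $short, "\n";    // Let $x < y$ hold.
echo $long, "\n";     // $aaaaaaaaaaaaaaaaaaaaa<y$
echo $chained, "\n";  // $x<y < z$
```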