
I searched for an answer but couldn't find a clear one. Please bear with me: I'm new to regex, and this is my first question. I'm using Python 3, but I will also need this for JavaScript.

What I'm trying to do is validate user input. The input is an inequality (spaces removed), and the variables are named by the user and given beforehand. For example, let's say I have this inequality:

x+y+6p<=z+1

The variables x, y, p, z will be given. The problem arises when the inequality looks like this:

xp+yp+6p<=z+1

The given variables are xp, yp, p, and z. I'm trying to write a regular expression to match any inequality with such a format, given no spaces in the inequality. I cannot figure out how to check for alternative strings. For example I wrote the following expression:

^([\+\-]?[0-9]*([xpypz]|[0-9]+))+[<>]=([\+\-]?[0-9]*([xpypz]|[0-9]+))+$

I know this is completely wrong and that's not how the parentheses are used, but I don't have a feasible expression and I wanted to show you what I want to achieve. Now I need to know three things (at least, I hope) to fix it:

  1. How do I match xp and yp literally, instead of any single character from the set xypz?
  2. How do I make the 0-9 after xpypz work as [0-9]+, meaning that a number can occur in place of a variable?
  3. How can I make the whole group repeat?

I'm trying to write this expression to check whether the user is using undeclared variables. I believe this can be done differently without regex, but it would be nice to do it in a single line. Can you please help me figure out those three points? Thanks.

Abzollo
  • Can we assume that numerals always come before the variable or can they be on either side? So is `p6<5` valid as well as `6p<5` or is only the latter valid syntax for your needs? – Alejandro Mar 04 '15 at 02:46
  • Also, do you need this one regex to check for correct syntax, like no two +'s in a row, or just to check that the variables used are the given ones? – Alejandro Mar 04 '15 at 02:52
  • I'd strongly recommend not doing this with regular expressions! This question should help: http://stackoverflow.com/questions/594266/equation-parsing-in-python – Greg Ball Mar 04 '15 at 02:55
  • Yes Alejandro, they can be on either side, so p6 and 6p6 are both valid. I agree with @Greg: it is not going to be efficient. I thought there was a simple way to do this with regex. I'll try parsing. Thank you both for the advice. – Abzollo Mar 04 '15 at 15:50

2 Answers


Try this pattern:

(^(?=.)(?:(?:[+-]?\d*(?:xp|yp|p|z)*)+)[<>]=(?=.)(?:(?:[+-]?\d*(?:xp|yp|p|z)*)+)$)  
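Since the variable names are supplied by the user, the alternation in the pattern above would have to be generated rather than typed by hand. A minimal Python sketch of that idea follows; `build_pattern` is a hypothetical helper name, and the pattern body is the one above with the variable list substituted in. Treat it as an illustration, not a hardened validator: the nested quantifiers can backtrack heavily on long runs of digits.

```python
import re

def build_pattern(variables):
    """Compile the pattern above for a user-supplied list of variable names."""
    # Try longer names first so a long name is attempted before any
    # shorter name it happens to start with.
    names = sorted(variables, key=len, reverse=True)
    # re.escape guards against names containing regex metacharacters.
    alternation = "|".join(re.escape(name) for name in names)
    side = rf"(?:[+-]?\d*(?:{alternation})*)+"
    return re.compile(rf"^(?=.){side}[<>]=(?=.){side}$")

pattern = build_pattern(["xp", "yp", "p", "z"])
print(bool(pattern.match("xp+yp+6p<=z+1")))  # True
print(bool(pattern.match("xp+yp+6x<=z+1")))  # False: x is not declared
```

The same pattern string should also work in JavaScript, since it only uses lookaheads, non-capturing groups, and `\d`.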


alpha bravo
[0-9]*(xp|yp|p|z)*([+-][0-9]*(xp|yp|p|z)*)*(<|>|<=|>=)[0-9]*(xp|yp|p|z)*([+-][0-9]*(xp|yp|p|z)*)*

This is ugly and won't catch mistakes like 1++x<p, nor does it allow for functions like sin or for exponents. Matched against the whole string, it accepts xp+yp+6p<=z+1 but rejects xp+yp+6x<=z+1 when xp, yp, p, and z are the given variables.
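As a quick check of the behaviour described here, the pattern can be tried in Python. This is only a sketch: `is_valid` is a made-up helper name, and it relies on `re.fullmatch` because the pattern has no `^` or `$` anchors of its own.

```python
import re

# The pattern from this answer, copied verbatim.
PATTERN = re.compile(
    r"[0-9]*(xp|yp|p|z)*([+-][0-9]*(xp|yp|p|z)*)*(<|>|<=|>=)"
    r"[0-9]*(xp|yp|p|z)*([+-][0-9]*(xp|yp|p|z)*)*"
)

def is_valid(text):
    # fullmatch requires the whole input to match, not just a piece of it.
    return PATTERN.fullmatch(text) is not None

print(is_valid("xp+yp+6p<=z+1"))  # True
print(is_valid("xp+yp+6x<=z+1"))  # False: x is not declared
print(is_valid("1++p<z"))         # True: the doubled + slips through

# Without fullmatch, search() finds a match inside the bad input.
print(PATTERN.search("xp+yp+6x<=z+1") is not None)  # True
```

The last line is the reason for `fullmatch`: an unanchored search is satisfied by the bare comparison operator, so it cannot reject anything that contains one.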

As Greg Ball mentioned, though, the best thing would be to use parsing if possible. Then you could catch more syntax errors besides the use of wrong variables, and you could do so more reliably.
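To give a rough idea of what the parsing route could look like, here is a small tokenizer sketch in Python. Everything in it is an assumption made for illustration: the name `find_problems`, the restriction to `<=` and `>=` (as in the question's own pattern), and the greedy longest-name-first matching, which can misread inputs when declared names overlap in awkward ways.

```python
import re

def find_problems(text, variables):
    """Return a list of problems found in an inequality such as 'xp+6p<=z+1'."""
    names = sorted(variables, key=len, reverse=True)  # longest names first
    var_alt = "|".join(re.escape(name) for name in names)
    token_re = re.compile(
        rf"(?P<num>\d+)|(?P<cmp><=|>=)|(?P<op>[+-])|(?P<var>{var_alt})"
    )
    problems, kinds, pos = [], [], 0
    while pos < len(text):
        match = token_re.match(text, pos)
        # An empty match (possible with no declared names) counts as a failure.
        if match is None or match.end() == pos:
            problems.append(f"unexpected text at position {pos}: {text[pos:]!r}")
            return problems
        kinds.append(match.lastgroup)  # "num", "cmp", "op" or "var"
        pos = match.end()
    if kinds.count("cmp") != 1:
        problems.append("expected exactly one <= or >=")
    elif kinds[0] == "cmp" or kinds[-1] == "cmp":
        problems.append("one side of the inequality is empty")
    # Pair every token with its successor; the last one is paired with "end".
    for prev, cur in zip(kinds, kinds[1:] + ["end"]):
        if prev == "op" and cur in ("op", "cmp", "end"):
            problems.append("operator without an operand after it")
    return problems

declared = ["xp", "yp", "p", "z"]
print(find_problems("xp+yp+6p<=z+1", declared))  # []
print(find_problems("xp+yp+6x<=z+1", declared))  # reports the stray x
print(find_problems("1++p<=z", declared))        # reports the doubled +
```

Compared with a single regex, this reports where the input goes wrong, and new checks can be added without making the pattern harder to read.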

Alejandro