
The structure I want the regex to validate looks like this:

keyword input(param_input,param_input2)

The "keyword" section needs to be built into the regex so it's not considered an input.

The input, param_input and param_input2 tokens can contain letters (including capitals), digits and _, but none of them can consist entirely of digits.

What would be the best approach for writing a regex to validate such a string?

Attempt:

function+\s+[a-zA-Z0-9]+\((?!.*[,]{2})(?!.*[,0-9]{2})(?!^$,{1})[a-zA-Z0-9,]*\)
Ryszard Czech
Programing
  • Can you please share your attempt? – hitesh bedre Sep 07 '21 at 16:04
  • A warning: while a regex for simple comma-separated tokens is easy and can work well, things get much more complicated if you also want to allow strings that include a comma, like in `keyword myinput(myparam1, "string1,string2")`. In that case, a regex is not the ideal solution. – cornuz Sep 07 '21 at 16:10
  • @cornuz I don't at the moment want to allow " in the inputs – Programing Sep 07 '21 at 20:09
  • @hiteshbedre function+\s+[a-zA-Z0-9]+\((?!.*[,]{2})(?!.*[,0-9]{2})(?!^$,{1})[a-zA-Z0-9,]*\) – Programing Sep 07 '21 at 20:10
  • Assuming you're matching code, most languages don't allow identifiers to start with a digit, so the terms would be either all digits `\d+` or `(?=\D)\w+` – Bohemian Sep 08 '21 at 05:22

1 Answer

You can use negative lookahead assertions to make sure that the tokens you are capturing aren't purely digits. Try this:

^[\w]+ (?!\d+\()([\w]+)\((?:(?!\d+[,\)])(\w+)),(?:(?!\d+[,\)])(\w+))\)$

keyword input(param_input,param_input2)  -> Will match
keyword 12345(param_input,param_input2)  -> No match
keyword in123(abc123_DEF4,a1_b2_C3_d45)  -> Will match
keyword input(12345678901,param_input2)  -> No match
keyword input(param_input,123456789012)  -> No match
keyword input(12345678901,123456789012)  -> No match
keyword XYZ98(aaaa1111BB2,CCCCCCC33333)  -> Will match
keyword XYZ98(aaaa1111BB2,CCCCCCC33333,DD44)  -> No match
Where:

  • ^[\w]+ - Match the leading word, e.g. keyword. [\w]+ is equivalent to \w+; to require one specific keyword, replace it with the literal word, e.g. ^function
  • (?!\d+\()([\w]+)\( - Using a negative lookahead assertion, match the next part up to ( only if it isn't purely digits, e.g. input( but not 12345(
  • (?:(?!\d+[,\)])(\w+)), - Again using a negative lookahead assertion, match the next part up to , only if it isn't purely digits, e.g. param_input, but not 12345678901,
  • (?:(?!\d+[,\)])(\w+))\)$ - Again using a negative lookahead assertion, match the next part up to the closing ) only if it isn't purely digits, e.g. param_input2) but not 123456789012)
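
To check the pattern against the sample lines, here is a minimal Python sketch. Python's re module is my choice for illustration (the question does not name a language), and the names PATTERN, SAMPLES and matches are hypothetical.

```python
import re

# Pattern copied verbatim from the answer; it accepts exactly two parameters.
PATTERN = re.compile(
    r"^[\w]+ (?!\d+\()([\w]+)\((?:(?!\d+[,\)])(\w+)),(?:(?!\d+[,\)])(\w+))\)$"
)

# Sample lines taken from the answer.
SAMPLES = [
    "keyword input(param_input,param_input2)",
    "keyword 12345(param_input,param_input2)",
    "keyword in123(abc123_DEF4,a1_b2_C3_d45)",
    "keyword input(12345678901,param_input2)",
    "keyword input(param_input,123456789012)",
    "keyword input(12345678901,123456789012)",
    "keyword XYZ98(aaaa1111BB2,CCCCCCC33333)",
    "keyword XYZ98(aaaa1111BB2,CCCCCCC33333,DD44)",
]


def matches(line):
    """Return True if the line is accepted by PATTERN."""
    return PATTERN.match(line) is not None


if __name__ == "__main__":
    for line in SAMPLES:
        verdict = "match" if matches(line) else "no match"
        print(f"{line}  -> {verdict}")
```

On a match, groups 1 to 3 hold the input name and the two parameters, so PATTERN.match(line).groups() can be used to read them back.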

This pattern is fixed to exactly two input parameters, e.g. (param_input,param_input2). Note that if you wish to accept a variable number of input parameters, e.g. (param1,param2,param3,...,paramN), you can't easily capture each one in its own regex group, because in most regex engines a repeated group keeps only its last match, as explained in this answer from another thread.

  • What you can do instead is repeat (?:(?!\d+[,\)])(\w+)), manually, once for every additional parameter.
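
If the goal is only to validate the string and then read the parameters, another option is to validate the whole list with a repeated non-capturing group and split it afterwards. The sketch below is an illustration under stated assumptions, not the approach from the linked thread: it uses Python, requires at least one parameter, allows any whitespace between the keyword and the input name, and the names build_pattern and parse_call are hypothetical.

```python
import re

# Same lookahead idea as in the answer: a token may not be purely digits.
NAME = r"(?!\d+\()\w+"      # input name, followed by "("
PARAM = r"(?!\d+[,)])\w+"   # one parameter, followed by "," or ")"


def build_pattern(keyword):
    """Build a pattern for '<keyword> name(p1,p2,...,pN)' with N >= 1 (assumption)."""
    return re.compile(
        rf"{re.escape(keyword)}\s+(?P<name>{NAME})"
        rf"\((?P<params>{PARAM}(?:,{PARAM})*)\)"
    )


def parse_call(line, keyword="keyword"):
    """Return (name, [params]) if the line is valid, otherwise None."""
    m = build_pattern(keyword).fullmatch(line)
    if m is None:
        return None
    # The regex has already validated every token, so a plain split is safe.
    return m.group("name"), m.group("params").split(",")


if __name__ == "__main__":
    print(parse_call("keyword XYZ98(aaaa1111BB2,CCCCCCC33333,DD44)"))
    print(parse_call("keyword input(param_input,123456789012)"))
```

Passing the keyword through re.escape builds it into the pattern as a literal, so it is never treated as an input, which is what the question asks for.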