I would like to verify the syntax of an input field with a regex. The field should accept text like the following examples:

Something=Item1,Item2,Item3
someOtherThing=Some_Item

The input must consist of a word, an = sign, and a list of comma-separated words. The list must contain at least one entry, so abc= should be invalid, but abc=123 is valid.

I am using a framework which allows a regular expression (Java) to mark the input field as valid or invalid. How can I express this rule in a regex?

With the aid of https://stackoverflow.com/a/65244969/7821336, I am able to validate the comma-separated list. But as soon as I prepend the assignment part, the regex no longer works:

(\w+)=((?:\w+)+),?   // does not work!
mre
  • I have to process the input text anyway with more complex logic in the backend, so I don't care whether there are empty values or not. I prefer a regex that is cleaner and easier to read and understand. So feel free to allow or deny empty values based on "easy first". – mre May 30 '22 at 14:09
  • Like `^(\w+)=(\w+)(,\w+)*$` ? – mre May 30 '22 at 14:12
  • You are right. I had a different approach in mind and wanted to use the same regex in the backend. But while asking my question, I realized that it is much easier to cut the string at the `=` sign, then at the `,` signs in the backend. So there is really no need for capture groups any more. – mre May 30 '22 at 14:18
  • Must the first part be only letters, or can digits be used too? I.e., is `123=abc` valid? – Bohemian May 30 '22 at 15:29
  • First part must be `\w`. So `123=abc` is valid. – mre May 31 '22 at 14:10

3 Answers


You are not repeating the comma inside the group. The group (?:\w+)+ only repeats word characters and the optional comma follows it once, so the pattern cannot match more than one comma-separated value.
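To make the difference concrete, here is a minimal Java sketch that compares the pattern from the question with one where the comma is repeated together with the word. The class name RepeatCommaDemo, the helper methods, and the FIXED pattern are assumptions chosen for illustration, not something the framework prescribes.

```java
import java.util.regex.Pattern;

public class RepeatCommaDemo {
    // The pattern from the question: the optional comma sits outside the repeated group.
    static final Pattern ORIGINAL = Pattern.compile("(\\w+)=((?:\\w+)+),?");
    // One possible fix (an assumption for this sketch): repeat ",word" as a unit.
    static final Pattern FIXED = Pattern.compile("(\\w+)=(\\w+(?:,\\w+)*)");

    // matches() requires the whole input to match, like a typical field validator.
    static boolean original(String s) { return ORIGINAL.matcher(s).matches(); }
    static boolean fixed(String s) { return FIXED.matcher(s).matches(); }

    public static void main(String[] args) {
        String[] inputs = {"abc=123", "Something=Item1,Item2,Item3", "abc=", "abc=123,"};
        for (String s : inputs) {
            System.out.println(s + " original=" + original(s) + " fixed=" + fixed(s));
        }
    }
}
```

With whole-input matching, the original pattern accepts a single item (and a trailing comma) but rejects a list of several items, while the adjusted pattern accepts the list and rejects the trailing comma.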

If you want to get separate matches for the key and the values, you can use the \G anchor.

(?:^(\w+)=|\G(?!^))(\w+)(?:,|$)

Explanation

  • (?: Non capture group
    • ^(\w+)= Assert start of string and capture 1+ word chars in group 1
    • | Or
    • \G(?!^) Assert the position at the end of the previous match, but not at the start of the string
  • ) Close non capture group
  • (\w+) Capture group 2, match 1+ word characters
  • (?:,|$) Match either , or assert end of string


For example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String regex = "(?:^(\\w+)=|\\G(?!^))(\\w+)(?:,|$)";
        Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        String[] strings = {"Something=Item1,Item2,Item3", "someOtherThing=Some_Item", "Something="};

        for (String s : strings) {
            Matcher matcher = pattern.matcher(s);

            // Each find() yields one value; group 1 is only set on the first match.
            while (matcher.find()) {
                String gr1 = matcher.group(1);
                String gr2 = matcher.group(2);

                if (gr1 != null) {
                    System.out.println("Group 1: " + gr1);
                }
                if (gr2 != null) {
                    System.out.println("Group 2: " + gr2);
                }
            }
        }
    }
}

Output

Group 1: Something
Group 2: Item1
Group 2: Item2
Group 2: Item3
Group 1: someOtherThing
Group 2: Some_Item
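
Note that a find() loop extracts pieces but does not validate the whole input. For an input like a=b,,c it would report a and b and then simply stop. If both validation and extraction are needed, one option is to check the whole string first and only then run the \G pattern. The following is a rough sketch under that assumption; the class name ValidateThenExtract, the values helper, and the VALID pattern are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ValidateThenExtract {
    // Whole-input check; this pattern is an assumption chosen for the sketch.
    static final Pattern VALID = Pattern.compile("\\w+=\\w+(?:,\\w+)*");
    // The \G pattern from the answer, used only to pull out the individual values.
    static final Pattern PARTS = Pattern.compile("(?:^(\\w+)=|\\G(?!^))(\\w+)(?:,|$)");

    // Returns the values after '=', or an empty list if the input is invalid.
    static List<String> values(String input) {
        List<String> result = new ArrayList<>();
        if (!VALID.matcher(input).matches()) {
            return result;
        }
        Matcher m = PARTS.matcher(input);
        while (m.find()) {
            result.add(m.group(2)); // group 2 holds one list entry per match
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(values("Something=Item1,Item2,Item3"));
        System.out.println(values("a=b,,c"));
    }
}
```

Returning an empty list for invalid input is just one design choice; throwing an exception or returning an Optional would work equally well.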
The fourth bird

This code does the validation without using any regex:

import java.util.Arrays;
import java.util.List;

public class MyClass {

    public static void main(String[] args) {
        String something1 = "Something=Item1,Item2,Item3";
        String something2 = "Something=";
        String something3 = "Something";
        String something4 = "=Item1,Item2,Item3";

        System.out.println(isValid(something1)); // true
        System.out.println(isValid(something2)); // false
        System.out.println(isValid(something3)); // false
        System.out.println(isValid(something4)); // false
    }

    public static boolean isValid(String string) {
        // Split at the = sign; the -1 limit keeps trailing empty strings,
        // so "Something=" yields two parts instead of one.
        String[] partsOfString = string.split("=", -1);

        // There must be exactly one = sign.
        if (partsOfString.length != 2) return false;

        // The text before the = sign must not be empty.
        if (partsOfString[0].trim().isEmpty()) return false;

        // Split the list at the commas, keeping empty entries so they can be rejected.
        String[] items = partsOfString[1].split(",", -1);
        for (String item : items) {
            if (item.trim().isEmpty()) return false;
        }

        // Now the items are available as a list; you can do whatever you want with it.
        List<String> itemsList = Arrays.asList(items);

        return true;
    }
}

These are the checks done in this code:

  1. It checks that the text before the = sign is not empty.
  2. It checks that the input contains the = sign.
  3. It checks that the list of items is not empty.
  4. It also gives us the items as a list.
Sambhav Khandelwal

Try this regex:

\w+=\w+(,\w+)*

which is used like this in Java:

if (input.matches("\\w+=\\w+(,\\w+)*")) {
    // input is OK
}

If the first part should not contain digits, use this instead:

[a-zA-Z_]+=\w+(,\w+)*

Or if just the first character should not be a digit (i.e. it should look like a typical Java variable name), use this:

[a-zA-Z_]\w*=\w+(,\w+)*
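
To compare the three variants side by side, here is a small sketch that runs each of them against a few sample inputs. The class name KeyVariantsDemo, the constant names, and the sample inputs are assumptions picked for illustration.

```java
public class KeyVariantsDemo {
    // The three patterns from this answer, as Java string literals.
    static final String ANY_WORD = "\\w+=\\w+(,\\w+)*";
    static final String NO_DIGITS = "[a-zA-Z_]+=\\w+(,\\w+)*";
    static final String NO_LEADING_DIGIT = "[a-zA-Z_]\\w*=\\w+(,\\w+)*";

    public static void main(String[] args) {
        String[] inputs = {"123=abc", "a1=abc", "abc=x,y", "abc="};
        for (String s : inputs) {
            // String.matches() tests the whole input, so no ^ or $ anchors are needed.
            System.out.printf("%-8s %-5b %-5b %-5b%n",
                    s, s.matches(ANY_WORD), s.matches(NO_DIGITS), s.matches(NO_LEADING_DIGIT));
        }
    }
}
```

Here 123=abc is accepted only by the first pattern, a1=abc by the first and third, abc=x,y by all three, and abc= by none.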
Bohemian