0

I'm trying to make a conditional regex. I know that there are other posts on Stack Overflow, but they are too specific to their own problems.


The Question

How can I create a regular expression that only looks to match something given a certain condition?


An example

An example of this would be if we had a string (this is in Java):

String nums = "42 36 23827";

and we only want to match if there are the same number of x's at the end of the string as there are at the beginning.

What we want in this example

In this example, we would want a regex that checks if there are the same number of x's at the end as there are at the beginning. The conditional part: if there are x's at the beginning, then check whether there are that many at the end; if there are, then it is a match.


Another example

Another example of this would be if we had a list of numbers (this is in Java) in string format:

String nums = "42 36 23827";

and we want to separate each number into a list

String splitSpace = "Regex goes here";
Pattern splitSpaceRegex = Pattern.compile(splitSpace);
Matcher splitSpaceMatcher = splitSpaceRegex.matcher(nums); // match against the nums string defined above
ArrayList<String> splitEquation = new ArrayList<String>();

while (splitSpaceMatcher.find()) {
    if (splitSpaceMatcher.group().length() != 0) {
        System.out.println(splitSpaceMatcher.group().trim());
        splitEquation.add(splitSpaceMatcher.group().trim());
    }
}

How can I make this into an array that looks like this:

["42", "36", "23827"]

You could try making a simple regex like this:

String splitSpace = "\\d+\\s+";

But that excludes the "23827" because there is no space after it.

What we want in this example

In this example, we would want a regex that checks if it is the end of the string; if it is, then we don't need the space, otherwise we do. As @YCF_L mentioned, we could just make a regex that is \\b\\d+\\b, but I am aiming for something conditional.
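To make the difference concrete, here is a minimal sketch (the class name SplitNumbersSketch and the helper findTrimmed are hypothetical, not from the post). It runs the \\d+\\s+ attempt and then an alternation, \\d+(?:\\s+|$), which is one plain-regex way to say "whitespace, or else end of string". Treating that alternation as the intended condition is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitNumbersSketch {
    // Hypothetical helper: collects every non-empty match of the regex, trimmed.
    static List<String> findTrimmed(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            String token = matcher.group().trim();
            if (!token.isEmpty()) {
                found.add(token);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String nums = "42 36 23827";
        // Requires trailing whitespace, so the last number is skipped.
        System.out.println(findTrimmed("\\d+\\s+", nums));       // [42, 36]
        // Accepts either trailing whitespace or the end of the input.
        System.out.println(findTrimmed("\\d+(?:\\s+|$)", nums)); // [42, 36, 23827]
    }
}
```

The alternation is not a true conditional construct; it only lists the two acceptable ways for a number to end.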


Conclusion

So, as a result, the question is, how do we make conditional regular expressions? Thanks for reading and cheers!

BeastCoder

3 Answers

2

I would use split, which accepts a regex, like so:

String[] split = nums.split("\\s+"); // ["42", "36", "23827"]

If you want to use Pattern with Matcher, then you can use the pattern \b\d+\b, with word boundaries.

String regex = "\\b\\d+\\b";

By using word boundaries, you avoid cases where the number is part of a word. For example, for "123 a4 5678 9b" you will get just ["123", "5678"].
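As a rough illustration of that word-boundary behaviour, the following sketch (the class name WordBoundarySketch and the method wholeNumbers are made up for this example) runs \b\d+\b over the sample string above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundarySketch {
    // Hypothetical helper: returns every run of digits that stands alone as a word.
    static List<String> wholeNumbers(String input) {
        Matcher matcher = Pattern.compile("\\b\\d+\\b").matcher(input);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // "a4" and "9b" are skipped: there is no word boundary between a letter and a digit.
        System.out.println(wholeNumbers("123 a4 5678 9b")); // [123, 5678]
    }
}
```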

Youcef LAIDANI
  • You might want to add one more sentence on what boundaries are. – BeastCoder Jan 18 '20 at 23:33
  • @BeastCoder check this https://stackoverflow.com/questions/1324676/what-is-a-word-boundary-in-regexes and https://www.regular-expressions.info/wordboundaries.html for word boundary – Youcef LAIDANI Jan 18 '20 at 23:35
  • I have added one more example. Thanks for your super-useful response – BeastCoder Jan 18 '20 at 23:43
  • @BeastCoder I don't know what you don't understand in my question, check this demo here https://ideone.com/QYVl65, it gives you the same output you expect – Youcef LAIDANI Jan 18 '20 at 23:49
  • No, I understand it all. I just added one more example because Turing said that I needed to demonstrate a more conditional example. Also, I don’t know who downvoted; I’ll upvote it though. – BeastCoder Jan 18 '20 at 23:52
  • Thank you @Turing85. I think my answer is short and yours is the complete version; even so, I don't know why people downvote without giving a reason – Youcef LAIDANI Jan 18 '20 at 23:54
2

I do not see the "conditional" in the question. The problem is solvable with a straightforward regular expression: \b\d+\b.

regex101 demo

A fully fledged Java example would look something like this:

import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Ideone {
    public static void main(String[] args) {
        final String sample = "123 45 678 90";
        // \b\d+\b matches each run of digits delimited by word boundaries
        final Pattern pattern = Pattern.compile("\\b\\d+\\b");
        final Matcher matcher = pattern.matcher(sample);
        final ArrayList<String> results = new ArrayList<>();
        // collect every match in order of appearance
        while (matcher.find()) {
            results.add(matcher.group());
        }
        System.out.println(results);
    }
}

Output: [123, 45, 678, 90]

Ideone demo

Turing85
  • Do you think I should change the example? I mean I want to figure out conditionals. Your answer is however helpful, but maybe you could include a solution that checks if it is the end of the String and also include this answer. – BeastCoder Jan 18 '20 at 23:28
  • I do not fully get your point. For end of String, use `$`. For start of String, use `^`. Please try and construct an example with a "conditional". – Turing85 Jan 18 '20 at 23:30
  • I will change the example, but what I mean is doing something like this: check whether we are at the end of the string; if so, no space is needed, otherwise a space is required. – BeastCoder Jan 18 '20 at 23:32
  • No! Please do not change the question. This would invalidate the answers. There are optional groups, e. g. [this regex](https://regex101.com/r/ESgxR1/3/) only allows `@` as last character of a `String`. – Turing85 Jan 18 '20 at 23:34
  • I am going to add another example – BeastCoder Jan 18 '20 at 23:40
  • Please, before you keep editing the question, try to come up with an example that represents your use case. And then open a new post. – Turing85 Jan 18 '20 at 23:41
  • Would you say that this is okay now? Your answer is still correct, there is just one more example – BeastCoder Jan 18 '20 at 23:43
  • Yes, but it is still nothing "conditional". I would suggest reading a tutorial on regular expressions, e.g. [this one](https://www.regular-expressions.info/tutorial.html). Regexes work a little differently from "ordinary programming". Maybe the question will resolve itself through this. – Turing85 Jan 18 '20 at 23:53
  • @YCF-L alright, thanks. I might change the title of my question later, thank you. – BeastCoder Jan 18 '20 at 23:54
2

There are no conditionals in Java regexes.

I want a regex that checks if there are the same number of x's at the end as there are at the beginning. The conditional part: if there are x's at the beginning, then check whether there are that many at the end; if there are, then it is a match.

This may or may not be solvable. If you want to know if a specific string (or pattern) repeats, that can be done using a back reference; e.g.

   ^(\d+).+\1$

will match a line consisting of one or more digits, followed by one or more arbitrary characters, followed by the same digits that group 1 captured at the start. The back reference \1 matches the string matched by group 1.
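A small sketch of that back reference in Java, assuming a helper named startsAndEndsWithSameDigits in a class BackReferenceSketch (both names are invented for illustration):

```java
import java.util.regex.Pattern;

class BackReferenceSketch {
    // Group 1 captures leading digits; \1 requires the same text again at the end.
    private static final Pattern SAME_DIGITS = Pattern.compile("^(\\d+).+\\1$");

    // Hypothetical helper: true if the line starts and ends with the same digit string.
    static boolean startsAndEndsWithSameDigits(String line) {
        return SAME_DIGITS.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(startsAndEndsWithSameDigits("12abc12")); // true
        System.out.println(startsAndEndsWithSameDigits("12abc34")); // false
    }
}
```

Because the engine backtracks, group 1 may end up capturing only a prefix of the leading digit run if that is what makes the whole pattern match.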

However, if you want the same number of digits at the end as at the start (and that number isn't a constant), then you cannot implement this using a single (Java) regex.
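One possible workaround, sketched here under the assumption that doing part of the check in ordinary Java code is acceptable: let a regex capture the leading and trailing digit runs, then compare their lengths outside the regex. The class DigitCountSketch and the method sameDigitCountAtBothEnds are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitCountSketch {
    // Group 1: leading digits (greedy). Group 2: trailing digits.
    // The lazy .*? in between stops as soon as the rest of the input is all digits.
    private static final Pattern ENDS = Pattern.compile("(\\d*).*?(\\d*)");

    // Hypothetical helper: compares the lengths of the two captured digit runs.
    static boolean sameDigitCountAtBothEnds(String line) {
        Matcher matcher = ENDS.matcher(line);
        if (!matcher.matches()) {
            return false; // e.g. input containing line terminators, which '.' does not match
        }
        return matcher.group(1).length() == matcher.group(2).length();
    }

    public static void main(String[] args) {
        System.out.println(sameDigitCountAtBothEnds("12abc34"));  // true
        System.out.println(sameDigitCountAtBothEnds("123abc45")); // false
    }
}
```

In this sketch a string with no digits at either end counts as a match (zero equals zero), and a string made only of digits does not, because the leading group consumes all of them. Whether those edge cases are desirable depends on the actual use case.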

Note that some regex languages / engines do support conditionals; see the Wikipedia Comparison of regular-expression engines page.

Stephen C
  • Thank you for your answer; it is nice to know that it isn’t possible to do in Java. I’m sure it will help everyone who views this question. – BeastCoder Jan 19 '20 at 00:09