
I am wondering whether there is a way to write a character class that matches nothing. Could anybody let me know whether there is a way? Thanks.

$ grep '[]' <<< a
grep: Unmatched [, [^, [:, [., or [=
user1424739
  • Can you clarify why you need an empty character class? – anubhava Jun 17 '20 at 05:33
  • What would be the point of a character class that matches nothing? Why even try to match at all if the aim is not to match? – Grismar Jun 17 '20 at 05:36
  • Is this useful for you: https://stackoverflow.com/questions/62416394/how-to-match-nothing? – Chris Ruehlemann Jun 17 '20 at 06:23
  • I had posted an answer (which I have since deleted) suggesting that `[^\s\S]` or `[^\d\D]` could be used. In most regex flavours, `[^\s\S]` means "match any character that is neither a whitespace character nor a non-whitespace character", and `[^\d\D]` means "match any character that is neither a digit nor a non-digit". What I didn't understand, however, until @rici set me straight, is that grep treats backslashes as ordinary characters in character classes. Therefore, `[^\s\S]` matches every character other than `'\'`, `'s'` and `'S'`. – Cary Swoveland Jun 18 '20 at 04:18

2 Answers


It is possible to do this in Java. Java's Pattern class allows you to create a character class that is the intersection of two other character classes. So, if I create two character classes with no common characters and I take their intersection, then I have created a character class that effectively matches nothing. Consider the following code example.

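import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptyClassDemo { // wrapper class name is illustrative
  public static void main(String[] args) {
    String input = "abcdefghijklmnopqrstuvwxyz";
    // [a-c] intersected with [d-f] is empty, so find() never succeeds.
    Pattern unPattern = Pattern.compile("[a-c&&[d-f]]");
    Matcher unMatcher = unPattern.matcher(input);
    System.out.println("Starting matching...");
    while (unMatcher.find()) {
      System.out.println("Matched " + unMatcher.group());
    }
    System.out.println("Ending matching.");
  }
}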

In the above example, I have one character class matching 'a', 'b', and 'c'. I have a second character class matching 'd', 'e', and 'f'. I intersect them using the && operator. Since there are no common characters, this regex will not match anything. That being said, I have no idea what use this might have. But it is possible.

entpnerd

Posix regexes don't offer that possibility because a ] is taken to be a literal ] if it appears immediately after the [ or [^ which starts the class. (The same is true for -.)
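
To see the rule in action, here is a minimal sketch (the sample lines and variable names are made up for illustration). It counts the matching lines for `[]a]` and for its negation `[^]a]`, and shows that the leading `]` is read as a member of the class rather than as the end of an empty class:

```shell
# ']' directly after '[' is a literal, so '[]a]' is the class {']', 'a'}.
literal_count=$(printf '%s\n' ']' 'a' 'b' | grep -c '[]a]')

# '[^]a]' is the negated class: any character other than ']' and 'a'.
negated_count=$(printf '%s\n' ']' 'a' 'b' | grep -c '[^]a]')

echo "lines matching []a]:  $literal_count"   # prints 2 (the ']' and 'a' lines)
echo "lines matching [^]a]: $negated_count"   # prints 1 (only the 'b' line)
```

This is also why `[]` on its own is rejected: the `]` is consumed as a literal, and the bracket expression is never closed.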

Note that in a Posix regex, \ does not have any special significance inside a character class, so grep -E '[\s]' matches either a backslash or a lower-case s, and nothing else. (That's not very relevant to your question but it is relevant to some other answers.)

GNU grep does implement some extensions to Posix regex, including recognising some non-standard backslash escape sequences outside of character classes. (With an emphasis on some. It doesn't recognise \d, for example, which sometimes comes as a surprise.) But it's basically a Posix implementation, so while grep -E '\s' does match any line including a whitespace character, grep -E '[\s]' matches any line with either a \ or an s.
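
A small sketch of the difference, using three made-up sample lines. It deliberately uses the standard `[[:space:]]` class rather than `\s` for the whitespace case, so that it does not depend on the GNU extension described above; the variable names are only for illustration:

```shell
# Three sample lines: one containing a space, one a backslash, one neither.
sample=$(printf '%s\n' 'a b' 'a\b' 'xyz')

# Inside brackets the backslash is literal: matches lines with '\' or 's'.
bracket_hits=$(printf '%s\n' "$sample" | grep -E '[\s]')

# Portable way to put "whitespace" in a class: the POSIX [:space:] class.
space_hits=$(printf '%s\n' "$sample" | grep -E '[[:space:]]')

printf 'bracket: %s\n' "$bracket_hits"   # prints: bracket: a\b
printf 'space:   %s\n' "$space_hits"     # prints: space:   a b
```

Note that `[\s]` does not select the line with the space at all; it only picks up the line containing a backslash.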

rici