77

There are lots of posts about regexes that match a potentially empty string, but I couldn't readily find any that provide a regex which matches only the empty string.

I know that ^ will match the beginning of any line and $ will match the end of any line as well as the end of the string. As such, /^$/ matches far more than the empty string, for example "\n", "foobar\n\n", etc.

I would have thought, though, that /\A\Z/ would match just the empty string, since \A matches the beginning of the string and \Z matches the end of the string. However, my testing shows that /\A\Z/ will also match "\n". Why is that?

dreftymac
Peter Alfvin
  • There are many [SO posts](http://stackoverflow.com/questions/17618634/matching-an-empty-string-with-regex) about regex to match an empty string, so at a cursory glance it seemed like it may be a duplicate. Consider changing your title to more specifically address your issue of ignoring line breaks. – Scott Solmer Oct 02 '13 at 11:40
  • 3
    That's a post about a regex which _doesn't_ match the empty string with a set of answers as to why. I really tried and couldn't find a post about a regex which only matched an empty string, let alone one which dealt with that and the difference between `\z` and `\Z`. I don't want to clutter up SO. If you can find a question this is a dup of, I'll gladly delete this one. That said, I added emphasis to the word ONLY in this title. – Peter Alfvin Oct 02 '13 at 12:32
  • Remove the multiline flag and ^$ should work – Clay Risser Sep 14 '18 at 07:10
  • @JamRisser I understand the interaction with multi-line mode. I should have been explicit, but I'm asking about a regex to match only an empty string _in multiline mode_. Note, in particular, the last paragraph. – Peter Alfvin Sep 15 '18 at 15:55

11 Answers

76

It's as simple as the following. Many of the other answers rely on constructs, such as lookaheads, that aren't understood by the RE2 dialect used by Google's RE2 C++ library and Go's regexp package.

^$
johnDanger
Clay Risser
  • Do you disagree with the statement, included in the question: "As such, /^$/ matches far more than the empty string such as "\n", "foobar\n\n", etc."? – Peter Alfvin Sep 12 '18 at 05:22
  • 7
    Yes I disagree with that statement. That statement is only true when the multiline flag is enabled. – Clay Risser Sep 13 '18 at 06:49
  • 1
    And it probably wouldn't hurt to make sure the global flag is disabled, because it's not possible to have multiple instances of nothing. – Clay Risser Sep 13 '18 at 06:50
  • Interesting, I'm curious about the use case – Clay Risser Sep 16 '18 at 16:03
  • 2
    `r'^\Z'` and `r'\A\Z'` only match the empty string in **Python**. `r'^$'` matches `'\n'`: "By default ... `'$'` [matches] only at the end of the string [and immediately before the newline](https://docs.python.org/3/library/re.html#re.MULTILINE) (if any) at the end of the string." – Bob Stein Sep 12 '19 at 17:15
  • 1
    This is actually the correct answer. At the JavaScript console `RegExp('^$').test('\n')` is `false` and `RegExp('^$').test('')` is `true`. The original poster must have had the multiline flag set to `true`. _i.e._ `RegExp('^$','m').test('\n')` equals `true` – Jonathan Benn Nov 20 '19 at 19:27
  • This works perfectly, at least in JavaScript, as already mentioned. It's also very easy to create a "should be empty or follow this rule" check. I made a simple date format check that looks like this: `/^(20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]|)$/.test('');` `/^(20[0-9][0-9]-[0-1][0-9]-[0-3][0-9]|)$/.test('2020-12-12');` – Griffin Mar 24 '20 at 07:56
  • This fails in Python 3.8. `re.match("^$", "\n")` returns a match object without the multiline flag being specified. – James Mchugh Jul 07 '21 at 20:59
69

I would use a negative lookahead for any character:

^(?![\s\S])

This can only match if the input is totally empty, because the character class will match any character, including any of the various newline characters.
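The lookahead approach above can be sketched in Python as follows. This is a minimal illustration rather than the answerer's own code: it assumes Python's `re` module with no flags set, and `EMPTY_ONLY` and `is_empty` are hypothetical names introduced here.

```python
import re

# Pattern from the answer: start of input, NOT followed by any character.
# [\s\S] matches every character, including newlines, without needing flags.
EMPTY_ONLY = re.compile(r'^(?![\s\S])')


def is_empty(text: str) -> bool:
    """Return True only when `text` contains no characters at all."""
    return EMPTY_ONLY.search(text) is not None


for sample in ['', '\n', 'foobar\n\n', 'Hello World!']:
    print(repr(sample), is_empty(sample))
```

Note the raw string (`r'...'`): without it the backslashes would need doubling, which is the same pitfall discussed in the comments below for JavaScript's `RegExp('...')` string form.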

Jonathan Benn
Bohemian
  • 4
    Couldn't you just use `.` instead of `[\s\S]`? – mbomb007 Apr 20 '17 at 15:36
  • 9
    @mbom you could if you enabled the DOTALL flag, so dot matches newlines too, but this way it works everywhere, even if flags aren't available. – Bohemian Apr 20 '17 at 16:22
  • This does not actually work, as it matches everything, _e.g._ at the JavaScript command line `RegExp('^(?![\s\S])').test('Hello World!')` returns `true` – Jonathan Benn Nov 20 '19 at 19:19
  • 4
    @JonathanBenn This actually *does* work, if you execute it properly. In JavaScript console: `RegExp(/^(?![\s\S])/).test('') -> true`, and `RegExp(/^(?![\s\S])/).test('Hello World!') -> false` – Bohemian Nov 21 '19 at 02:07
  • 5
    @Bohemian You are right! I just learned something today... According to [Mozilla](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp) the string expression requires double backslash. I should have used `RegExp('^(?![\\s\\S])')` or a literal as you did. – Jonathan Benn Nov 21 '19 at 15:29
13

As explained at http://www.regular-expressions.info/anchors.html under the section "Strings Ending with a Line Break", \Z will generally match before the final newline in a string that ends in a newline, as well as at the very end of the string. If you want to match only the very end of the string, you need to use \z. The exception to this rule is Python, where \Z matches only at the very end of the string.

In other words, to exclusively match an empty string, you need to use /\A\z/.
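As a rough illustration of the Python exception, the sketch below compares `\A\Z` with `\A$` using Python's `re` module. The helper names `matches_only_empty` and `dollar_matches` are made up for this example; in flavors such as Perl, Ruby, or PCRE you would write `\A\z` to get the strict behavior.

```python
import re


def matches_only_empty(text: str) -> bool:
    # In Python's re module, \Z matches only at the very end of the string,
    # so \A\Z here plays the role of \A\z in Perl, Ruby, or PCRE.
    return re.search(r'\A\Z', text) is not None


def dollar_matches(text: str) -> bool:
    # $ also matches just before a single trailing newline,
    # even without the MULTILINE flag.
    return re.search(r'\A$', text) is not None


for sample in ['', '\n', 'a\n']:
    print(repr(sample), matches_only_empty(sample), dollar_matches(sample))
```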

Peter Alfvin
6

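^$ is a regex that accepts the empty string. With multiline mode off, it won't match "\n" or "foobar\n" as you mentioned, although some flavors (Python, for example) still let $ match just before a trailing newline. You can test this regex at https://www.regextester.com/1924.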

If you already have a regex, add an alternation (|) to it so that it also matches the empty string. For example: /^[A-Za-z0-9&._ ]+$|^$/
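The "existing rule or empty" idea might look like this in Python. This is only a sketch: it swaps the answer's `^...$` anchors for `re.fullmatch`, because Python's `$` would otherwise also accept a trailing newline, and `RULE_OR_EMPTY` and `is_valid` are hypothetical names.

```python
import re

# Character set taken from the answer above; the trailing "|" adds an
# empty alternative so the empty string is accepted too.
RULE_OR_EMPTY = re.compile(r'[A-Za-z0-9&._ ]+|')


def is_valid(text: str) -> bool:
    # fullmatch requires the whole string to match, so no ^ or $ is needed
    # and a trailing "\n" is rejected.
    return RULE_OR_EMPTY.fullmatch(text) is not None


for sample in ['', 'R&D dept.', '\n', 'abc\n']:
    print(repr(sample), is_valid(sample))
```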

Suhas Gholap
4

Try looking here: https://docs.python.org/2/library/re.html

I ran into the same problem you had, though: every regex I could build matched both the empty string and "\n". Try trimming the newline characters from the string, or replacing them with another character, first.

I was using http://pythex.org/ and trying weird regexes like these:

()

(?:)

^$

^(?:^\n){0}$

and so on.

mbomb007
4

I believe Python is the only widely used language that doesn't support \z in this way (yet). There are Python bindings for Russ Cox / Google's super fast re2 C++ library that can be "dropped in" as a replacement for the bundled re.

There's an excellent discussion (with workarounds) for this at Perl Compatible Regular Expression (PCRE) in Python, here on SO.

python
Python 2.7.11 (default, Jan 16 2016, 01:14:05) 
[GCC 4.2.1 Compatible FreeBSD Clang 3.4.1 on freebsd10
Type "help", "copyright", "credits" or "license" for more information.
>>> import re2 as re
>>> 
>>> re.match(r'\A\z', "")
<re2.Match object at 0x805d97170>

@tchrist's answer is worth the read.

Community
G. Cito
2

The answer may be language dependent, but since you don't mention one, here is what I just came up with in js:

 var a = ['1','','2','','3'].join('\n');

 console.log(a.match(/^.{0}$/gm)); // ["", ""]

 // the "." is for readability. it doesn't really matter
 a.match(/^[you can put whatever the hell you want and this will also work just the same]{0}$/gm)

You could also do a.match(/^(.{10,}|.{0})$/gm) to match empty lines OR lines that meet some criterion. (This is what I was looking for when I ended up here.)

"I know that ^ will match the beginning of any line and $ will match the end of any line"

This is only true if you have the multiline flag turned on, otherwise it will only match the beginning/end of the string. I'm assuming you know this and are implying that, but wanted to note it here for learners.
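For learners, here is a small Python sketch of the same point, assuming the `re` module and reusing the sample input from the JavaScript snippet above. The variable names are illustrative only.

```python
import re

# Same input as the JavaScript example: '1', '', '2', '', '3' joined by newlines.
text = '\n'.join(['1', '', '2', '', '3'])

# Without MULTILINE, ^ and $ anchor to the string as a whole: nothing matches.
without_flag = re.findall(r'^$', text)

# With MULTILINE, ^ and $ anchor to each line: the two empty lines match.
with_flag = re.findall(r'^$', text, re.MULTILINE)

# \A and \Z are not affected by MULTILINE, so this still tests the whole string.
whole_string_empty = re.search(r'\A\Z', text, re.MULTILINE) is not None

print(without_flag, with_flag, whole_string_empty)
```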

Cory Mawhorter
0

Based on the most-approved answer, here is yet another way:

var result = !/[\d\D]/.test(string);  //[\d\D] will match any character
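A possible Python counterpart of this negated test, assuming the `re` module; `has_no_characters` is a hypothetical helper name.

```python
import re


def has_no_characters(text: str) -> bool:
    # [\d\D] matches any single character (a digit or a non-digit),
    # so the string is empty exactly when no match can be found.
    return re.search(r'[\d\D]', text) is None


for sample in ['', '\n', '0']:
    print(repr(sample), has_no_characters(sample))
```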
JohnP2
0

As @Bohemian and @mbomb007 mentioned before, this works AND has the additional advantage of being more readable:

console.log(/^(?!.)/s.test("")); //true
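In Python, the counterpart of the `s` flag is `re.DOTALL`. The sketch below (with hypothetical helper names) shows why the flag matters: without it, `.` does not match a newline, so "\n" slips through.

```python
import re


def is_empty_dotall(text: str) -> bool:
    # With DOTALL, "." matches newlines too, so only '' passes the lookahead.
    return re.search(r'^(?!.)', text, re.DOTALL) is not None


def is_empty_no_flag(text: str) -> bool:
    # Without DOTALL, "." skips newlines, so "\n" is wrongly accepted.
    return re.search(r'^(?!.)', text) is not None


for sample in ['', '\n', 'a']:
    print(repr(sample), is_empty_dotall(sample), is_empty_no_flag(sample))
```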

paulocleon
0

Another possible answer, for the case where an "empty" string may still contain whitespace characters such as spaces, tabs, or line breaks, is the following pattern.

pattern = r"^(\s*)$"

This pattern matches if the string consists entirely of zero or more whitespace characters.

It was tested in Python 3.
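A short way to try the pattern, assuming Python 3's `re` module; `BLANK` and `is_blank` are names introduced for this sketch.

```python
import re

# Pattern from the answer: the whole string is zero or more whitespace characters.
BLANK = re.compile(r'^(\s*)$')


def is_blank(text: str) -> bool:
    """Return True for '' and for strings made up only of whitespace."""
    return BLANK.match(text) is not None


for sample in ['', ' \t\n', '\n\n', ' a ']:
    print(repr(sample), is_blank(sample))
```

Keep in mind that this deliberately accepts more than the empty string, so it answers a slightly different question from the one asked.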

inpap
0

You are not asking about the empty string. A string in regex is not a grouping of letters, numbers, and punctuation. It is a grouping of ASCII characters. So a "\n" is not an empty string. It has an ASCII character "\n" in it. link

bob