Regular Expression to match only alphabetic characters

Question

I was wondering if I could get a regular expression that matches a string containing only alphabetic characters and nothing else.

Is `à` an alphabetic character according to your definition? What language are you using? – Tim Pietzcker, May 20 '11 at 06:18
One important note: you didn't mention the language or tool in which you want to use the regex. Although the principles of regexes are the same everywhere, the syntax is not. You should say where you want to use it. – sergiol, May 20 '11 at 10:01
Define what you mean by 'alphabetic character'. The string いただきます contains only alphabetic characters. Should it pass? – koryakinp, Nov 01 '17 at 01:55

score 255 · Accepted Answer · edited Dec 04 '19 at 12:02


You may use either of these two variants:

/^[A-Z]+$/i
/^[A-Za-z]+$/

to match an input string made up only of ASCII letters.

[A-Za-z] matches any single ASCII letter, lowercase or uppercase, and the + requires one or more of them.
^ and $ anchor the match to the start and end of the input, so nothing other than these letters can appear.

Code:

preg_match('/^[A-Z]+$/i', "abcAbc^Xyz", $m);
var_dump($m);

Output:

array(0) {
}

The test case addresses the OP's comment that the pattern should match only if the input consists of one or more letters. The match fails here because the input string abcAbc^Xyz contains ^.
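For readers outside PHP, the same check can be sketched in Python. This is only an illustration: the helper name `is_ascii_letters` is hypothetical, and the explicit `[A-Za-z]` variant is used because Python's case-insensitive matching on `str` patterns also admits a few non-ASCII letters.

```python
import re

# ASCII letters only. The explicit [A-Za-z] class is used instead of
# [A-Z] with re.IGNORECASE, because on str patterns Python's
# case-insensitive matching also lets a few non-ASCII letters through.
ASCII_LETTERS = re.compile(r"[A-Za-z]+")


def is_ascii_letters(text: str) -> bool:
    """Return True if text is one or more ASCII letters and nothing else."""
    # fullmatch requires the pattern to cover the whole string,
    # which plays the role of the ^ and $ anchors.
    return ASCII_LETTERS.fullmatch(text) is not None


print(is_ascii_letters("abcAbcXyz"))   # True
print(is_ascii_letters("abcAbc^Xyz"))  # False: '^' is not a letter
print(is_ascii_letters(""))            # False: '+' needs at least one letter
```

Using `fullmatch` rather than `^...$` also rejects an input with a trailing newline, which `$` alone would accept in Python.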

Note: the patterns above match only ASCII letters, not other Unicode letters. To match Unicode letters, use:

/^\p{L}+$/u

Here, \p{L} matches any kind of letter from any language, and the u modifier makes PHP treat the pattern and the subject string as UTF-8.
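To make concrete what "any kind of letter" means, here is a rough Python sketch. Python's built-in `re` module does not support `\p{L}`, so this checks the Unicode general category directly instead of using a regex. The helper name `is_unicode_letters` is hypothetical.

```python
import unicodedata


def is_unicode_letters(text: str) -> bool:
    """Return True if text is non-empty and every character is a Unicode letter."""
    # The letter categories Lu, Ll, Lt, Lm and Lo all start with "L";
    # together they are the set that \p{L} refers to.
    return bool(text) and all(
        unicodedata.category(ch).startswith("L") for ch in text
    )


print(is_unicode_letters("na\u00efve"))   # True: U+00EF is a lowercase letter
print(is_unicode_letters("いただきます"))  # True
print(is_unicode_letters("abc123"))       # False: digits are category Nd
```

Note that a letter written as a base character plus a combining mark would fail this check, because combining marks belong to category M rather than L.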

edited Dec 04 '19 at 12:02

Shofol


answered May 20 '11 at 04:53

anubhava


3

instead of using `[A-Za-z]` use `[A-z]` – chris c Nov 04 '20 at 00:04
11

Nope, `[A-z]` is wrong: it matches many other characters, not just `[A-Za-z]`. – anubhava Nov 04 '20 at 06:10
Hmm interesting I did not know that. Is that other languages? Thanks :) – chris c Nov 04 '20 at 22:01
1

If you look up an ASCII table you will see the characters between Z and a – Alexander Wu Feb 14 '21 at 03:21
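The point about the ASCII table can be checked with a short script. This is a sketch in Python (chosen for illustration, not something the commenters used) that lists the characters between `Z` and `a` and shows that `[A-z]` accepts them.

```python
import re

# Characters whose code points sit between 'Z' (90) and 'a' (97) in ASCII.
between = [chr(code) for code in range(ord("Z") + 1, ord("a"))]
print(between)  # ['[', '\\', ']', '^', '_', '`']

# [A-z] spans that whole range, so it accepts these non-letters too.
print(bool(re.fullmatch(r"[A-z]+", "abc_def")))     # True
print(bool(re.fullmatch(r"[A-Za-z]+", "abc_def")))  # False
```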
2

I knew that `\p{L}` [works in Java](https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html) but I didn't know that [it's universal](https://regex101.com/r/SlQvEs/1) until I came across this answer. Thanks for sharing this knowledge. – Arvind Kumar Avinash Apr 13 '21 at 10:21

Tim Pietzcker · Answer 2 · 2011-05-20T09:35:31.153

72

If you need to include non-ASCII alphabetic characters, and if your regex flavor supports Unicode, then

\A\pL+\z

would be the correct regex.

Some regex engines don't support this Unicode syntax but allow the \w alphanumeric shorthand to also match non-ASCII characters. In that case, you can get all alphabetics by subtracting digits and underscores from \w like this:

\A[^\W\d_]+\z

\A matches at the start of the string, \z at the end of the string (^ and $ also match at the start/end of lines in some languages like Ruby, or if certain regex options are set).
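Both points can be tried out in Python, where `\w` is Unicode-aware for `str` patterns. This is a minimal sketch under two assumptions: the helper name `is_letters` is hypothetical, and `\Z` is used because it is Python's spelling of the end-of-string anchor written `\z` in other flavors.

```python
import re

# \w minus digits and the underscore leaves (mostly) letters.
# \A and \Z anchor to the very start and very end of the string.
LETTERS = re.compile(r"\A[^\W\d_]+\Z")


def is_letters(text: str) -> bool:
    """Return True if text consists only of word characters that are not digits or '_'."""
    return LETTERS.search(text) is not None


print(is_letters("na\u00efve"))  # True
print(is_letters("abc_123"))     # False

# '$' also matches just before a trailing newline; \Z does not.
print(bool(re.search(r"^[a-z]+$", "abc\n")))    # True
print(bool(re.search(r"\A[a-z]+\Z", "abc\n")))  # False
```

The last two lines show why the stricter anchors matter when the input may end with a newline.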

edited May 20 '11 at 09:35

answered May 20 '11 at 06:22

Tim Pietzcker


55

+1 for not considering the English alphabet as the only alphabet – srcspider Jul 21 '12 at 14:00
9

+1, same as above. English is not the only alphabet, and many people write their names using non-ASCII characters to express them correctly. – Ben Barkay Apr 07 '13 at 07:36

score 25 · Answer 3 · answered May 20 '11 at 04:53


This will match one or more alphabetical characters:

/^[a-z]+$/

You can make it case insensitive using:

/^[a-z]+$/i

or:

/^[a-zA-Z]+$/

answered May 20 '11 at 04:53

stevecomrie


2

This will match only Latin characters. – quotesBro Oct 04 '17 at 07:15

score 16 · Answer 4 · answered May 20 '13 at 21:10


In Ruby and other languages that support POSIX character classes in bracket expressions, you can simply write:

/\A[[:alpha:]]+\z/i

That will match alpha-chars in all Unicode alphabet languages. Easy peasy.

More info: http://en.wikipedia.org/wiki/Regular_expression#Character_classes http://ruby-doc.org/core-2.0/Regexp.html

answered May 20 '13 at 21:10

jshkol


1

And to get everything but those characters (which wasn't documented), use `[^[:alpha:]]`. – spyle Sep 25 '14 at 18:07

score 8 · Answer 5 · answered May 20 '11 at 04:54


[a-zA-Z] should do that just fine.

You can reference the cheat sheet.

answered May 20 '11 at 04:54

Frazell Thomas


5

Yes, but if my string also contained a non-word character, it would still match. – Steffan Harris May 20 '11 at 05:16

score 0 · Answer 6 · answered Apr 23 '23 at 12:33


This worked for me in Notepad++: press Ctrl+H, set Search Mode to "Regular expression", and find

([a-z ]+)

then replace with

"\1"

This wraps each run of lowercase letters and spaces in double quotes.
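The same find-and-replace can be approximated outside Notepad++. Here is a small Python sketch, assuming the same pattern and replacement carry over unchanged. The helper name `quote_runs` is hypothetical.

```python
import re


def quote_runs(text: str) -> str:
    """Wrap each run of lowercase ASCII letters and spaces in double quotes."""
    # \1 refers to the text captured by the parenthesised group.
    return re.sub(r"([a-z ]+)", r'"\1"', text)


print(quote_runs("abc,def"))  # "abc","def"
```

Because the class contains only `a-z` and the space, uppercase letters and digits are left outside the quotes.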

answered Apr 23 '23 at 12:33

Manav Patadia

