Find the line with all caps in Regex Python

Question

I'm trying to find all lines that are all caps using regex, and so far I've tried this:

re.findall(r'\b\n|[A-Z]+\b', kaizoku)

My data looks like this:

TRAFALGAR LAW
You shall not be the pirate king.
MONKEY D LUFFY
Now!
DOFLAMINGO'S UNDERLINGS:
Noooooo!

I want it to return

TRAFALGAR LAW
MONKEY D LUFFY
DOFLAMINGO'S UNDERLINGS:

But it's returning something else, namely this:

TRAFALGAR
LAW
Y
MONKEY
D
LUFFY
N
DOFLAMINGO'
S
UNDERLINGS:
N

EDIT: So far, I think the best fit is @Jan's answer:

rx = re.compile(r"^([A-Z ':]+$)\b", re.M)
rx.findall(string)

EDIT2 Found out what's wrong, thanks!

Possible duplicate of [Check if string is upper, lower, or mixed case in Python](https://stackoverflow.com/questions/8222855/check-if-string-is-upper-lower-or-mixed-case-in-python) — ctwheels, Dec 06 '17 at 21:40
`DOFLAMINGO'S` has a quote in it... output & expected output in detail please ([mcve]) — Jean-François Fabre, Dec 06 '17 at 21:40
Possible duplicate: https://stackoverflow.com/q/2323988/8881141 Also no effort shown, when somebody like me, who doesn't know anything about regex, finds it in less than a minute. – Mr. T, Dec 06 '17 at 21:46
@Piinthesky: I tried those solutions, and it returned nothing for me. — Sunny League, Dec 06 '17 at 22:11

ctwheels · Answer 1 · 2017-12-06T21:50:16.047

Brief

No need for regex: Python strings have the method isupper(). From the documentation:

Return true if all cased characters [4] in the string are uppercase and there is at least one cased character, false otherwise.

[4] Cased characters are those with general category property being one of “Lu” (Letter, uppercase), “Ll” (Letter, lowercase), or “Lt” (Letter, titlecase).

Code


a = [
    "TRAFALGAR LAW",
    "You shall not be the pirate king.",
    "MONKEY D LUFFY",
    "Now!",
    "DOFLAMINGO'S UNDERLINGS:",
    "Noooooo!",
]

for s in a:
    print(s.isupper())

Result

True
False
True
False
True
False
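
To get the matching lines themselves rather than a list of booleans, the same check can be used as a filter over splitlines(). This is only a minimal sketch: it assumes the text sits in a single string (called kaizoku here, as in the question), and the helper name uppercase_lines is made up for illustration.

```python
kaizoku = """TRAFALGAR LAW
You shall not be the pirate king.
MONKEY D LUFFY
Now!
DOFLAMINGO'S UNDERLINGS:
Noooooo!
"""


def uppercase_lines(text):
    # Hypothetical helper, not part of the original answer.
    # str.isupper() ignores uncased characters (spaces, quotes, colons)
    # and is False for lines with no cased characters, e.g. blank lines.
    return [line for line in text.splitlines() if line.isupper()]


print(uppercase_lines(kaizoku))
# ['TRAFALGAR LAW', 'MONKEY D LUFFY', "DOFLAMINGO'S UNDERLINGS:"]
```

Because isupper() only looks at cased characters, punctuation such as the apostrophe and colon needs no special handling.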

Jan · Accepted Answer · 2017-12-06T21:49:00.790


Here you go

import re

string = """TRAFALGAR LAW
You shall not be the pirate king.
MONKEY D LUFFY
Now!
DOFLAMINGO'S UNDERLINGS:
Noooooo!
"""

rx = re.compile(r"^([A-Z ':]+$)", re.M)

UPPERCASE = [line for line in string.split("\n") if rx.match(line)]
print(UPPERCASE)

Or:

rx = re.compile(r"^([A-Z ':]+$)", re.M)

UPPERCASE = rx.findall(string)
print(UPPERCASE)

Both will yield

['TRAFALGAR LAW', 'MONKEY D LUFFY', "DOFLAMINGO'S UNDERLINGS:"]
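
Note that the character class [A-Z ':] is an explicit whitelist: uppercase ASCII letters, the space, the apostrophe and the colon. Any other character on a line makes the whole line fail to match. The sketch below illustrates this with a few sample lines that are assumptions for demonstration, not taken from the question's data.

```python
import re

rx = re.compile(r"^([A-Z ':]+$)", re.M)

# Hypothetical sample lines, not from the question's data.
samples = ["A SERVANT", "A SERVANT.", "ROOM 2", "ÉCOLE"]

# Map each line to whether the whitelist pattern accepts it.
results = {line: bool(rx.match(line)) for line in samples}

for line, matched in results.items():
    print(repr(line), matched)
# 'A SERVANT' True
# 'A SERVANT.' False
# 'ROOM 2' False
# 'ÉCOLE' False
```

If the data may contain periods, digits or accented capitals, the class has to be widened accordingly, or a check that is not tied to a fixed set of characters may be a better fit.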

edited Dec 06 '17 at 21:49

answered Dec 06 '17 at 21:41


@Jean-FrançoisFabre: Very true, thanks for spotting it. Updated. – Jan Dec 06 '17 at 21:49
@Jean-FrançoisFabre: Thank you, but it doesn't do well if one line goes A SERVANT. – Sunny League Dec 06 '17 at 22:09
@Jan What does `':` mean? – zewill Nov 23 '21 at 00:46

Ajax1234 · Answer 3 · 2017-12-06T22:03:29.477


You can use [A-Z\d_\W] to match uppercase letters along with digits, underscores and non-alphanumeric characters:

import re
s = ["TRAFALGAR LAW", "You shall not be the pirate king.", "MONKEY D LUFFY", "Now!", "DOFLAMINGO'S UNDERLINGS:", "Noooooo!"]
# Raw string avoids invalid-escape warnings; re.match is enough for a yes/no test.
new_s = [i for i in s if re.match(r'^[A-Z\d_\W]+$', i)]
print(new_s)

Output:

['TRAFALGAR LAW', 'MONKEY D LUFFY', "DOFLAMINGO'S UNDERLINGS:"]
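
One difference from isupper() that may matter: a pattern built on [A-Z\d_\W] also accepts lines that contain no letters at all. The following sketch compares the two checks side by side; the sample lines are hypothetical and chosen only to show where the results diverge.

```python
import re

pattern = re.compile(r"[A-Z\d_\W]+")

# Hypothetical lines chosen to show where the two checks disagree.
lines = ["MONKEY D LUFFY", "1234 !!!", "Now!"]

# For each line: (regex accepts it, str.isupper() accepts it)
comparison = {
    line: (bool(pattern.fullmatch(line)), line.isupper()) for line in lines
}

for line, (by_regex, by_isupper) in comparison.items():
    print(repr(line), by_regex, by_isupper)
# 'MONKEY D LUFFY' True True
# '1234 !!!' True False
# 'Now!' False False
```

Whether a letterless line such as "1234 !!!" should count as "all caps" depends on the data, so it is worth deciding which behaviour is wanted before picking one approach.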

edited Dec 06 '17 at 22:03

answered Dec 06 '17 at 21:41


Wouldn't `[A-Z\d_\W]` be better as it includes digits and underscore (in the case that they may be used)? – ctwheels Dec 06 '17 at 21:57
