checks the text for the presence of 2 or more characters or digits surrounded by parentheses, with at least the first character in uppercase

Question

The contains_acronym function checks the text for the presence of 2 or more characters or digits surrounded by parentheses, with at least the first character in uppercase (if it's a letter), returning True if the condition is met, or False otherwise. For example, "Instant messaging (IM) is a set of communication technologies used for text-based communication" should return True since (IM) satisfies the match conditions. Fill in the regular expression in this function:

import re

def contains_acronym(text):
    pattern = ___ 
    result = re.search(pattern, text)
    return result != None

print(contains_acronym("Instant messaging (IM) is a set of communication technologies used for text-based communication")) # True
print(contains_acronym("American Standard Code for Information Interchange (ASCII) is a character encoding standard for electronic communication")) # True
print(contains_acronym("Please do NOT enter without permission!")) # False
print(contains_acronym("PostScript is a fourth-generation programming language (4GL)")) # True
print(contains_acronym("Have fun using a self-contained underwater breathing apparatus (Scuba)!")) # True

I tried this pattern, but it does not give the expected result for all of the given inputs:

pattern = r"\(([A-Z0-9_]+)\)"
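One way to locate the failing case is to run the attempted pattern over the sample sentences. The sketch below does that; the helper name `matches_attempt` and the shortened sample strings are illustrative and not part of the exercise.

```python
import re

# The pattern attempted in the question: uppercase letters, digits, and underscores only.
ATTEMPT = r"\(([A-Z0-9_]+)\)"

def matches_attempt(text):
    """Return True if the attempted pattern finds a match anywhere in text."""
    return re.search(ATTEMPT, text) is not None

# Shortened versions of the sample sentences, mapped to the expected result.
samples = {
    "Instant messaging (IM) is a set of communication technologies": True,
    "Information Interchange (ASCII) is a character encoding standard": True,
    "Please do NOT enter without permission!": False,
    "PostScript is a fourth-generation programming language (4GL)": True,
    "a self-contained underwater breathing apparatus (Scuba)!": True,
}

for text, expected in samples.items():
    status = "ok  " if matches_attempt(text) == expected else "FAIL"
    print(status, text)
```

Only the `(Scuba)` sentence should be reported as failing, because the character class rejects the lowercase letters after the first character. Separately, `+` means "one or more", so a single character such as `(R)` is also accepted.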

What have *you tried*, and what exactly is the problem with it? — jonrsharpe, Sep 20 '20 at 11:42
need to find regex pattern for above problem statement and to fulfill output result for given sample inputs and @jonrsharpe is anything wrong in this question? — Vinod, Sep 20 '20 at 11:48
No, you need to *write* a regex pattern for the above problem statement. It's your homework, not ours, you can't just dump it on SO. Maybe start with e.g. https://stackoverflow.com/q/4736/3001761. — jonrsharpe, Sep 20 '20 at 11:48

Vinod · Answer 1 · 2020-09-20T17:07:21.827

I finally tried the pattern below, and it covers all of the scenarios above:

import re

def contains_acronym(text):
    pattern = r"\([A-Za-z0-9]{2,}\)"
    result = re.search(pattern, text)
    return result is not None

print(contains_acronym("Instant messaging (IM) is a set of communication technologies used for text-based communication")) # True
print(contains_acronym("American Standard Code for Information Interchange (ASCII) is a character encoding standard for electronic communication")) # True
print(contains_acronym("Please do NOT enter without permission!")) # False
print(contains_acronym("PostScript is a fourth-generation programming language (4GL)")) # True
print(contains_acronym("Have fun using a self-contained underwater breathing apparatus (Scuba)!")) # True

How about `blahblah (R) blahblah` - your pattern will match it, however it's **not** 2 or more chars in round brackets... — Grzegorz Skibinski, Sep 20 '20 at 16:14
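As a quick check of the pattern as it appears in the answer above (the comment may have been written against a different revision of the answer; that is an assumption), the sketch below tries a single-character case and an all-lowercase case.

```python
import re

# Pattern copied from the answer above.
PATTERN = r"\([A-Za-z0-9]{2,}\)"

def contains_acronym(text):
    """Return True if PATTERN matches anywhere in text."""
    return re.search(PATTERN, text) is not None

# Single character in parentheses: {2,} requires at least two, so no match.
print(contains_acronym("blahblah (R) blahblah"))

# All-lowercase content: the class includes a-z, so the first character
# is not required to be uppercase and this still matches.
print(contains_acronym("blahblah (im) blahblah"))
```

With this version `(R)` is rejected, but the "first character in uppercase" condition is not enforced.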

score 2 · Answer 2 · edited Dec 26 '22 at 05:04

Use this as pattern

pattern = r"\(\w.*\w\)"

`\w` matches any letter, digit, or underscore.

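It may help to see what `.*` does in this pattern. The sketch below uses a hypothetical helper, `first_match`, to print the exact text matched; the test sentences are made up for illustration.

```python
import re

# Pattern copied from the answer above.
PATTERN = r"\(\w.*\w\)"

def first_match(text):
    """Return the matched substring, or None if there is no match."""
    match = re.search(PATTERN, text)
    return match.group() if match else None

# `.*` is greedy and matches any character except a newline, so the match
# can run from the first "(" to the last ")" on the line.
print(first_match("see (a) and (b) for details"))

# Nothing requires an uppercase first character or forbids spaces.
print(first_match("a remark (no Caps here) in passing"))
```

Both sentences produce a match even though neither contains an acronym in the sense of the exercise.
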
edited Dec 26 '22 at 05:04

Gino Mempin

answered May 26 '21 at 17:51

shihab sikder

score 0 · Answer 3 · edited Apr 19 '21 at 23:54

Your regex is almost correct. You are just forgetting that the match must contain at least 2 characters. Make your first character class a single fixed position, then follow it with a second class that also allows lowercase letters, repeated with `+` (one or more):

pattern = r"\(([A-Z0-9_][A-Za-z0-9_]+)\)"
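A small sketch of how this pattern behaves, including the capturing group it keeps from the question. The helper name `find_acronym` and the test strings are hypothetical.

```python
import re

# Pattern copied from the answer above; group 1 is the text inside the parentheses.
PATTERN = r"\(([A-Z0-9_][A-Za-z0-9_]+)\)"

def find_acronym(text):
    """Return the text inside the first matching parentheses, or None."""
    match = re.search(PATTERN, text)
    return match.group(1) if match else None

print(find_acronym("underwater breathing apparatus (Scuba)!"))  # lowercase after the first character is fine
print(find_acronym("blahblah (R) blahblah"))                    # one character only: no match
print(find_acronym("blahblah (im) blahblah"))                   # lowercase first character: no match
print(find_acronym("a constant (MAX_LEN) in the code"))         # underscores are accepted by this class
```

Because `_` is part of both character classes, strings such as `(MAX_LEN)` match as well; whether that is wanted depends on how strictly "characters or digits" is read.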

edited Apr 19 '21 at 23:54

Gino Mempin

answered Oct 09 '20 at 10:03

user8547404

score 0 · Answer 4 · answered Dec 19 '20 at 12:38

import re
def contains_acronym(text):
  pattern = r'\([A-Za-z0-9]{2,}\)' 
  result = re.search(pattern, text)
  return result != None

print(contains_acronym("Instant messaging (IM) is a set of communication technologies used for text-based communication")) # True
print(contains_acronym("American Standard Code for Information Interchange (ASCII) is a character encoding standard for electronic communication")) # True
print(contains_acronym("Please do NOT enter without permission!")) # False
print(contains_acronym("PostScript is a fourth-generation programming language (4GL)")) # True
print(contains_acronym("Have fun using a self-contained underwater breathing apparatus (Scuba)!")) # True

score 0 · Answer 5 · answered Apr 19 '21 at 14:57

import re
def contains_acronym(text):
  pattern = r"\([A-Z0-9][A-Z0-9a-z]+\)"
  result = re.search(pattern, text)
  return result != None

answered Apr 19 '21 at 14:57

SRS

score 0 · Answer 6 · answered Feb 25 '22 at 17:36

This should work:

pattern = r"\([A-Z0-9].*\)"

Note that:

- `+` is a quantifier that matches one or more repetitions; this pattern uses `*` (zero or more) instead.
- The capturing group (the inner pair of parentheses) in your pattern is not required.
- `\w` matches lowercase letters and the underscore as well as uppercase letters and digits.
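To illustrate the difference that `.*` makes here, the sketch below prints what this pattern actually matches. `first_match` is a hypothetical helper and the test strings are invented.

```python
import re

# Pattern copied from the answer above.
PATTERN = r"\([A-Z0-9].*\)"

def first_match(text):
    """Return the matched substring, or None if there is no match."""
    match = re.search(PATTERN, text)
    return match.group() if match else None

# `.*` may match nothing at all, so a single character is accepted.
print(first_match("blahblah (R) blahblah"))

# `.` matches spaces and punctuation too, so any parenthesised remark
# that starts with an uppercase letter or digit is accepted.
print(first_match("the result (Not an acronym, just a remark) was fine"))

# A lowercase first character is still rejected.
print(first_match("the result (lower) was fine"))
```

So the pattern passes the five sample inputs, but it does not enforce the "2 or more characters" rule.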

answered Feb 25 '22 at 17:36

shaviz soudager

LimboKid · Answer 7 · 2022-04-15T12:52:58.810

0

pattern = r"\(\w{2,}\)"

This pattern works for the sample inputs and is more compact.
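A brief sketch of what `\w{2,}` accepts beyond the sample inputs; the test strings are made up for illustration.

```python
import re

# Pattern copied from the answer above.
PATTERN = r"\(\w{2,}\)"

def contains_acronym(text):
    """Return True if PATTERN matches anywhere in text."""
    return re.search(PATTERN, text) is not None

print(contains_acronym("blahblah (R) blahblah"))   # one character: rejected by {2,}
print(contains_acronym("blahblah (im) blahblah"))  # lowercase first character: accepted
print(contains_acronym("blahblah (__) blahblah"))  # underscores count as \w: accepted
```

The length rule holds, but the uppercase-first rule is not checked, and `\w` also lets underscores through.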

edited Apr 15 '22 at 12:52

answered Apr 14 '22 at 12:25

LimboKid

score 0 · Answer 8 · edited Dec 26 '22 at 05:03

pattern = r"\([A-Z0-9]\)*"

This one works for me. Pretty simple.

edited Dec 26 '22 at 05:03

Gino Mempin

answered Jun 15 '22 at 04:32

Michael L.

Your regex says that there may be only uppercase characters within the parentheses. But after the first `[A-Z0-9]` there may be other characters. Also, there must be 2 or more of them. `*` says "0 or more"... – Andy A. Jun 15 '22 at 06:26
Welcome to Stack Overflow! Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the [help center](https://stackoverflow.com/help/how-to-answer). – Ethan Jun 17 '22 at 13:50
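To make the point in the first comment concrete, the sketch below prints the text that the pattern `\([A-Z0-9]\)*` actually matches. In that pattern the `*` applies only to the closing parenthesis. `first_match` is a hypothetical helper.

```python
import re

# Pattern copied from the answer above: "(", one uppercase letter or digit,
# then zero or more ")" characters.
PATTERN = r"\([A-Z0-9]\)*"

def first_match(text):
    """Return the matched substring, or None if there is no match."""
    match = re.search(PATTERN, text)
    return match.group() if match else None

print(first_match("Instant messaging (IM) is a set"))  # stops after the first character
print(first_match("programming language (4GL)"))       # same here
print(first_match("a remark (In passing) was made"))   # not an acronym, still matches
print(first_match("a remark (lower) was made"))        # lowercase first character: no match
```

The sample inputs pass only because `re.search` is satisfied by the two-character prefix `(I`, `(A`, `(4`, or `(S`; the closing parenthesis and the length are never checked.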

user22366234 · Answer 9 · 2023-09-01T21:21:05.693

0

Escape the opening and closing parentheses, define the first character to be a digit or an uppercase letter, then allow the subsequent characters to be uppercase, lowercase, or numeric. I use a `*` after this to show that they can be repeated any number of times or not at all, i.e. you'll get a match with (IM) and with (I) as well.

pattern = r"\([A-Z0-9][A-Za-z0-9]*\)"
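The choice of quantifier decides whether a single character such as `(I)` matches. The sketch below compares `*` with `+` on otherwise identical patterns; the names `STAR`, `PLUS`, and `matches` are illustrative. Both patterns are written as raw strings so that `\(` reaches the regex engine unchanged.

```python
import re

STAR = r"\([A-Z0-9][A-Za-z0-9]*\)"  # first character, then zero or more others
PLUS = r"\([A-Z0-9][A-Za-z0-9]+\)"  # first character, then one or more others

def matches(pattern, text):
    """Return True if pattern matches anywhere in text."""
    return re.search(pattern, text) is not None

# Print the result for each quantifier side by side.
for text in ["(IM)", "(I)", "(Scuba)", "(4GL)", "(im)"]:
    print(f"{text:8} star={matches(STAR, text)!s:5} plus={matches(PLUS, text)}")
```

If the exercise's "2 or more characters" condition is meant strictly, `+` is the quantifier that enforces it; `*` also accepts `(I)`.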

edited Sep 01 '23 at 21:21

answered Sep 01 '23 at 21:20

user22366234
