9

I need a regex that checks whether a sentence follows Title Case, meaning the first letter of each word in the sentence is upper case. The words can contain special characters as well.

krish
  • 2
    Hi! Welcome to SO. What have you tried so far? What error message is blocking you? Please include all material that describes the problem and shows your efforts. – Stephane Rolland Apr 11 '16 at 15:52
  • I tried ([A-Z][\w-]*(\s+[A-Z][\w-]*)+), but it is not working as expected. I am a novice at writing regex patterns. – krish Apr 11 '16 at 15:55

3 Answers

4

regex101

([A-Z][^\s]*)


Description

1st Capturing group ([A-Z][^\s]*)  
    [A-Z] match a single character present in the list below  
        A-Z a single character in the range between A and Z (case sensitive)
    [^\s]* match a single character not present in the list below
        Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
        \s match any white space character [\r\n\t\f ]
g modifier: global. All matches (don't return on first match)
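
As a rough illustration of the single-word pattern described above, here is a minimal Python sketch. The helper name capitalized_words is hypothetical and not part of the original answer; re.findall plays the role of the g (global) modifier.

```python
import re

# Pattern from the answer above: one uppercase ASCII letter followed by
# any run of non-whitespace characters.
WORD_PATTERN = re.compile(r"([A-Z][^\s]*)")


def capitalized_words(text):
    """Return every substring that starts at A-Z and runs to the next whitespace."""
    return WORD_PATTERN.findall(text)


print(capitalized_words("I Love To Work"))      # ['I', 'Love', 'To', 'Work']
print(capitalized_words("I love To-Do lists"))  # ['I', 'To-Do']
# The pattern is not anchored to word starts, so it can also match mid-word:
print(capitalized_words("an iPhone"))           # ['Phone']
```

Because the pattern is unanchored, it only finds capitalized chunks; it does not tell you whether every word in the string is capitalized.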

Full Sentence

^(?:[A-Z][^\s]*\s?)+$


Description

^ assert position at start of the string
(?:[A-Z][^\s]*\s?)+ Non-capturing group
    Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
    [A-Z] match a single character present in the list below
        A-Z a single character in the range between A and Z (case sensitive)
    [^\s]* match a single character not present in the list below
        Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
        \s match any white space character [\r\n\t\f ]
    \s? match any white space character [\r\n\t\f ]
        Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
$ assert position at end of the string
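
A small Python sketch of how the full-sentence pattern might be used as a yes/no check. The function name every_word_capitalized is an assumption made for illustration; the pattern itself is copied from the answer unchanged.

```python
import re

# Full-sentence pattern from the answer above.
SENTENCE_PATTERN = re.compile(r"^(?:[A-Z][^\s]*\s?)+$")


def every_word_capitalized(sentence):
    """Return True if each whitespace-separated word starts with A-Z."""
    return SENTENCE_PATTERN.match(sentence) is not None


print(every_word_capitalized("I Love To Work"))          # True
print(every_word_capitalized("I love To Work"))          # False
print(every_word_capitalized("Rock-N-Roll Is Here!"))    # True
# Only the first letter of each word is checked, so all-caps text passes:
print(every_word_capitalized("THIS IS NOT TITLE CASE"))  # True
# \s? allows at most one whitespace character between words:
print(every_word_capitalized("I  Love"))                 # False
```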
abc123
  • The reg_match function is not working with the expression ([A-Z][^\s]+) for the string 'I Love To Work'. – krish Apr 11 '16 at 15:52
  • @krish, click on the regex101 link; the example there uses the string "I Love To Work". Note that the regex above matches a single word, not a sentence. I will add a sentence version, since that appears to be what you want. Edit: added the sentence version. – abc123 Apr 11 '16 at 16:11
  • Thanks @abc123 for the solution and for the detailed explanation. – krish Apr 12 '16 at 04:56
  • 2
    matched THIS IS NOT TITLE CASE :-/ – CpILL Oct 31 '16 at 08:05
  • 1
    @CpILL that isn't a requirement from the OP, though I can see why you think it is wrong. All the OP said is that each word starts with a capital letter and that words can contain special characters. – abc123 Nov 02 '16 at 13:38
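
If all-caps input such as "THIS IS NOT TITLE CASE" should be rejected, one possible tightening is sketched below. This variant is an assumption for illustration and not part of the answer above: it forbids any further A-Z inside a word and requires whitespace between words. The name is_strict_title_case is hypothetical.

```python
import re

# Hypothetical stricter variant: after the leading capital, a word may
# contain anything except whitespace and further uppercase ASCII letters.
STRICT_PATTERN = re.compile(r"^[A-Z][^\sA-Z]*(?:\s+[A-Z][^\sA-Z]*)*$")


def is_strict_title_case(sentence):
    """Return True if every word has exactly one A-Z letter, at its start."""
    return STRICT_PATTERN.match(sentence) is not None


print(is_strict_title_case("This Is Title Case"))      # True
print(is_strict_title_case("THIS IS NOT TITLE CASE"))  # False
print(is_strict_title_case("Jane doe"))                # False
# Trade-off: names with an inner capital are rejected as well.
print(is_strict_title_case("McDonald Farm"))           # False
```

Whether rejecting inner capitals is acceptable depends on the input; the original question only asks that each word start with a capital letter.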
4

This works for me. It groups consecutive Title Case words together, which is useful for matching, say, a list of people's names:

(?:[A-Z][a-z]+\s?)+

Python Examples:

# Example 1
import re
text = "WANTED"
re.findall(r'(?:[A-Z][a-z]+\s?)+', text, re.M)
# returns []  (an all-caps word does not pass)

# Example 2
text = "This is a Test. This Is Another Test"
re.findall(r'(?:[A-Z][a-z]+\s?)+', text, re.M)
# returns ['This ', 'Test', 'This Is Another Test']  (groups of Title Case phrases)

If you only want a list of the individual Title Case words, use this:

'(?:[A-Z][a-z]+)'

Python Example:

# Example 1
import re
text = "This is a Test. This Is Another Test"
re.findall(r'(?:[A-Z][a-z]+)', text, re.M)
# returns ['This', 'Test', 'This', 'Is', 'Another', 'Test']  (all Title Case words)
wcyn
1

For Python, use the built-in string method str.istitle().

"John Doe".istitle() # True
"Jane doe".istitle() # False
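
To see how str.istitle() treats all-caps text and special characters, here is a short sketch. The helper name istitle_report and the sample strings are assumptions chosen for illustration.

```python
def istitle_report(samples):
    """Map each sample string to the result of str.istitle()."""
    return {s: s.istitle() for s in samples}


samples = [
    "John Doe",                # True
    "THIS IS NOT TITLE CASE",  # False: uppercase may only follow an uncased character
    "Rock-N-Roll",             # True: the hyphen starts a new "word"
    "They're Here",            # False: the lowercase r follows an apostrophe
]

for text, result in istitle_report(samples).items():
    print(f"{text!r}: {result}")
```

Unlike the regex answers, istitle() rejects all-caps words, but it also treats an apostrophe as a word boundary, so contractions fail the check.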
C Panda