
The condition is: a trailing 's' is removed from every word unless the word is in the middle of a sentence.

The input string is:

Ses Holmes os. Sos

The output should be:

Se Holmes o. So

I started with this pattern:

([A-Z][a-z]+)

but got stuck on it: it cannot be used inside a negative lookbehind.

  • You don't really need regex here (although it *can* be achieved using it). – Maroun Feb 04 '16 at 07:28
  • Can you explain why Holmes ends in an 's' in your example? – chadrik Feb 04 '16 at 07:31
  • I think he wants to remove a terminal `s` from all words except uppercase words (names?) that occur after the first word in a sentence. Which leads to the questions "What if the first word in a sentence *is* a name?" and "How can you tell where a sentence begins?" (looking for punctuation is not going to work, Dr. Watson...) – Tim Pietzcker Feb 04 '16 at 07:34
  • This is impossible without a very clear definition of a) a word and b) the start and end of a sentence. – timgeb Feb 04 '16 at 08:00

1 Answer


The regular expression already looks good, although it doesn’t catch words like café.

To do the replacement, call re.sub with a function as the replacement argument, as explained in "Python replace string pattern with output of function". In that function you can implement the exceptions to the rule, expressing them as Python code rather than as a regular expression.
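A minimal sketch of that approach follows. It rests on assumptions the question does not confirm: a word is a run of ASCII letters, a sentence starts at the beginning of the text or after '.', '!' or '?', and the only exception is a capitalized word that is not the first word of its sentence. The names WORD, SENTENCE_START, strip_trailing_s and remove_s are made up for illustration. The callback reads the text before the match through match.string and match.start() to decide whether the word opens a sentence.

```python
import re

# Assumption: a "word" is a run of ASCII letters (so "café" is not handled).
WORD = re.compile(r"[A-Za-z]+")
# Assumption: the text before a sentence-initial word is either empty or
# ends with '.', '!' or '?' followed by optional whitespace.
SENTENCE_START = re.compile(r"(?:^|[.!?])\s*$")


def strip_trailing_s(match):
    """Replacement callback for re.sub: return the word, possibly without its final 's'."""
    word = match.group(0)
    if not word.endswith("s"):
        return word
    # The match object carries the whole input, so context is available here.
    before = match.string[:match.start()]
    starts_sentence = SENTENCE_START.search(before) is not None
    # Assumed exception: capitalized words inside a sentence keep their 's'.
    if word[0].isupper() and not starts_sentence:
        return word
    return word[:-1]


def remove_s(text):
    return WORD.sub(strip_trailing_s, text)


print(remove_s("Ses Holmes os. Sos"))  # prints: Se Holmes o. So
```

Abbreviations such as "Dr." would be treated as sentence ends under these assumptions, so a real solution needs a stricter definition of sentence boundaries.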

Roland Illig
  • That won't work - the exceptions are based on the context of the match, and that's not present when you've passed the match to the function. – Tim Pietzcker Feb 04 '16 at 08:36