0

I need to write a function that converts every differently cased occurrence of a word to lowercase.

For example, a paragraph contains the word 'something' in several forms, such as 'Something', 'SomeThing', 'SOMETHING', and 'SomeTHing'. All of them need to be converted to the lowercase 'something'.

How can I write a function that does this replacement?

RoadRunner
prasannaboga

3 Answers

2

You can split your paragraph into different words, then use the slugify module to generate a slug of each word, compare it with "something", and if there is a match, replace the word with "something".

In [1]: text = "This paragraph contains Something, SOMETHING, AND SomeTHing"

In [2]: from slugify import slugify

In [3]: for word in text.split(" "):  # Split the text on spaces and iterate through the words
   ...:     if slugify(word) == "something":  # Compare the word's slug with "something" (on Python 2, pass unicode(word))
   ...:         text = text.replace(word, word.lower())

In [4]: text
Out[4]: 'This paragraph contains something, something, AND something'
Ganesh Tata
1

Split the text into single words and check whether each word, written in lowercase, is "something". If it is, replace it with the lowercase form:

if word.lower() == "something":
    text = text.replace(word, "something")

To know how to split a text into words, see this question.
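
The split-and-compare idea above can be wrapped in a function. This is only a minimal sketch: the name lowercase_by_split and its parameters are made up for illustration, and it assumes that stripping ASCII punctuation from each whitespace-separated token is enough to isolate the word.

```python
import string


def lowercase_by_split(text, target="something"):
    """Lowercase every whitespace-separated word that equals `target` ignoring case.

    Hypothetical helper: the name and signature are illustrative only.
    """
    result = text
    for word in text.split():
        # Drop leading/trailing punctuation such as "," or ":" before comparing.
        core = word.strip(string.punctuation)
        if core and core.lower() == target:
            # str.replace swaps every occurrence of this exact variant.
            result = result.replace(core, target)
    return result


print(lowercase_by_split("Many words: SoMeThInG, SOMEthING, someTHing"))
```

Because str.replace works on substrings, a variant that was matched as a whole word is also lowercased wherever else it appears, including inside a longer word.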

Another way is to move through the text one character at a time and check whether the nine characters starting at each position spell "something" when lowercased:

text = "Many words: SoMeThInG, SOMEthING, someTHing"
for n in range(len(text)-8):
    if text[n:n+9].lower() == "something": # check whether "something" is here
        text = text.replace(text[n:n+9], "something")

print(text)
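
The same scan can be written without repeated str.replace calls by searching a lowercased copy of the text. This is a rough sketch, not part of the answer above: lowercase_by_scan is a hypothetical name, and the code assumes that str.lower() does not change the length of the text (true for ASCII, not guaranteed for every Unicode character).

```python
def lowercase_by_scan(text, target="something"):
    """Replace each case-insensitive occurrence of `target` with `target` itself.

    Hypothetical helper; assumes text.lower() has the same length as text.
    """
    lowered = text.lower()
    pieces = []
    start = 0
    while True:
        idx = lowered.find(target, start)
        if idx == -1:
            # No more matches: keep the rest of the text unchanged.
            pieces.append(text[start:])
            break
        pieces.append(text[start:idx])  # text before the match, untouched
        pieces.append(target)           # the match, lowercased
        start = idx + len(target)
    return "".join(pieces)


print(lowercase_by_scan("Many words: SoMeThInG, SOMEthING, someTHing"))
```

Like the window-based loop, this matches the word anywhere, so "SOMETHINGS" would become "somethingS".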
Psytho
1

You can also use re.findall to search and split the paragraph into words and punctuation, and replace all the different cases of "Something" with the lowercase version:

import re

text = "Something, Is: SoMeThInG, SOMEthING, someTHing."

to_replace = "something"

words_punct = re.findall(r"[\w']+|[.,!?;: ]", text)

new_text = "".join(to_replace if x.lower() == to_replace else x for x in words_punct)

print(new_text)

Which outputs:

something, Is: something, something, something.

Note: re.findall only returns the parts of the string that match the hardcoded regular expression. Any character in your actual text that the pattern above does not cover (for example, a newline or a double quotation mark) is dropped from new_text, so extend the pattern as needed.
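
If dropping unmatched characters is a concern, a different technique is to substitute in place with re.sub and the re.IGNORECASE flag instead of tokenizing with re.findall. The sketch below is an alternative, not the method shown above: lowercase_with_regex is a hypothetical name, and it assumes that whole-word matches (delimited by \b) are what is wanted.

```python
import re


def lowercase_with_regex(text, target="something"):
    """Lowercase whole-word, case-insensitive matches of `target`.

    Hypothetical helper; everything that does not match is left untouched.
    """
    # re.escape guards against regex metacharacters in `target`;
    # \b restricts matches to whole words.
    pattern = re.compile(r"\b" + re.escape(target) + r"\b", re.IGNORECASE)
    return pattern.sub(target, text)


print(lowercase_with_regex("Something, Is: SoMeThInG, SOMEthING, someTHing."))
```

Because only the matches are rewritten, characters such as newlines or quotation marks survive unchanged, and longer words like "Somethings" are not affected.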

RoadRunner