Convert the string to a string in which the words are separated by spaces and only the first word starts with an uppercase letter

Question

I am trying to make a script that will accept a string as input in which all of the words are run together, but the first character of each word is uppercase. It should convert the string to a string in which the words are separated by spaces and only the first word starts with an uppercase letter.

For example, the input:

"StopWhateverYouAreDoingInterestingIDontCare"

The expected output:

"Stop whatever you are doing interesting I dont care"

Here is the one I wrote so far:

string_input = "StopWhateverYouAreDoingInterestingIDontCare"

def organize_string():
   start_sentence = string_input[0] 
   index_of_i = string_input.index("I")
   for i in string_input[1:]: 
      if i == "I" and string_input[index_of_i + 1].isupper(): 
           start_sentence += ' ' + i 
      elif i.isupper():      
           start_sentence += ' ' + i.lower()
      else: 
           start_sentence += i
   return start_sentence

While this takes care of some parts, I am struggling to tell whether the letter "I" stands alone or begins a longer word. Here is my output:

"Stop whatever you are doing interesting i dont care"

A standalone "I" needs to stay uppercase, while the "I" in the word "Interesting" should be lowercased to "interesting".

I will really appreciate all the help!

Does this answer your question? [Split a string at uppercase letters](https://stackoverflow.com/questions/2277352/split-a-string-at-uppercase-letters). Lowercasing individual words except "I" and joining them together into one string should be trivial. — Georgy, Oct 20 '20 at 11:19

Chris Charley · Accepted Answer · 2020-10-19T00:49:30.197

A regular expression will do in this example.

import re
s = "StopWhateverYouAreDoingInterestingIDontCare"
t = re.sub(r'(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z])', ' ', s)
print(t)

Explained:

(?<=[a-z])(?=[A-Z]) - a lookbehind for a lowercase letter followed by a lookahead for an uppercase letter

| - signifies "or"

(?<=[A-Z])(?=[A-Z]) - a lookbehind for an uppercase letter followed by a lookahead for an uppercase letter

This regex substitutes a space when there is a lowercase letter followed by an uppercase letter, OR, when there is an uppercase letter followed by an uppercase letter.
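To see what each half of the pattern contributes, here is a small sketch that applies the two alternatives separately and then together. The variable names are only for illustration. Because the two lookbehinds differ only in letter case, the combined pattern matches the same positions as (?<=[A-Za-z])(?=[A-Z]); that shorter form is a simplification and not part of the answer above.

```python
import re

s = "StopWhateverYouAreDoingInterestingIDontCare"

lower_then_upper = r'(?<=[a-z])(?=[A-Z])'
upper_then_upper = r'(?<=[A-Z])(?=[A-Z])'
combined = lower_then_upper + '|' + upper_then_upper

# Only boundaries after a lowercase letter: "I" stays glued to "Dont".
after_lower = re.sub(lower_then_upper, ' ', s)
# Only boundaries between two uppercase letters: just the gap after "I".
after_upper = re.sub(upper_then_upper, ' ', s)
# Both alternatives together: every word boundary gets a space.
spaced = re.sub(combined, ' ', s)

print(after_lower)
print(after_upper)
print(spaced)
```

The first alternative alone misses the boundary between "I" and "Dont", which is why the second alternative is needed.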

UPDATE: This only inserts the spaces. It doesn't lowercase the words, which is needed for every word except "I" and the first word.

UPDATE2: The fix to this is:

import re

s = "StopWhateverYouAreDoingInterestingIDontCare"

first_word, *rest = re.split(r'(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z])', s)

rest = [word.lower() if word != 'I' else word for word in rest]

print(first_word, ' '.join(rest))

Prints:

Stop whatever you are doing interesting I dont care
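The same idea can be wrapped in a function so that it is reusable. This is only a sketch: the name split_camel_sentence is hypothetical, and joining all the words in a single list (rather than printing first_word and the rest separately) is a small change so that a one-word input does not end with a trailing space. Note that re.split only splits on zero-width matches in Python 3.7 and later.

```python
import re

# Same zero-width pattern as in the answer above.
BOUNDARY = re.compile(r'(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z])')

def split_camel_sentence(s):
    """Hypothetical helper: 'StopWhateverIDont' -> 'Stop whatever I dont'."""
    first_word, *rest = BOUNDARY.split(s)
    # Lowercase every later word except the standalone pronoun "I".
    rest = [word if word == 'I' else word.lower() for word in rest]
    return ' '.join([first_word, *rest])

print(split_camel_sentence("StopWhateverYouAreDoingInterestingIDontCare"))
print(repr(split_camel_sentence("Stop")))
```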

Update 3: I looked at why your code failed to correctly form the sentence (which I should have done in the first place instead of posting my own solution :-)).

Here is the corrected code with some remarks about the changes.

string_input = "StopWhateverYouAreDoingInterestingIDontCare"

def organize_string():
   start_sentence = string_input[0] 
   #index_of_i = string_input.index("I")
   for i, char in enumerate(string_input[1:], start=1):
      if char == "I" and string_input[i + 1].isupper(): 
           start_sentence += ' ' + char          
      elif char.isupper():      
           start_sentence += ' ' + char.lower()
      else: 
           start_sentence += char
      
   return start_sentence

print(organize_string())

1. I commented out the line index_of_i = string_input.index("I") because it doesn't do what you need: it finds the index of the first capital I in the string (the I in Interesting), not the standalone I in IDont further along in string_input.

2. for i, char in enumerate(string_input[1:], start=1): enumerate yields the index of each letter, starting at 1 because string_input[1:] starts at index 1, so the two stay in sync. i is the index of the letter in string_input.

3. I renamed the loop variable from i to char to make it clearer that it holds the character. Other than these changes, the code stands as you wrote it.

Now the program gives the correct output.
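One edge case the corrected loop does not cover: if the input ends with a capital I, string_input[i + 1] reads past the end of the string and raises IndexError. A possible guard is sketched below. Taking the string as a parameter and treating a trailing "I" as a standalone word are assumptions here; the question does not specify either.

```python
def organize_string(string_input):
    if not string_input:
        return string_input
    sentence = string_input[0]
    last = len(string_input) - 1
    for i, char in enumerate(string_input[1:], start=1):
        # A standalone "I" is followed by an uppercase letter or ends the string.
        # (Counting a trailing "I" as a word is an assumption.)
        if char == "I" and (i == last or string_input[i + 1].isupper()):
            sentence += ' ' + char
        elif char.isupper():
            sentence += ' ' + char.lower()
        else:
            sentence += char
    return sentence

print(organize_string("StopWhateverYouAreDoingInterestingIDontCare"))
print(organize_string("SoAmI"))
```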

I really appreciate the solution and explanation, I am learning a lot of new things every day! Thanks! (Update: it doesn't lowercase the first letters of the words) — harsh_chane, Oct 17 '20 at 19:34

score 1 · Answer 2 · answered Oct 17 '20 at 19:22

string_input = "StopWhateverYouAreDoingInterestingIDontCare"
counter = 1
def organize_string():
    global counter
    start_sentence = string_input[0] 
    for i in string_input[1:]: 
        if i == "I" and string_input[counter+1].isupper():
            start_sentence += ' ' + i
        elif i.isupper():      
            start_sentence += ' ' + i.lower()
        else: 
            start_sentence += i
        counter += 1
    print(start_sentence)

organize_string()

I made some changes to your program. I used a counter to check the index position. I get your expected output:

Stop whatever you are doing interesting I dont care
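Because counter is a global that is never reset, a second call to organize_string() would keep counting from where the first call stopped and index past the end of the string. A minimal sketch of the same counter idea with a local variable and a string parameter (both changes are illustrative and not part of the answer above):

```python
def organize_string(string_input):
    # Assumes a non-empty input string.
    counter = 1  # index of the character currently being examined
    sentence = string_input[0]
    for ch in string_input[1:]:
        # Check the bounds first so the lookup cannot run off the end.
        next_is_upper = (counter + 1 < len(string_input)
                         and string_input[counter + 1].isupper())
        if ch == "I" and next_is_upper:
            sentence += ' ' + ch
        elif ch.isupper():
            sentence += ' ' + ch.lower()
        else:
            sentence += ch
        counter += 1
    return sentence

# The counter restarts on every call, so repeated calls give the same result.
print(organize_string("StopWhateverYouAreDoingInterestingIDontCare"))
print(organize_string("StopWhateverYouAreDoingInterestingIDontCare"))
```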

Oh, I never thought of taking this approach. Really appreciate the help, thanks!! — harsh_chane, Oct 17 '20 at 19:25

mpSchrader · Answer 3 · 2020-10-18T09:36:44.877

Here is one solution that uses the re module to split the string at the uppercase characters (see the re.split documentation).

import re
text = "StopWhateverYouAreDoingInterestingIDontCare"

# Split text by upper character
text_splitted = re.split('([A-Z])', text)
print(text_splitted)

As the output below shows, re.split keeps both the separator (the uppercase character, because the pattern is wrapped in a capturing group) and the text before and after it. This means that each uppercase character is always followed by the rest of its word. The empty first string comes from the first uppercase character, which is the first separator: there is nothing before it.

# Output of print
[
    '',
    'S', 'top',
    'W', 'hatever',
    'Y', 'ou', 
    'A', 're', 
    'D', 'oing', 
    'I', 'nteresting', 
    'I', '', 
    'D', 'ont', 
    'C', 'are'
]

As we have seen, each uppercase character is always followed by the rest of its word. Combining each pair gives us the split words, and it also makes it easy to handle your special case with the I.

# Drop the first element, which is always empty when the text starts with an uppercase letter
text_splitted = text_splitted[1:]

result = []
for i in range(0, len(text_splitted), 2):
    word = text_splitted[i]+text_splitted[i+1]

    if (i > 0) and (word != 'I') :
        word = word.lower()

    result.append(word)

result = ' '.join(result)
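The pairing step can also be written with slices and zip instead of an index loop. This is just an alternative sketch of the same idea; the names parts and words are illustrative, and it assumes, like the code above, that the text starts with an uppercase letter.

```python
import re

text = "StopWhateverYouAreDoingInterestingIDontCare"

# parts[0] is the empty string before the first capital; after that the list
# alternates: capital letter, rest of the word, capital letter, ...
parts = re.split(r'([A-Z])', text)

# Glue each capital letter (odd positions) to the text that follows it.
words = [head + tail for head, tail in zip(parts[1::2], parts[2::2])]

# Keep the first word and the standalone "I"; lowercase everything else.
result = ' '.join(
    word if i == 0 or word == 'I' else word.lower()
    for i, word in enumerate(words)
)
print(result)
```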

score 1 · Answer 4 · answered Oct 17 '20 at 20:23

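s = 'StopWhateverYouAreDoingInterestingIDontCare'

ss = ' '

# Put a space before every uppercase letter, then split on the spaces
res = ''.join(ss + x if x.isupper() else x for x in s).strip(ss).split(ss)

# Lowercase every word after the first, except the standalone "I"
words = [res[0]] + [w if w == 'I' else w.lower() for w in res[1:]]

print(ss.join(words))

output

Stop whatever you are doing interesting I dont care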

score 1 · Answer 5 · answered Oct 17 '20 at 21:23

I hope this works:

string_input = "StopWhateverYouAreDoingInterestingIDontCare"
def organize_string():
    i = 0
    while i < len(string_input):
        char = string_input[i]
        next_is_upper = i + 1 < len(string_input) and string_input[i+1].isupper()
        if i == 0:
            # First character: print it as-is, with no leading space
            print(char, end='')
        elif char.isupper() and next_is_upper:
            # Uppercase letter followed by another one: a standalone "I"
            print(' ' + char, end='')
        elif char.isupper():
            # Start of a new word: add a space and lowercase the letter
            print(' ' + char.lower(), end='')
        else:
            print(char, end='')
        i = i + 1
    print()
organize_string()

Prune · score 0 · Answer 6 · answered Oct 17 '20 at 19:14

Split the sentence into individual words. If you find the word "I" in this list, leave it alone. Leave the first word alone. Cast all of the other words to lowercase.

Do you mean split the sentence after it returns the output? – harsh_chane Oct 17 '20 at 19:26
Yes, after you've inserted the spaces. Better yet, do the splitting *instead* of inserting the spaces. Can you do the coding from there? – Prune Oct 17 '20 at 20:43
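A minimal sketch of that approach, splitting directly into words rather than inserting spaces first. The use of re.findall with the pattern [A-Z][a-z]* is one possible choice for the splitting step (the answer does not prescribe one), and it assumes the input contains only ASCII letters.

```python
import re

string_input = "StopWhateverYouAreDoingInterestingIDontCare"

# Each word is one uppercase letter followed by zero or more lowercase letters.
words = re.findall(r'[A-Z][a-z]*', string_input)

# Leave the first word and any "I" alone; lowercase the rest.
words = [w if i == 0 or w == 'I' else w.lower() for i, w in enumerate(words)]

sentence = ' '.join(words)
print(sentence)
```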

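flabons · score 0 · Answer 7 · answered Oct 17 '20 at 19:16

You have to use some string manipulation like this:

string_input = "StopWhateverYouAreDoingInterestingIDontCare"

output = string_input[0]
for l in string_input[1:]:
    if l.islower():
        output += l
    else:
        # Uppercase letter: start a new word and lowercase it
        # (note that this also lowercases a standalone "I")
        output += ' ' + l.lower()
print(output)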
