0

Below fails because strings are immutable. I am trying to remove duplicates in place. Is there a better way to do this?

def remove_duplicates_2(word):
  write_index = 0

  for i in range(len(word)):
    found = False

    for j in range(0, write_index):
      if word[i] == word[j]:
        found = True
        break

    if found == False:
      word[write_index] = word[i]
      write_index += 1

    return word
Zacharia Mwangi
  • 53
  • 3
  • 10

3 Answers3

1

No, not in Python. Every time you do something that seems like you "modify" a string, you're actually creating a new, immutable string.

Depending on the reasons for or intent of the "in place" requirement, you may be able to convert the string to another type (list of characters, StringIO, etc.), modify that, then convert it back to a string.

Honestly, this sounds like an interview question. Python wouldn't be an appropriate language to use for the answer due to this constraint. The interviewer would likely be expecting you to use C or C++.

Ouroborus
  • 16,237
  • 4
  • 39
  • 62
0

There is nothing, you can do to remove duplicates from a string in place. It is because strings are immutable in Python. To do so, you'll have to do something as follows:

  1. Create a list of alphabets from the string.
  2. Create a stack.
  3. If the stack is empty, push the current element of the list into the stack.
  4. If the topmost element of the stack is the same as the current element of the list, then don't push it, just move to the next element in the list.
  5. If it is not the same as the topmost element, then push it.
  6. After having done all the push operations (once iterated over all the list elements), reverse the contents of the stack and join its contents to form a new string.
  7. Now the resulting string doesn't contain any duplicates.
taurus05
  • 2,491
  • 15
  • 28
0

If you want to remove all but the first occurrence of each letter in a word,

e.g millennium -> milenu

you can do something like the below:

def remove_duplicate_letters(word):
    result = ""
    for c in word:
        if c not in result:
            result += c
    return result

As you correctly note this is not 'in place'. The string object that variable 'word' points to is unchanged. It can not be changed because string objects are immutable in Python. Instead a new string object is created. In fact many of them. One for the empty string. Then again when we add the first letter, 'm'. And again when we add 'i' and so on.

Be careful. Again strings are immutable. While it looks like we change 'result', what is going on in memory is we create a new string object each loop and change where the variable 'result' points.

blueenvelope
  • 699
  • 7
  • 15