
How do I remove repeated characters in a string and leave just one of them?

For example:

"Bertuggggg Mete" 

to

"Bertug Mete"

I've just read data like this:

dataFrame = pd.read_excel("C:\\Users\\Bertug\\Desktop\\example.xlsx")

              Name
0  Bertuggggg Mete

The input is read from an .xlsx file. I have tried the split and strip functions, but they don't seem to work as expected.

How can I solve this problem?

Devi Prasad Khatua
Bertug
  • Have a look here: http://stackoverflow.com/questions/18799036/python-best-way-to-remove-duplicate-character-from-string – Gurupad Hegde Mar 30 '17 at 06:39
  • Check this post and see if it helps: http://stackoverflow.com/questions/9841303/removing-duplicate-characters-from-a-string – hasanzuav Mar 30 '17 at 06:40
  • I've looked at that, but it only handles two characters. My question is about more than two. – Bertug Mar 30 '17 at 06:40
  • Possible duplicate of [Python: Best Way to remove duplicate character from string](http://stackoverflow.com/questions/18799036/python-best-way-to-remove-duplicate-character-from-string) – Gurupad Hegde Mar 30 '17 at 06:41
  • @Bertug, you can use the idea from stackoverflow.com/questions/18799036/ . Also, from stackoverflow.com/questions/9841303 : if you look at the regex in the solution, you will get the answer. Hint: you need to use `\1` instead of `\1\1` – Gurupad Hegde Mar 30 '17 at 06:45

2 Answers


Check this out:

Set column_name to the name of the column you want to apply the replacement to.

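min_threshold_rep = 2  # minimum run length that gets collapsed
column_name = 'Name'
# (\w) captures one word character; \1{n,} matches n or more further copies of it.
# regex=True is needed in pandas 2.0+, where str.replace defaults to literal matching.
dataFrame[column_name] = dataFrame[column_name].str.replace(r'(\w)\1{%d,}' % (min_threshold_rep - 1), r'\1', regex=True)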

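NOTE: this replaces every run of min_threshold_rep or more consecutive identical characters with a single character. Because \w matches only letters, digits, and underscores, repeated spaces or punctuation are left alone, and with min_threshold_rep = 2 a legitimately doubled letter in a name is collapsed too.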
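
To try the pattern outside pandas, here is a minimal sketch that applies the same regex to plain strings with the standard `re` module. The helper name `collapse_repeats` and its `min_run` parameter are hypothetical, introduced only for illustration; `min_run` is assumed to be 2 or more.

```python
import re

def collapse_repeats(text, min_run=2):
    """Collapse runs of min_run or more identical word characters to one.

    Hypothetical helper for illustration; assumes min_run >= 2.
    """
    # (\w) captures one word character; \1{n,} requires at least n more copies.
    pattern = r'(\w)\1{%d,}' % (min_run - 1)
    return re.sub(pattern, r'\1', text)

print(collapse_repeats('Bertuggggg Mete'))               # Bertug Mete
print(collapse_repeats('Bertuggggg Mette', min_run=3))   # Bertug Mette
```

Raising `min_run` to 3 keeps ordinary double letters intact while still collapsing longer runs, which may be the safer choice for a column of names.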

Devi Prasad Khatua

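Python code:

if __name__ == '__main__':
    s = 'Bertuggggg Mete'
    if not s:
        # Nothing to process for an empty string.
        print('wrong!')
        exit()
    r = s[0]  # the result starts with the first character
    for c in s[1:]:
        # Append c only when it differs from the last character kept.
        if r[-1] != c:
            r += c
    print(r)  # Bertug Mete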
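
The same loop can also be written with `itertools.groupby` from the standard library, which groups adjacent equal items. This is only a sketch of an alternative, not a change to the approach above; the function name `squeeze` is hypothetical.

```python
from itertools import groupby

def squeeze(text):
    """Keep one character from each run of identical adjacent characters."""
    # groupby yields one (key, group) pair per run of equal adjacent items,
    # so joining the keys keeps exactly one character per run.
    return ''.join(key for key, _ in groupby(text))

print(squeeze('Bertuggggg Mete'))  # Bertug Mete
```

Like the loop, this collapses every repeated character, including spaces and doubled letters, and it returns an empty string for empty input instead of needing a special case.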

Java code:

public class Test {

    public static void main(String[] args) {
        String s = "Bertuggggg Mete";
        // StringBuilder is enough here; no synchronization is needed.
        StringBuilder sb = new StringBuilder();
        for (int i = 0, j = s.length(); i < j; i++) {
            // Always keep the first character, then only characters that
            // differ from the last one kept.
            if (i == 0 || s.charAt(i) != sb.charAt(sb.length() - 1)) {
                sb.append(s.charAt(i));
            }
        }
        System.out.println(sb);
    }

}
chile