How can you remove duplicate characters in a string?

Question

I have to implement a function that takes a string as input and returns it with the duplicate characters removed.

For example, if I pass string str = "DHCD" it should return "DHC", and for str2 = "KLKLHHMO" it should return "KLHMO".

Could you post what you have tried so far? – SquareCog Feb 26 '09 at 01:47

score 36 · Accepted Answer · answered Feb 26 '09 at 02:14

A LINQ approach:

public static string RemoveDuplicates(string input)
{
    return new string(input.ToCharArray().Distinct().ToArray());
}

answered Feb 26 '09 at 02:14 · Christian C. Salvadó

I don't think you need to convert to a char array here? – Pat Mar 31 '18 at 17:32

score 7 · Answer 2 · answered Feb 26 '09 at 01:47

This will do the job:

string removedupes(string s)
{
    string newString = string.Empty;
    List<char> found = new List<char>();
    foreach(char c in s)
    {
       if(found.Contains(c))
          continue;

       newString+=c.ToString();
       found.Add(c);
    }
    return newString;
}

I should note this is criminally inefficient.

I think I was delirious on first revision.

answered Feb 26 '09 at 01:47 · Quintin Robinson

Am I guessing correctly that you intentionally left the inefficiencies as an exercise to the reader, or do you want suggestions on making this work faster? – SquareCog Feb 26 '09 at 01:49
Indeed you are correct, if it is homework then the OP can filter through and create one that isn't terrible. It also serves as a baseline for understanding what is happening. I don't need suggestions on improvements, thanks though. – Quintin Robinson Feb 26 '09 at 01:56

score 6 · Answer 3 · edited Feb 26 '09 at 22:20

For arbitrary length strings of byte-sized characters (not for wide characters or other encodings), I would use a lookup table, one bit per character (32 bytes for a 256-bit table). Loop through your string, only output characters that don't have their bits turned on, then turn the bit on for that character.

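string removedupes(string s)
{
    // Assumes byte-sized characters; a char above 255 would index past the table.
    var t = new System.Text.StringBuilder();
    byte[] found = new byte[256];
    foreach (char c in s)
    {
        if (found[c] == 0)   // C# has no implicit byte-to-bool conversion
        {
            t.Append(c);
            found[c] = 1;
        }
    }
    return t.ToString();
}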

I am not good with C#, so I don't know the right way to use a bitfield instead of a byte array.

If you know that your strings are going to be very short, then other approaches would offer better memory usage and/or speed.
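
For the bit-per-character table described above, .NET's `System.Collections.BitArray` is one option. The following is only a sketch, not the answerer's code: the class name `BitTableDedup` is made up for illustration, and the table is sized for every 16-bit `char` value (65,536 bits, about 8 KB) rather than 256 entries, which is an assumption so that non-byte-sized characters cannot index out of range.

```csharp
using System.Collections;
using System.Text;

// Hypothetical helper class, named here only for illustration.
public static class BitTableDedup
{
    public static string RemoveDuplicates(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        // One bit per possible UTF-16 code unit; all bits start as false.
        var seen = new BitArray(char.MaxValue + 1);
        var sb = new StringBuilder(input.Length);

        foreach (char c in input)
        {
            if (!seen[c])        // first time this character appears
            {
                seen[c] = true;  // turn its bit on
                sb.Append(c);    // and emit it
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, `BitTableDedup.RemoveDuplicates("KLKLHHMO")` returns "KLHMO". For very short strings the fixed 8 KB table is likely overkill, as noted above.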

edited Feb 26 '09 at 22:20

answered Feb 26 '09 at 01:48 · Sparr

I think this will be significantly faster than Quintin Robinson's approach, but will use significantly more memory for short strings. – Sparr Feb 26 '09 at 01:54
But significantly less memory for medium or long strings, if a bit array is used. – Sparr Feb 26 '09 at 03:03
Your heart is in the right place, but your logic is a bit off. It should be if(found[c]){t+=c; found[c] = 1;} No else block needed. Your current code won't do the trick. – BFree Feb 26 '09 at 03:22
Cannot implicitly convert type 'byte' to 'bool' on `if (found[c])`? – GONeale Mar 02 '13 at 03:31
@GONeale I'm not sure the right way to do that conversion in C#, try if(!!found[c]) – Sparr Mar 03 '13 at 05:38

score 4 · Answer 4 · edited May 08 '19 at 07:54

void removeDuplicate()
{
    string value1 = RemoveDuplicateChars("Devarajan");
}

static string RemoveDuplicateChars(string key)
{
    string result = "";
    foreach (char value in key)
        if (result.IndexOf(value) == -1)
            result += value;
    return result;
}

edited May 08 '19 at 07:54 · Amardeep Kumar Agrawal

answered Jul 06 '12 at 06:37 · Devarajan.T

You don't need the following lines 1) string result = ""; and 2) result += value; Returning table would suffice. – rajibdotnet May 19 '14 at 21:10

score 3 · Answer 5 · answered Feb 26 '09 at 01:49

It sounds like homework to me, so I'm just going to describe the approach at a high level.

Loop over the string, examining each character
Check if you've seen the character before
- if you have, remove it from the string
- if you haven't, note that you've now seen that character
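
The steps above could be sketched in C# roughly as follows. This is one possible reading, not the answerer's code: the class name `SeenSetDedup` is made up, a `HashSet<char>` stands in for "note that you've seen that character", and instead of removing characters from the original string it builds a new one, since .NET strings are immutable.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper class, named here only for illustration.
public static class SeenSetDedup
{
    public static string RemoveDuplicates(string input)
    {
        if (input == null) return null;

        var seen = new HashSet<char>();            // characters seen so far
        var sb = new StringBuilder(input.Length);  // result, in first-seen order

        foreach (char c in input)
        {
            // HashSet<T>.Add returns false when the item is already present,
            // so the check and the "note it" step happen in one call.
            if (seen.Add(c))
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

For the inputs in the question, `SeenSetDedup.RemoveDuplicates("DHCD")` returns "DHC" and `SeenSetDedup.RemoveDuplicates("KLKLHHMO")` returns "KLHMO".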

yantaq · Answer 6 · 2021-10-11T19:03:05.983

This is in C#. Validation is left out for brevity. It is a primitive solution that removes adjacent duplicate chars from a given string.

    public static char[] RemoveDup(string s)
    {
        char[] chars = new char[s.Length];
        int unique = 0;
        chars[unique] = s[0];  // Assume: first char is unique
        for (int i = 1; i < s.Length; i++)
        {
            // Copy s[i] only when it differs from the previous char,
            // e.g. s = "ab" -> a != b. Note that this collapses adjacent
            // repeats only; "aba" is returned unchanged.
            if (s[i - 1] != s[i])
                chars[++unique] = s[i];
        }
        // Trim the unused tail so the result holds no trailing '\0' chars.
        Array.Resize(ref chars, unique + 1);
        return chars;
    }

score 1 · Answer 7 · answered Dec 21 '12 at 20:29

You may use a HashSet:

static void Main()
{
    string textWithDuplicates = "aaabbcccggg";

    Console.WriteLine(textWithDuplicates.Count());   // requires using System.Linq
    var letters = new HashSet<char>(textWithDuplicates);
    Console.WriteLine(letters.Count());

    foreach (char c in letters) Console.Write(c);
}

score 1 · Answer 8 · 2017-08-04T19:15:44.630

class Program
{
    static void Main(string[] args)
    {
        // One flag per possible char value, so any UTF-16 code unit is a valid index.
        bool[] doesExists = new bool[char.MaxValue + 1];
        String st = Console.ReadLine();
        StringBuilder sb = new StringBuilder();
        foreach (char ch in st)
        {
            if (!doesExists[ch])
            {
                sb.Append(ch);
                doesExists[ch] = true;
            }
        }
        Console.WriteLine(sb.ToString());
    }
}

score 1 · Answer 9 · answered Mar 15 '11 at 18:03

My answer is in Java.
I'm posting it here so that you can get the idea even though it is in Java. The algorithm remains the same.

public String removeDup(String s)
  {
    if(s==null) return null;
    int l = s.length();
    //if length is less than 2 return string
    if(l<2)return s;
    char arr[] = s.toCharArray();

    for(int i=0;i<l;i++)
    {
      int j =i+1; //index to check with ith index
      int t = i+1; //index of first repetitive char.

      while(j<l)
      {
        if(arr[j]==arr[i])
        {
          j++;

        }
        else
        {
          arr[t]=arr[j];
          t++;
          j++;
        }

      }
      l=t;
    }

    return new String(arr,0,l);
  }

score 1 · Answer 10 · edited Apr 06 '20 at 17:17

A revised version of the first answer: you don't need the ToCharArray() call for this to work.

public static string RemoveDuplicates(string input)
{
    return new string(input.Distinct().ToArray());
}

edited Apr 06 '20 at 17:17 · Rehan Ali Khan

answered May 09 '11 at 22:29 · V M Rakesh

score 0 · Answer 11 · edited Jun 11 '14 at 19:28

// Remove both upper-lower duplicates

public static string RemoveDuplicates(string key)
{
    string Result = string.Empty;
    foreach (char a in key)
    {
        if (Result.Contains(a.ToString().ToUpper()) || Result.Contains(a.ToString().ToLower()))
            continue;
        Result += a.ToString();
    }
    return Result;
}

edited Jun 11 '14 at 19:28

Jason Aller

3,541
28
38
38

answered Jun 11 '14 at 19:10

Akshay

1

score 0 · Answer 12 · answered Jul 24 '10 at 17:12

#include <string.h>

char *remove_duplicates(char *str)
{
    char *str1, *str2;

    if (!str)
        return str;

    str1 = str2 = str;

    while (*str2)
    {
        /* Skip *str2 if its first occurrence lies earlier in the buffer. */
        if (strchr(str, *str2) < str2)
        {
            str2++;
            continue;
        }

        *str1++ = *str2++;
    }
    *str1 = '\0';

    return str;
}

score 0 · Answer 13 · answered Oct 24 '10 at 15:54

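#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string; the caller must free() it. */
char* removeDups(const char* str)
{
    /* At most 255 distinct non-NUL char values, plus the terminator. */
    char* new_str = (char*)malloc(256 * sizeof(char));
    size_t i, j, len_of_new_str;
    new_str[0] = '\0';

    for (i = 0; i < strlen(str); i++)
    {
        len_of_new_str = strlen(new_str);
        /* Linear search for str[i] among the characters kept so far. */
        for (j = 0; j < len_of_new_str && new_str[j] != str[i]; j++)
            ;
        if (j == len_of_new_str)
        {
            new_str[len_of_new_str] = str[i];
            new_str[len_of_new_str + 1] = '\0';
        }
    }
    return new_str;
}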

Hope this helps

score 0 · Answer 14 · edited Feb 08 '17 at 14:35

var input1 = Console.ReadLine().ToLower().ToCharArray();
var input2 = input1;
var WithoutDuplicate = input1.Union(input2);

edited Feb 08 '17 at 14:35 · Fruchtzwerg

answered Feb 08 '17 at 12:42 · user2481149

Although this code might solve the problem, a good answer should always contain an explanation. – BDL Feb 08 '17 at 13:42
Agreed. I just posted another way to achieve the result. Of course, "Distinct()" from the same library addresses the same problem. – user2481149 Feb 10 '17 at 04:00

score 0 · Answer 15 · edited Jan 30 '18 at 07:35

Console.WriteLine("Enter String");

string str = Console.ReadLine();

string result = "";
result += str[0]; // first character of string

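// Only compares each character with the one just before it, so this
// collapses adjacent repeats ("aabb" -> "ab") but leaves "DHCD" unchanged.
for (int i = 1; i < str.Length; i++)
{
    if (str[i - 1] != str[i])
        result += str[i];
}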

Console.WriteLine(result);

edited Jan 30 '18 at 07:35 · Jonas W

answered Jan 30 '18 at 07:29 · Sujeet Kumar

score 0 · Answer 16 · answered Jan 31 '18 at 13:11

I like Quintin Robinson's answer, but it could be improved, for example by removing the List, which is not necessary in this case. Also, in my opinion an uppercase char ("K") and a lowercase char ("k") are the same thing, so they should be counted as one.

So here is how I would do it:

private static string RemoveDuplicates(string textEntered)
{
    string newString = string.Empty;

    foreach (var c in textEntered)
    {
        if (newString.Contains(char.ToLower(c)) || newString.Contains(char.ToUpper(c)))
        {
            continue;
        }
        newString += c.ToString();
    }
    return newString;
}

This isn't an improvement. `Contains` has to scan all characters, and string concatenation creates *new* temporary strings. This results in on the order of n^2 character comparisons and quite a *lot* of temporary strings. The top-voted answer with LINQ and Distinct is actually faster (it only scans once) and consumes less memory. – Panagiotis Kanavos Jan 31 '18 at 13:20
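
To make the point in the comment above concrete, here is a rough sketch of a single-pass version that still treats "K" and "k" as the same character. It is not taken from any answer in this thread: the class name `IgnoreCaseDedup` is invented, and folding case with `char.ToUpperInvariant` is an assumption that is probably fine for plain ASCII letters but may not match culture-specific expectations.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper class, named here only for illustration.
public static class IgnoreCaseDedup
{
    public static string RemoveDuplicates(string input)
    {
        if (input == null) return null;

        var seen = new HashSet<char>();            // upper-cased forms seen so far
        var sb = new StringBuilder(input.Length);  // avoids temporary strings

        foreach (char c in input)
        {
            // Track the case-folded form, but keep the casing of the first occurrence.
            if (seen.Add(char.ToUpperInvariant(c)))
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Each character costs one hash lookup instead of a scan of the result built so far, and only one string is allocated at the end. For example, `IgnoreCaseDedup.RemoveDuplicates("KkLl")` returns "KL".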

score 0 · Answer 17 · answered Feb 07 '18 at 10:52

Not sure how optimal it is:

public static string RemoveDuplicates(string input)
{
    var output = string.Join("", input.ToHashSet());
    return output;
}

answered Feb 07 '18 at 10:52 · duder

score 0 · Answer 18 · answered Feb 09 '18 at 17:37

Below is the code to remove duplicate chars from a string:

var input = "SaaSingeshe";
var filteredString = new StringBuilder();
foreach (char c in input)
{
    if (filteredString.ToString().IndexOf(c) == -1)
    {
        filteredString.Append(c);
    }
}
Console.WriteLine(filteredString);
Console.ReadKey();

score 0 · Answer 19 · answered Sep 05 '19 at 08:22

using System;
using System.Collections.Generic;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            string myStr = "kkllmmnnouo";
            Console.WriteLine("Initial String: " + myStr);
            // var unique = new HashSet<char>(myStr);
            HashSet<char> unique = new HashSet<char>(myStr);
            Console.Write("New String after removing duplicates: ");

            foreach (char c in unique)
                Console.Write(c);
        }
    }
}

score 0 · Answer 20 · answered Mar 25 '21 at 04:30

This works for me:

private string removeDuplicateChars(String value)
{
    return new string(value.Distinct().ToArray());
}

answered Mar 25 '21 at 04:30 · SSP

score 0 · Answer 21 · answered Feb 20 '12 at 04:46

String str = "AABBCANCDE";
String newStr = "";
for (int i = 0; i < str.length(); i++)
{
    if (!newStr.contains(str.charAt(i) + ""))
        newStr = newStr + str.charAt(i);
}
System.out.println(newStr);

answered Feb 20 '12 at 04:46 · Madan
