Determine if a string has all unique characters?

Question

Can anybody tell me how to implement a program to check whether a string contains all unique chars?

score 40 · Accepted Answer · answered Feb 14 '11 at 00:24

If you are talking about an ASCII string:

1. Create an int array [0-255], one element for each character value, initialised to zero.
2. Loop through each character in the string and increment the respective array position for that character.
3. If the array position already contains a 1, then that character has already been encountered. Result => not unique.
4. If you reach the end of the string with no occurrence of (3), result => the string is unique.
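The steps above could be sketched in C roughly as follows. This is only an illustration under the assumption of 8-bit characters; the function name `all_unique_counts` is made up for this example and is not from the answer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if no byte value occurs more than
 * once in the NUL-terminated string s. Assumes 8-bit characters. */
bool all_unique_counts(const char *s)
{
    int seen[256] = {0};    /* one counter per possible byte value, all zero */

    for (size_t i = 0; s[i] != '\0'; i++) {
        /* Cast so that the array index can never be negative. */
        unsigned char c = (unsigned char)s[i];

        if (seen[c] == 1)   /* this character was already encountered */
            return false;
        seen[c]++;          /* record the first occurrence */
    }
    return true;            /* reached the end without finding a repeat */
}
```

Calling `all_unique_counts("abcd")` should give true and `all_unique_counts("abca")` false. The check is done before the increment so that a counter never goes above 1.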

answered Feb 14 '11 at 00:24

mrcrowl

2

+1 I like this one the best! And in fact I've actually implemented this way, years ago. My memory must be going! – Mitch Wheat Feb 14 '11 at 00:25
3

If you are talking about an *ASCII* string, array with 128 elements will do just fine. (Actually, in the case of C strings 127 elements would be enough, but less convenient.) – eq- Feb 14 '11 at 00:26
And if you are not talking about an ASCII string but a string of UTF-8 Unicode *runes*, then a hash-table approach along the same idea is probably going to be easy and fast to implement. Note also Ira Baxter's solution with a Boolean array, which has better space usage - for in-between sized data it may be valuable. – I GIVE CRAP ANSWERS Feb 14 '11 at 01:12
6

If you're talking about an ASCII string, welcome to the future! We may not have personal jetpacks yet, but we do have unicode! – Nick Johnson Feb 14 '11 at 02:14
2

This was actually given to me as an interview question. This was my answer, but my interviewer suggested that, instead of an int array, it was possible to use an array of bits (or, to make it easier, chars or bytes) to save memory. It's O(1) to search the array if your indices are mapped from your char values! – John Leehey Apr 28 '11 at 22:36
1

What if we don't get a new data structure? – Afshin Moazami Nov 13 '12 at 16:45
2

A good optimisation would be to check that the length of the string is less than or equal to 256 (assuming extended ASCII is used); if the string is longer, return false immediately. The whole idea here is that you cannot have a string with 300 characters when only 256 unique characters are available. The time complexity for this code is O(n) and the space complexity is O(1) – Edwin Jul 10 '13 at 21:58
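The length check from the comment above might look like this in C. It is a sketch under the assumption of 8-bit characters; `ALPHABET_SIZE` and `all_unique_with_length_check` are names invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ALPHABET_SIZE 256   /* assumption: 8-bit characters */

/* Hypothetical helper: true if every character of s occurs only once. */
bool all_unique_with_length_check(const char *s)
{
    bool seen[ALPHABET_SIZE] = {false};

    /* Pigeonhole principle: a string longer than the alphabet
     * must contain at least one repeated character. */
    if (strlen(s) > ALPHABET_SIZE)
        return false;

    for (size_t i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];  /* non-negative index */
        if (seen[c])
            return false;
        seen[c] = true;
    }
    return true;
}
```

One caveat for C specifically: `strlen` itself walks the whole string, so here the check mostly documents the bound. The flag loop already stops at the first repeat, which it must find within the first 257 characters.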

Matteo Italia · Answer 2 · 2011-02-14T00:35:32.790

7

Sort the characters in the string using your algorithm of choice (e.g. the builtin qsort function), then scan the string checking for consecutive repeating letters; if you get to the end without finding any, the string contains all unique characters.
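A minimal sketch of the sort-then-scan idea using the standard `qsort`. The names `compare_chars` and `all_unique_by_sorting` are hypothetical, and this version sorts the caller's buffer in place, so it assumes a writable string (not a string literal).

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: orders bytes by unsigned value. */
static int compare_chars(const void *a, const void *b)
{
    unsigned char x = *(const unsigned char *)a;
    unsigned char y = *(const unsigned char *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: sorts s in place, then looks for equal neighbours.
 * Returns true if all characters are distinct. */
bool all_unique_by_sorting(char *s)
{
    size_t n = strlen(s);

    qsort(s, n, sizeof s[0], compare_chars);

    /* After sorting, any duplicates sit next to each other. */
    for (size_t i = 1; i < n; i++) {
        if (s[i] == s[i - 1])
            return false;
    }
    return true;
}
```

For example, with `char word[] = "hello";` the call `all_unique_by_sorting(word)` should return false, and the buffer is left sorted afterwards. Copying the string first would preserve the original at the cost of an allocation.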

An alternative may be using some structure that has one bucket for each character the string may contain, all initialized to zero; you scan the string, incrementing the value of the bucket corresponding to the current character. If you get to increment a bucket that already has a 1 inside it you are sure that your string contains duplicates.

This can work fine with chars and an array (of size UCHAR_MAX+1), but it quickly gets out of hand when you start to deal with wide characters. In such case you would need a hashtable or some other "serious" container.

The best algorithm depends on the length of the strings to examine, the size of each character, the speed of the sorting algorithm and the cost of allocating/using the structure to hold the character frequencies.

edited Feb 14 '11 at 00:35

answered Feb 14 '11 at 00:22

Matteo Italia

1

This `qsort` idea is neat because it uses an already built-in function. A variation on the idea is to store elements in a tree (a *set*) and bail when a new character is already a set member. That way you avoid sorting the whole array when the string is something like 'aa..', where the duplicate shows up immediately. – I GIVE CRAP ANSWERS Feb 14 '11 at 01:14
why I always get true for using this sort solution? `public static boolean uniqueCharacters(String s){ Arrays.sort(s.toCharArray()); for(int i = 1; i < s.length(); i++){ if(s.charAt(i) == s.charAt(i-1)){ return false; } } return true; }` – Hengameh Aug 24 '15 at 03:00
Even this one is not working: `public static boolean uniqueCharacters(String s){ Arrays.sort(s.toCharArray()); String sorted = new String(s); for(int i = 1; i < sorted.length(); i++){ if(sorted.charAt(i) == sorted.charAt(i-1)){ return false; } } return true; }` could you please have a look, and see what is wrong? Thanks – Hengameh Aug 24 '15 at 03:06
@Hengameh : you are not keeping the char[] from (non-standard) `s.toCharArray()` - it gets lost after sorting. Comparing consecutive chars from a copy of the original `String` is as useless as is operating on the latter. – greybeard Oct 24 '15 at 04:01

score 6 · Answer 3 · answered Feb 24 '11 at 09:28

Make a set of the letters, and count the values.

For example, set("adoihgoiaheg") gives set(['a', 'e', 'd', 'g', 'i', 'h', 'o']), which has fewer elements than the string has characters:

def hasUniqueLetters(str):
    return (len(set(str)) == len(str))

>>> hasUniqueLetters("adoihgoiaheg")
False

answered Feb 24 '11 at 09:28

Phil H

score 6 · Answer 4 · edited Apr 28 '11 at 22:31

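#include <cstring>   // memset
#include <iostream>
#include <string>
using namespace std;

bool isUnique(string _str)
{
    bool char_set[256];          // one flag per possible 8-bit character value
    int len = _str.length();

    memset(char_set, 0, sizeof char_set);
    for (int i = 0; i < len; ++i)
    {
        // Index directly by the 8-bit value. Subtracting '0' here would give
        // negative, out-of-bounds indices for every character below 48.
        int val = static_cast<unsigned char>(_str[i]);
        if (char_set[val])
        {
            return false;        // already seen: duplicate
        }
        char_set[val] = true;
    }

    return true;
}

int main()
{
    cout << "Value: " << isUnique("abcd") << endl;
    return 0;
}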

edited Apr 28 '11 at 22:31

hippietrail

answered Apr 28 '11 at 04:36

user673558

@Orbling: Yes, shouldn't it be - 'a' ? – krishnang May 22 '14 at 14:34
@krishnang: The point of using a flag array with a size of 256, is that it represents every possible value of an 8-bit character string. There is therefore no need at all to subtract anything from the value of each character, as all that would achieve here is making negative index values for any character below the subtracted ordinal value, which would be an out-of-bounds index for the array and crash the program. – Orbling Jun 02 '14 at 23:12
@krishnang: In the case of the code above using `'0'`, that would result in all characters below 48 crashing the code (like space and a deal of punctuation). Using `'a'` instead would move that up to 97, making all uppercase characters and digits cause an error. – Orbling Jun 02 '14 at 23:12
1

The question is tagged C. This answer is C++. If you're going to insist on a C++ answer, at least use the functions from ``. – Dec 25 '14 at 21:48

score 3 · Answer 5 · answered Feb 14 '11 at 00:25

Use a 256-entry array. Fill it with 0. Now traverse the string setting the corresponding entry in the array to 1 if it's 0. Otherwise, there are repeated chars in the string.

answered Feb 14 '11 at 00:25

lhf

score 2 · Answer 6 · answered Feb 14 '11 at 00:26

Set an array of booleans of size equal to the character set to false. (Constant time.) Scan the string; for each character, inspect the array at the character's slot; if true, the string has duplicate characters. If false, set that slot to true and continue. If you get to the end without encountering a duplicate, there aren't any and the string only contains unique characters. Running time: O(n) where n is the length of the string, with a pretty small constant.

score 2 · Answer 7 · answered Mar 24 '11 at 00:52

Similarly (and without arrays), use a hash table!

// pseudocode:

for each char c in the string:
    look c up in the hash table
    if the table already contains c, return FALSE // since it's not unique
    else store c in the table
return TRUE // reached the end without finding a repeat

Run time is O(n) on average, and memory use can be lower too, since you don't need a 256-entry array (one slot per 8-bit character).
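For characters wider than 8 bits, where a direct-indexed array gets large, the hash-table idea could be sketched in C like this. Everything here is an assumption made for illustration: the input is taken to be an array of already-decoded 32-bit code points, the table is a simple open-addressing set with linear probing, and the name `all_unique_hashed` is invented.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if all n code points are distinct,
 * 0 if some value repeats, -1 if memory allocation fails. */
int all_unique_hashed(const uint32_t *cp, size_t n)
{
    /* Power-of-two capacity of at least 2n keeps the table at most
     * half full, so probing always reaches an empty slot. */
    size_t cap = 16;
    while (cap < 2 * n)
        cap *= 2;

    uint32_t *slots = malloc(cap * sizeof *slots);
    unsigned char *used = calloc(cap, 1);   /* 1 = slot occupied */
    if (slots == NULL || used == NULL) {
        free(slots);
        free(used);
        return -1;
    }

    int result = 1;
    for (size_t i = 0; i < n && result == 1; i++) {
        /* Simple multiplicative hash, reduced to a table index. */
        uint32_t hash = (uint32_t)(cp[i] * 2654435761u);
        hash ^= hash >> 16;
        size_t h = (size_t)hash & (cap - 1);

        /* Linear probing: walk until we find the value or a free slot. */
        while (used[h]) {
            if (slots[h] == cp[i]) {
                result = 0;     /* already in the set: duplicate */
                break;
            }
            h = (h + 1) & (cap - 1);
        }
        if (result == 1) {
            slots[h] = cp[i];   /* first occurrence: remember it */
            used[h] = 1;
        }
    }

    free(slots);
    free(used);
    return result;
}
```

The table here is sized from the input length rather than from the size of the character set, which is the point of using hashing for large alphabets. For plain 8-bit strings a fixed 256-entry array is likely simpler and faster.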

answered Mar 24 '11 at 00:52

Matthew

Depends on hashtable implementation. Hashtable might initialize to a size of 256. – John Kurlak Sep 19 '12 at 18:08

score 1 · Answer 8 · answered May 14 '13 at 06:50

#include <stdio.h>

#define ARR_SIZE 32

unsigned char charFlag[ARR_SIZE];

void initFlag() {
    int i = 0;

    for (i = 0; i < ARR_SIZE; i++)
        charFlag[i] = 0;

}

int getFlag(int position) {
    int val = 0;
    int flagMask = 1;

    int byteIndex = position / 8;
    int locPos = position % 8;

    flagMask = flagMask << locPos;
//  flagMask = ~flagMask;

    val = charFlag[byteIndex] & flagMask;
    val = !(!val);
//  printf("\nhex: %x\n", val);
    return val;

}

void setFlag(int position) {
    int flagMask = 1;
    int byteIndex = position / 8;
    int locPos = position % 8;

    flagMask = flagMask << locPos;
    charFlag[byteIndex] = charFlag[byteIndex] | flagMask;

}
int isUniq(char *str) {
    int is_uniq = 1;

    do {
        char *lStr = str;
        int strLen = 0;
        int i;

        if (str == 0)
            break;

        while (*lStr != 0) {
            lStr++;
            strLen++;
        }

        initFlag();
        lStr = str;
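        for (i = 0; i < strLen; i++) {
            /* Cast to unsigned char so that characters with the high bit
               set do not become negative bit positions. */
            unsigned char c = (unsigned char) lStr[i];

            if (getFlag(c))
                break;

            setFlag(c);
        }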

        if (i != strLen)
            is_uniq = 0;

    } while (0);

    return is_uniq;
}

int main() {

    char *p = "abcdefe";
    printf("Uniq: %d\n", isUniq(p));
    return 0;
}

score 1 · Answer 9 · answered Sep 04 '13 at 00:29

Use a HashTable, add the key for each character along with the count of occurrences as the value. Loop through the HashTable keys to see if you encountered a count > 1. If so, output false.

answered Sep 04 '13 at 00:29

user2744865

a hash table is too expensive for this purpose. A simple table would be fine, unless you want to work with arbitrary Unicode strings – phuclv Apr 28 '19 at 00:28

score 1 · Answer 10 · answered Dec 29 '16 at 09:01

A simple solution is to use two nested loops. No additional data structure is needed to keep track of the characters.

bool has_unique_char(char *str,int n)
{
      if(n==0)
           return true;

      for(int i=1;i<n;i++){
            for(int j=0;j<i;j++){
                    if(str[i] == str[j])
                          return false;
            }      
      }
      return true;
}

answered Dec 29 '16 at 09:01

ANK

but this is O(n²) instead of O(n) like other answers – phuclv Apr 28 '19 at 00:27

score 0 · Answer 11 · edited Mar 16 '14 at 10:56

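#include <stdbool.h>

bool isUnique(char st[], int size)
{
    bool char_set[256] = {false};   // one flag per possible 8-bit character value
    for (int i = 0; i < size; i++)
    {
        unsigned char c = (unsigned char)st[i];   // avoid negative indices for signed char
        if (char_set[c])
            return false;           // character seen before
        char_set[c] = true;
    }
    return true;
}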

edited Mar 16 '14 at 10:56

László Papp

answered Mar 16 '14 at 10:37

Sumit Gaur

score 0 · Answer 12 · answered Apr 29 '14 at 04:18

My original answer also used the similar array technique of counting character occurrences.

But I was doing it in C, and I think it can be simpler using some pointer manipulation, getting rid of the array entirely:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main (int argc, char *argv[])
{
    char *string;
    if (argc<2)
    {
        printf ("please specify a string parameter.\n");
        exit (0);       
    }

    string = argv[1];
    int is_unique = 1;
    char *to_check;
    /* Stop scanning as soon as a duplicate has been found. */
    while (*string && is_unique)
    {
        to_check = string+1;
        while (*to_check)
        {
            //printf ("s = %c, c = %c\n", *string, *to_check);
            if (*to_check == *string)
            {
                is_unique = 0;
                break;
            }
            to_check++;
        }
        string++;
    }

    if (is_unique)
        printf ("string is unique\n");
    else
        printf ("string is NOT unique\n");
}

So this is O(n²), which is worse than most answers here, which are O(n), and it's less readable than other O(n²) answers – phuclv Apr 28 '19 at 00:30

score 0 · Answer 13 · answered Dec 25 '14 at 21:10

Without using additional memory:

#define UNIQUE_ARRAY 1
int isUniqueArray(char* string){
    if(NULL == string ) return ! UNIQUE_ARRAY;
    char* current = string;
    while(*current){
        char* next   = current+1;
        while(*next){
            if(*next == *current){
                return ! UNIQUE_ARRAY;
            }
            next++;
        }
        current++;
    }
    return UNIQUE_ARRAY;
}

answered Dec 25 '14 at 21:10

Madhu S. Kapoor

This seems to be a partial answer only: the question being `Can anybody tell […] ?`, you didn't. – greybeard Dec 25 '14 at 22:20
this is using pointers exactly the same as ledmirage's answer, and both are O(n²) – phuclv Apr 28 '19 at 00:31

score 0 · Answer 14 · answered Oct 23 '15 at 20:20

I believe there is a much simpler way:

int check_string_unique(char *str)
{
   int i = 0;
   int a = 0;
   while (str[i])
   {
      a = i + 1; // peek at the next character
      while (str[a])
      {
          if (str[i] == str[a]) // you found a match
             return (0); // false
          a++; // characters before position i were already compared, so only the rest of the string needs checking
      }
      i++; // check each character
   }
   return (1); // true!
}

this is O(n²) which is a lot worse than other solutions – phuclv Mar 23 '19 at 16:47

score 0 · Answer 15 · answered Nov 13 '16 at 15:59

I hope this can help you

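#include <iostream>
#include <string>
using namespace std;

int main() {
    string s;
    cin >> s;
    int a[256] = {0};   // occurrence count per 8-bit character value
    size_t sum = 0;     // number of distinct characters seen
    for (size_t i = 0; i < s.length(); i++) {
        unsigned char c = static_cast<unsigned char>(s[i]);  // avoid negative indices
        if (a[c] == 0) ++sum;
        a[c] += 1;
    }
    // All characters are unique exactly when each one was new when first counted.
    cout << (sum == s.length() ? "yes" : "no");
    return 0;
}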

score 0 · Answer 16 · edited Jun 20 '20 at 09:12

This is a compact solution for strings made up only of the lowercase letters a to z. It needs just one integer variable, used as a bit vector, and can tell whether the characters are unique regardless of string length.

complexity

best case O(1)

worst case O(n)

public static boolean isUniqueChars(String str) {
    int checker = 0;  // bit k is set once the letter 'a' + k has been seen
    for (int i = 0; i < str.length(); ++i) {
        int val = str.charAt(i) - 'a';  // assumes str contains only 'a'..'z'
        if ((checker & (1 << val)) != 0)
            return false;
        checker |= (1 << val);
    }
    return true;
}
