12

My question is related to this earlier question

Find the first un-repeated character in a string.

In one of my interviews I was asked to write a function to determine the first unique character in a string in time O(n) using as extra space only a boolean array of length n. That is, find the first non repeating letter in a string using only O(n) complexity and a bool array of length n. Can some suggest how to solve it with bool array?

Community
  • 1
  • 1
learner
  • 606
  • 2
  • 8
  • 17
  • 3
    Do we know anything about the size of the alphabet the strings range over? – phs Jan 19 '12 at 23:00
  • 1
    Are we allowed to mutate the input string? – phs Jan 19 '12 at 23:33
  • 1
    I have a gut feeling this is not possible using the provided constraints. If solved it might be the best [minimal perfect hashing algorithm](http://en.wikipedia.org/wiki/Perfect_hash_function#Minimal_perfect_hash_function) known to man. – calebds Jan 20 '12 at 01:02
  • 1
    I agree with paislee, I don't think this is possible, unless you're allowed to use a boolean array that is at least twice the size of the alphabet. – Mansoor Siddiqui Jan 20 '12 at 01:25
  • 3
    Either the interviewer or interviewee is confused. O(n) is asymptotic notation; it refers to the limit as n becomes arbitrarily large. In that limit, n is arbitrarily larger than the size of the character set, in which case the question is trivial. – Nemo Jan 20 '12 at 01:51
  • @Nemo, O(n) may apply only to the asymptote, but the restriction to a boolean array of size n must apply for all n, which makes this quite difficult for small n. What's your trivial solution? – Aaron McDaid Jan 20 '12 at 02:01
  • @Aaron: Ah, OK. In that case I agree it's (probably) impossible. – Nemo Jan 20 '12 at 02:31
  • If we didn't care about time, we could sort the data in place and find non-repeated characters. But I don't know how to find which one was first. – Aaron McDaid Jan 20 '12 at 02:41
  • you can assume alphabets to be from a to z – learner Jan 20 '12 at 09:14
  • yes we can mutate the input string – learner Jan 20 '12 at 09:16

6 Answers6

10

Well, if the size of the character set is constant... Say, for instance, 0-255...

Scan the string looking for character 0.

Then scan the string looking for character 1.

Then scan the string looking for character 2.

...

Finally, scan the string looking for character 255.

This takes 256*n operations which is O(n).

During each scan, if the character is unique, mark its position in the bitmap. At the end return the character at the first marked position. (Or just record the first unique character and its location using k + log(n) bits. Use a hard-coded lookup table or whatever for very small n; otherwise, n bits is generous.)

Although somehow I suspect this is not what the interviewer had in mind.

Nemo
  • 70,042
  • 10
  • 116
  • 153
  • Since the letter range was a to z only, I think this answer might would have worked.. thnks – learner Jan 20 '12 at 09:25
  • 2
    Hehe, another `interview-question` about O(1) space and O(n) time complexity for finding repeated/non-repeated/even-frequent item in an array. Time for a wiki-post? – J0HN Jan 20 '12 at 10:12
  • Why do you suspect the interviewer didn't have this in mind? Just curious – 0xc0de Apr 13 '12 at 18:21
  • 3
    @0xc0de: I guess I was assuming there is a more interesting/clever answer that I was missing. If this is the best answer, it seems more like a trick question (and abuse of the big-O concept) than a meaningful interview puzzle, IMO. – Nemo Apr 13 '12 at 21:52
  • 1
    256 x n which is O(k\*n) which could sometimes be worse than O(n\*n) or O(n^2), this is such a tricky question. – Mehdi Karamosly Feb 13 '16 at 23:18
0
    private static void FirstNonRepeatingCharacters()
    {
        string s = "abbazccdde";
        var result = from item in s
                     group item by item into groupedItems
                     where groupedItems.Count() == 1
                     select groupedItems.Key;
        Console.WriteLine(result.First());                    
    }

C# implementation

  • 3
    That's an algorithmic question, you fail if you use any language-specific feature. – Maël Nison Oct 14 '12 at 05:03
  • This implementation works, yet it fails because it doens't fit in the requirements: "in time O(n) using as extra space only a boolean array of length n", it doesn't fit because you use IGrouping and IGrouping is not a "boolean array of length n". – Theraot Oct 14 '12 at 05:09
0

For Single Pass Solution

Maintain 2 data structures:

  1. Array / bitmap / hashtable for tracking the count of each character
  2. LinkedHashMap to track the characters which have occured only once so far.

LinkedHashMap has

  • O(1) insertion
  • O(1) deletion
  • O(1) search

    class Node
    {
      char data;
      Node next;
      Node prev;
    };
    
    class LinkedHashMap
    {
            // This will keep the insertion order intact
            Node listHead;
            Node currentTail = listHead;
            HashTable<char, Node> charExistsMap;
    
        void Add(char ch) 
        {
            if(!charExistsMap.ContainsKey(ch)) 
            {
                // Add to both hashtable and linkedlist
                Node n = new Node(ch);
                n->next = null;
                n->prev = curentTail; // Added To List
                currentTail = n;
                charExistMap.Add(ch, n);
            }
            else 
            {
                // Remove from both hashtable and linkedlist
                Node n = charExistMap.Remove(ch);
                if(n->prev != null) 
                {
                    n->prev->next = n->next
                    listHead = n->next; // update head
                }
                if(n->next != null)
                    n->next->prev = n->prev;
             }
        }
    
        char GetFirstNonRepeatingChar()
        {
            return listHead->data;
        }
    

    }

After the original string is traversed, the head of the LinkedHashMap will contain the first character which is not repeated.

Vishal Naidu
  • 341
  • 2
  • 6
0
public class FirstUniqueChar {

  public static void main(String[] args) {

    String test = "ABxacd";

    test = test.toUpperCase();
    for (int i = 0; i < test.length(); i++) {
        int firstIndex = test.indexOf(test.charAt(i));
        int lastIndex = test.lastIndexOf(test.charAt(i));
        if (firstIndex == lastIndex) {
            System.out.println("First unique char of String " + test.charAt(i));
            break;
        }

    }

  }
}
falsarella
  • 12,217
  • 9
  • 69
  • 115
  • 1
    This is O(n^2), not O(n): you have O(n) iterations, and each iteration invokes O(n) operations indexOf and lastIndexOf. – Michael Mar 18 '15 at 22:55
0

Do that in two passes:

1st pass: create bool array of size 256 and for each char in the text mark the element of index int(that char). This takes O(n).

2nd pass: for each char in the text check if the corresponding array element is marked. If not then you found your answer. This also takes O(n).

Michael
  • 5,775
  • 2
  • 34
  • 53
-2

Have two boolean arrays, seenOnce and seenMany. Loop over the string populate the arrays. Loop over the string again checking if the character is in seenFirst but not in seenMany. If it is thats your first non-repeating character.

Here's some example code in python.

subject = "ttojxxlma"

seenOnce = [False for i in range(256)]
seenMany = [False for i in range(256)]

for c in subject:
    index = ord(c)
    if seenOnce[index] == False:
        seenOnce[index] = True
    else:
        seenMany[index] = True

for c in subject:
    index = ord(c)
    if seenOnce[index]==True and seenMany[index] != True:
        print(c)
        break

Ok that still uses too boolean array (or python lists =P). To use only one array you can have one array thats double the amount of characters. Instead of accessing the second array double the index and access the big one. But thats just messy.

Not sure if it can be done with less space.

Stephen
  • 4,228
  • 4
  • 29
  • 40
  • 1
    I think the question specifically asks to do this with just one boolean array of length n, not two boolean arrays of length n. – templatetypedef Jan 19 '12 at 22:25
  • 1
    Additionally, I'm not sure how you'd populate these arrays from the original string. Can you elaborate on this? – templatetypedef Jan 19 '12 at 22:33
  • 1
    Can you elaborate how to populate the arrays? this is not a linked list, so populating it may take o(n) for each addition – learner Jan 20 '12 at 09:20