Finding duplicates in an array using C

Question

I came across a problem in this link https://leetcode.com/problems/contains-duplicate/. There is an input array of integers. Find out if there any repeating integers, then return true else false.

How can I optimize this code?
Can I improve the logic further?

In my code below,there is an if condition if(hPtr && hPtr->key == *(nums+i)). I am using the array elements as keys. If so, each key can't be unique if the same element is repeated twice right? So can I modify the if condition as if(hPtr->key == *(nums + i))?

If any other mistakes, please feel free to point out.

There is already a hashtable library available in C http://troydhanson.github.io/uthash/userguide.html and wrote the below code.

struct hash {
        int key;
        int value;
        UT_hash_handle hh;
};

    struct hash *hashtable = NULL;

    void addToHash(int key, int value)
    {
      struct hash *map;
      //I am using the array elements as hash keys
      HASH_FIND_INT(hashtable, &key, map);

      if(map == NULL)
      {
        map = (struct hash*)malloc(sizeof(struct hash));
        map->key = key;
        HASH_ADD_INT(hashtable, key, map);
      }     
      map->value = value;
    }   

    struct hash *findInHash(int key)
    {
        struct hash *h;
        HASH_FIND_INT(hashtable, &key, h);
        return h;
    }

    bool containsDuplicate(int* nums, int numsSize) {
        struct hash *hPtr;
        int target = 0;
        hashtable = NULL;
        if((numsSize <= 1) || (nums == 0)) return false;

        int i, index1 = 0;   

        for(i = 0; i < numsSize; i++)
        {
            /*The below statement will look if the key is already present in 
              the hashtable*/
            hPtr = findInHash(*(nums + i) - target);
            /*If the key is found already, then it look for the value of that 
            key. If the value and the current array element is same, then a 
            duplicate exist*/
            if(hPtr && hPtr->key == *(nums+i))
               return true;
            addToHash(*(nums + i), i);
        }
        struct hash *temp;
        HASH_ITER(hh, hashtable, hPtr, temp) {free(hPtr);}
        return false;
    }

If you want to solve this in C, that's fine, but it's imperative you have some kind of unit testing in place or you'll be stumbling around without any form of code validation. Unit tests done right will very quickly verify your code is correct and will save you a ton of time guessing, wondering, and debugging. — tadman, May 31 '19 at 21:34
As for performance, make it easy to benchmark this code, even using `time`, with runs of varying length. — tadman, May 31 '19 at 21:35
sorry to ask this, but I haven't done any unit testing. Any pointers to start with? — hago, May 31 '19 at 21:35
Search for "C unit testing framework" and there's a [plethora of options](https://stackoverflow.com/questions/65820/unit-testing-c-code). — tadman, May 31 '19 at 21:41
i wrote a blog post some time ago about unit testing in C, it was with C++ but the framework is safe to work with >C89, no problem with that, i use it. have a peek https://prdeving.wordpress.com/2017/03/28/unit-test-in-c-with-google-test/ — PRDeving, May 31 '19 at 22:25

PRDeving · Accepted Answer · 2019-05-31T22:20:07.913

I think the solution is simpler than what you think:

typedef struct {
  int capacity;
  int len;
  int **keys;
  int *values;
} Map;

My struct has keys as arrays of two integers, one for the identifier and the other is the index of the values array, that's how hash maps works indeed.

void initMap(Map *map) {
  map -> capacity = 5;
  map -> len = 0;
  map -> keys = malloc(map -> capacity * sizeof(int));
  map -> values = malloc(map -> capacity * sizeof(int));
}

Then we have a function to initialize the map, easy...

void add(Map *map, int k, int v) {
  if (map -> len == map -> capacity - 1) resize(map);
  map -> values[map -> len] = v;
  map -> keys[map -> len] = malloc(sizeof(int) * 2);
  map -> keys[map -> len][0] = k;
  map -> keys[map -> len][1] = map -> len;
  map -> len++;
}

a function to put elements in the map

void resize(Map *map) {
  map -> capacity *= 2;
  map -> keys = realloc(map -> keys, map -> capacity * sizeof(int) * 2);
  map -> values = realloc(map -> keys, map -> capacity * sizeof(int));
}

and a function to resize the map if the limit is reached

By decoupling the key index and the index on the values array you can have the keys sorted, making your live much easier. the values will be in their array in the same order as they come but the index will be sorted from 0 to N. To do this i'll use a simple selsort algorithm, it's not the best but the easiest...

void sort(Map *map) {
  int i, j;
  int min, tmp;
  for (i = 0; i < map -> len - 1; i++) {
    min = i;
    for (j = i + 1; j < map -> len; j++) {
      if(map -> keys[j][0] < map -> keys[min][0] ) min = j;
    }

    tmp = map -> keys[min][0];
    map -> keys[min][0] = map -> keys[i][0];
    map -> keys[i][0] = tmp;
  }
}

with this you will have your index shorted. I'd execute it right after adding a new entry to the map, inside the add() function, it's out here now just for testing.

Once you have your index sorted you can write a binary search algorithm, eeeasy. Now you can now if a key is already in the map.

int find_recursive(Map *map, int start, int end, int key) {
   if (end >= start) {
        int mid = start + (end - start) / 2;

        if (map -> keys[mid][0] == key)
            return mid;

        if (map -> keys[mid][0] > key)
            return find_recursive(map, start, mid - 1, key);

        return find_recursive(map, mid + 1, end, key);
    }
    return -1;
}

int find(Map *map, int key) {
    return find_recursive(map, 0, map -> len, key);
}

Sooo

  Map map;
  initMap(&map);

  add(&map, 3, 12);
  add(&map, 12, 1);
  add(&map, 1, 2);

  printf("%d %d %d\n",
      map.keys[0][0],
      map.keys[1][0],
      map.keys[2][0]
      );
  // Should print 3 12 1

  sort(&map);

  printf("%d %d %d\n",
      map.keys[0][0],
      map.keys[1][0],
      map.keys[2][0]
      );
  // Should print 1 3 12

  printf("%d\n", find(&map, 12));
  // Should print 2 (the index, 3rd entry)

I can't do a working example now, i cant compile with my machine, maybe later at home... sorry :(

EDIT: forgot to say... to get the value you have to do map.values[map.keys[find(&map, key)][1]].

Of course this should be a function:

int get(Map *map, key) {
  int keyindex = find(map, key);
  int valueindex = map -> keys[index][1];
  return map -> values[valueindex];
}

Forgot to say that, by decoupling the keys from the values you can use as value any type you want, even whole structures...

enjoy

Finding duplicates in an array using C

1 Answers1