0

I have been given a task to write my own implementation for removing duplicate objects from an array. The array is unsorted.

As an example, I have this array of objects

ItemsList[] objects = {
                new ItemsList("ob1"),
                new ItemsList("ob2"),
                new ItemsList("ob2"),
                new ItemsList("ob1"),
                new ItemsList("ob3")
        };

"ob1" stands for itemId

My goal is to get the result array like this ["ob1", "ob2", "ob3"], but given NullPointerException when trying to find objects that aren't doubled and add those to array.

Note: cannot use Set, HashSet, ArrayList, Arrays.copyOf, Sort etc. or any other tools such as iterators.

So far I've done this:

public String[] removeDuplicates(ItemsList[] objects) {
        String[] noDubs = new String[objects.length];
        int len = objects.length;
        int pos = 0;

        for (int i = 0; i < len; i++) {
            for (int j = i + 1; j < len; j++) {
                if (objects[i].getItemId().equals(objects[j].getItemId())) {
                    noDubs[pos] = objects[i].getItemId();
                    pos++;
                }
                else {
                    //NullPointerException given
                    if (!objects[i].getItemId().equals(objects[j].getItemId()) && !objects[i].getItemId().contains(noDubs[i])) {
                        noDubs[pos] = objects[i].getItemId();
                        pos++;
                    }
                }
            }

        }
        String[] result = new String[pos];
        for(int k = 0; k < pos; k++) {
            result[k] = noDubs[k];
        }
        return result;
    }

getItemId is class ItemsList method

3 Answers3

0

Here is an implementation. Just mark the item as duplicated if it is equal to one of the next elements. Later add it to the output array, which will need to grow.

    public String[] removeDuplicates(ItemList[] items) {
        String[] output = {};
        for (int i = 0; i < items.length; i++) {
            boolean isDuplicated = false;
            ItemList current = items[i];            
            if (current == null)
                throw new RuntimeException("item can not be null");
            for (int j = i + 1; j < items.length; j++) {
                if (current.equals(items[j])) {
                    isDuplicated = true;
                    break;
                }
            }
            if (!isDuplicated) {
                String[] temp = new String[output.length + 1];
                System.arraycopy(output, 0, temp, 0, output.length);
                temp[output.length] = current.getItemId();
                output = temp;
            }
        }
        return output;
    }

You could also make every duplicate null and later add each non null element to the output array, like this:

    public String[] removeDuplicates2(ItemList[] items) {
        String[] temp = new String[items.length];
        int inTemp = 0;
        for (ItemList item : items) {
            boolean isDuplicated = false;
            if (item == null)
                throw new RuntimeException("item can not be null");
            for (int i = 0; i < inTemp; i++) {
                if (item.getItemId().equals(temp[i])) {
                    isDuplicated = true;
                    break;
                }
            }
            if (!isDuplicated) temp[inTemp++] = item.getItemId();
        }
        String[] output = new String[inTemp];
        System.arraycopy(temp, 0, output, 0, inTemp);
        return output;
    }

But note that the first solution is faster

pablozoani
  • 33
  • 5
  • Replace the System.arraycopy statement with a another loop if you don't want to use it. – pablozoani Feb 14 '21 at 07:50
  • Is there anyother way, I can do it without the arraycopy? :) – TheFoxterGirl Feb 14 '21 at 18:11
  • Yes, replace this line of code: `System.arraycopy(output, 0, temp, 0, output.length);` with `for (int j = 0; j < output.length; j++) temp[j] = output[j];`. But I do not recommend that beacuse of https://stackoverflow.com/questions/18638743/is-it-better-to-use-system-arraycopy-than-a-for-loop-for-copying-arrays – pablozoani Feb 14 '21 at 19:51
  • Oh thank you, I'm just not quite sure if I'm allowed to use it, so that's why I'm asking for another way around :D – TheFoxterGirl Feb 15 '21 at 15:08
  • If it's an educational exercise it is understandable. Don't do it in production! ;) – pablozoani Feb 15 '21 at 20:06
0

It would be better to use an intermediate array of booleans to track the duplicates and define the length of the result array. This would facilitate detection of multiple duplicates when the same element might occur more than 2 times.

Also, we need to make sure the method equals (and possibly hashCode) is properly overridden in ItemList:

// class ItemList
public boolean equals(Object o) {
    if (null == o || !(o instanceof ItemList)) {
        return false;
    }
    if (o == this) return true;
    ItemList that = (ItemList) o;
    return Objects.equals(this.itemId, that.itemId);
}

public int hashCode() {
    return Objects.hash(this.itemId);
}

When checking for equality of ItemList it may be better to use Objects.equals to handle null values without throwing a NullPointerException. Thus, duplicate null entries in the input items will be filtered out too.

public static String[] removeDuplicates(ItemList[] items) {
    final int n = items.length;
    if (n < 1) {
        return new String[0];
    }
    boolean[] dups = new boolean[n];
    int dupCount = 0;

    for (int i = 0; i < n; i++) {
        ItemList current = items[i];
        for (int j = i + 1; j < n; j++) {
            if (dups[j]) {
                continue;
            }
            if (Objects.equals(current, items[j])) {
                dups[j] = true;
                dupCount++;
            }
        }
    }
    String[] output = new String[n - dupCount];
    for (int i = 0, j = 0; i < n; i++) {
        if (!dups[i]) {
            output[j++] = null == items[i] ? "<NULL>" : items[i].getItemId();
        }
    }
    // info message
    System.out.printf("Found and removed %d duplicate value%s%n", dupCount, dupCount != 1 ? "s" : "");
    return output;
}

Test:

ItemList[] items = {
    null, new ItemList("ob1"), new ItemList("ob2"), new ItemList("ob2"), new ItemList("ob1"),
    new ItemList("ob3"), null, new ItemList("ob3"), new ItemList(null), new ItemList("ob5"),
    new ItemList("ob2"), new ItemList(null), new ItemList("ob4"), new ItemList("ob5"), null,
};

System.out.println(Arrays.toString(removeDuplicates(items)));

// compare removal of duplicates to using set
System.out.println("\nUsing Set");
Set<ItemList> set = new LinkedHashSet<>(Arrays.asList(items));  
System.out.println(set);

Output:

Found and removed 8 duplicate values
[<NULL>, ob1, ob2, ob3, null, ob5, ob4]

Using Set
[null, {id=ob1}, {id=ob2}, {id=ob3}, {id=null}, {id=ob5}, {id=ob4}]
Nowhere Man
  • 19,170
  • 9
  • 17
  • 42
0

Here is another option. This copies entries to a buffer the first time it finds them, and then copies the buffer to a correctly sized array.

There is an advantage to looking for duplicates in the buffer, instead of the original array. If there are a lot of duplicates, then there will be fewer checks when looking in the buffer compared to when looking in the original array.

I pulled out the loop to check if an item is in the buffer into another function. This avoids nesting the for loops, which can make it easier to read.

I think this overall approach also reduces the number of variables required to keep track, which also helps make it easier to read.

private static ItemsList[] removeDuplicates(ItemsList[] arr) {
    ItemsList[] buffer = new ItemsList[arr.length];
    int bufferLength = 0;
    
    for (ItemsList candidate : arr) {
        if (!isInBuffer(candidate, buffer, bufferLength)) {
            buffer[bufferLength] = candidate;
            bufferLength++;
        }
    }   
    
    ItemsList[] result = new ItemsList[bufferLength];
    for (int i = 0; i < bufferLength; i++) {
        result[i] = buffer[i];
    }
    return result;
}

private static boolean isInBuffer(ItemsList candidate, ItemsList[] buffer, int bufferLength) {
    for(int i = 0; i < bufferLength; i++) {
        if (Objects.equals(candidate, buffer[i])) {
            return true;
        }
    }
    return false;
}
Chris Foley
  • 454
  • 3
  • 5