Create an ArrayList of unique values

Question

I have an ArrayList with values taken from a file (many lines, this is just an extract):

20/03/2013 23:31:46 6870    6810    6800    6720    6860    6670    6700    6650    6750    6830    34864   34272
20/03/2013 23:31:46 6910    6780    6800    6720    6860    6680    6620    6690    6760    6790    35072   34496

The first two values on each line form a date-and-time string, which is stored as a single element.

What I want to do is compare the date strings and, when two of them are equal, delete the duplicate (for example, the second one) together with all the elements belonging to that line.

For now, I've used a for loop that compares the string every 13 elements (so that only the date strings are compared).

My question: is there a better way to implement this?

This is my code:

import java.util.Scanner;
import java.util.List;
import java.util.ArrayList;
import java.io.*;
import java.text.SimpleDateFormat;
import java.util.Date;

public class Main {
    public static void main(String[] args) throws Exception{

        //The input file
        Scanner s = new Scanner(new File("prova.txt"));

        //Saving each element of the input file in an arraylist 
        ArrayList<String> list = new ArrayList<String>();
        while (s.hasNext()){
            list.add(s.next());
        }
        s.close();

        //Arraylist to save modified values
        ArrayList<String> ds = new ArrayList<String>();

        //Walk the tokens one file line at a time (14 tokens per line)
        int i;
        for(i=0; i<=list.size()-14; i=i+14){

            //combining the first two values to obtain the date string
            String str = list.get(i)+" "+list.get(i+1);
            ds.add(str);
            //add all the other values to arraylist ds
            int j;
            for(j=2; j<14; j++){
                ds.add(list.get(i+j));
            }

            //comparing data values
            int k;  
            for(k=0; k<=ds.size()-12; k=k+13){
                ds.get(k); //first data string element  
                //Comparing with other strings and delete
                //TODO  
            }
        }
    }
}

You should post your question here: http://codereview.stackexchange.com/ – JREN, Jul 09 '13 at 11:42
The code is incomplete: one brace is missing, so please post the complete code – Ashish Aggarwal, Jul 09 '13 at 12:03
@AshishAggarwal, it should be OK now, although the part that compares the values isn't implemented at all – alessandrob, Jul 09 '13 at 13:14

score 76 · Answer 1 · answered Jun 05 '14 at 12:39

Try checking for duplicates with a .contains() method on the ArrayList, before adding a new element.

It would look something like this

   if(!list.contains(data))
       list.add(data);

That should prevent duplicates in the list without disturbing the order of the elements, which is what people seem to be looking for.
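
A minimal, self-contained sketch of this check-before-add pattern follows. The class name ContainsDedup and the helper uniqueInOrder are made up for illustration. Keep in mind that contains() on an ArrayList is a linear scan, so this is best suited to small lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContainsDedup {
    // Copies items into a new list, skipping any value that is already present.
    // List.contains scans the list, so the whole pass is O(n^2) in the worst case.
    public static List<String> uniqueInOrder(List<String> items) {
        List<String> result = new ArrayList<>();
        for (String item : items) {
            if (!result.contains(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(uniqueInOrder(input)); // prints [b, a, c]
    }
}
```

The first occurrence of each value is kept and later duplicates are skipped.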

Anudeep Bulla

This will work, but don't forget to make this synchronized, otherwise you're asking for trouble – Dmitri Nov 09 '14 at 23:39
Does this affect performance with large data? I think it will. – Amt87 May 30 '16 at 09:44
@Amt87 It is a membership test. Should be an O(log n) call at most. If you would like to retain the order of the elements, I don't think we can best it. – Anudeep Bulla May 31 '16 at 20:56
As @Amt87 mentioned, performance will be affected for large data. A membership test in a regular ArrayList is O(n), not O(log n), so you are looking at O(n^2) when you are populating the list. – Incassator Aug 02 '16 at 00:26

score 49 · Accepted Answer · answered Jul 09 '13 at 11:44

Create an Arraylist of unique values

You could use a Set and then call its toArray() method.

A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction.

http://docs.oracle.com/javase/6/docs/api/java/util/Set.html
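
As a rough sketch of that idea: copy the list into a HashSet, then call toArray(). SetToArrayDemo and distinctValues are hypothetical names used only for this example. A HashSet makes no guarantee about iteration order, so the order of the resulting array may differ from the input.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SetToArrayDemo {
    // Returns the distinct values of the input.
    // HashSet does not promise any particular order for the result.
    public static String[] distinctValues(List<String> items) {
        Set<String> set = new HashSet<>(items);
        return set.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b", "a", "b", "c", "a");
        String[] unique = distinctValues(input);
        System.out.println(unique.length); // prints 3
    }
}
```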

Dmitry Zagorulkin

OK, but how can I delete all the values related to the date string? I mean, if I find a date string equal to another, I must delete all the values related to it. – alessandrob Jul 09 '13 at 12:44
`Set.toArray()` doesn't keep the order of the elements. Is there a way to keep the order? – Italo Borssatto Dec 10 '13 at 14:52
@italo Why -1? Please explain. – Dmitry Zagorulkin Dec 13 '13 at 11:21
An ArrayList keeps the order of its elements; a Set doesn't. An ArrayList of unique values is not the same as a Set transformed into an ArrayList. I came to this question looking for an ArrayList of unique values and I haven't found it yet. – Italo Borssatto Dec 13 '13 at 13:08
To add to that: I'm looking for a good/smart implementation of an ArrayList of unique values. – Italo Borssatto Dec 13 '13 at 13:34
@italo you could potentially use LinkedHashSet: http://docs.oracle.com/javase/8/docs/api/java/util/LinkedHashSet.html which will retain the insertion order. Depends on what order you wish to maintain for duplicates. – Ben Neill Oct 13 '14 at 22:16

score 19 · Answer 3 · answered Feb 10 '15 at 09:41

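// Element type String is assumed here, as in the question
HashSet<String> hs = new HashSet<>();
hs.addAll(arrayList);   // duplicates are dropped; HashSet does not keep the original order
arrayList.clear();
arrayList.addAll(hs);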

Amol Suryawanshi

This provides unique records as in the question but if you need to keep the order of the data this will not work. – M.Selman SEZGİN Apr 22 '20 at 15:11

MC Emperor · Answer 4 · 2022-07-07T12:16:33.190

16

Pretty late to the party, but here's my two cents:

Use a `LinkedHashSet`

I assume what you need is a collection which:

prevents you from inserting duplicates;
retains insertion order.

LinkedHashSet does this. The advantage over using an ArrayList is that LinkedHashSet has a complexity of O(1) for the contains operation, as opposed to ArrayList, which has O(n).

Of course, you need to implement your object's equals and hashCode methods properly.
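
Here is a small sketch of how that could look. LinkedHashSetDemo and uniqueInOrder are names invented for this example. Since a LinkedHashSet can't be indexed, the sketch copies it into an ArrayList at the end, on the assumption that index access is wanted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LinkedHashSetDemo {
    // Removes duplicates while keeping the first occurrence of each value in place.
    public static List<String> uniqueInOrder(List<String> items) {
        Set<String> seen = new LinkedHashSet<>(items);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        Set<String> set = new LinkedHashSet<>();
        System.out.println(set.add("b")); // true: new element
        System.out.println(set.add("a")); // true: new element
        System.out.println(set.add("b")); // false: the duplicate is rejected
        System.out.println(set);          // [b, a]

        // Copy into an ArrayList when index access is needed
        List<String> list = uniqueInOrder(Arrays.asList("b", "a", "b", "c", "a"));
        System.out.println(list.get(2));  // c
    }
}
```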

edited Jul 07 '22 at 12:16

answered Aug 20 '18 at 10:06

Very underrated answer! – Ar3s Feb 23 '20 at 12:07
Much simpler solution than additionally using contains, major +1 – Darth Jul 07 '22 at 12:11

score 7 · Answer 5 · answered Nov 13 '19 at 09:19

If you want to make a list with unique values from an existing list you can use

List myUniqueList = myList.stream().distinct().collect(Collectors.toList());

Cptn Slow

This is the third time this exact snippet is posted as a solution... – glace Nov 13 '19 at 09:29

score 6 · Answer 6 · answered Jan 17 '17 at 14:51

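    //Saving each element of the input file in an arraylist
    ArrayList<String> list = new ArrayList<String>();
    while (s.hasNext()){
        list.add(s.next());
    }

    //That's all you need (requires import java.util.stream.Collectors).
    //Collectors.toList() does not promise an ArrayList, so collect into one explicitly instead of casting.
    list = list.stream().distinct().collect(Collectors.toCollection(ArrayList::new));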

Sadequer Rahman

This works from API level 24 onward; lower versions aren't supported. – Sathish Gadde Nov 13 '18 at 07:18

Arnaud Denoyelle · Answer 7 · 2013-07-09T11:51:45.260

You can easily do this with a HashMap. You obviously have a key (the date string) and some values.

Loop on all your lines and add them to your Map.

Map<String, List<Integer>> map = new HashMap<>();
...
while (s.hasNext()){
  String stringData = ...
  List<Integer> values = ...
  map.put(stringData,values);
}

Note that in this case, you will keep the last occurrence of duplicate lines. If you prefer keeping the first occurrence and removing the others, you can add a check with Map.containsKey(String stringData) before putting in the map.
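
To illustrate the difference between the two behaviours, here is a small sketch. FirstOrLastDemo, keepLast and keepFirst are hypothetical names, and each row is simplified to one key and one value rather than the full list of values from the file.

```java
import java.util.HashMap;
import java.util.Map;

public class FirstOrLastDemo {
    // Each row is {key, value}. A later row overwrites an earlier one with the same key.
    public static Map<String, String> keepLast(String[][] rows) {
        Map<String, String> map = new HashMap<>();
        for (String[] row : rows) {
            map.put(row[0], row[1]);
        }
        return map;
    }

    // Same loop, but a row is only stored when its key has not been seen yet.
    public static Map<String, String> keepFirst(String[][] rows) {
        Map<String, String> map = new HashMap<>();
        for (String[] row : rows) {
            if (!map.containsKey(row[0])) {
                map.put(row[0], row[1]);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"20/03/2013 23:31:46", "6870"},
            {"20/03/2013 23:31:46", "6910"},
        };
        System.out.println(keepLast(rows).get("20/03/2013 23:31:46"));  // 6910
        System.out.println(keepFirst(rows).get("20/03/2013 23:31:46")); // 6870
    }
}
```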

score 4 · Answer 8 · answered Jul 09 '13 at 11:50

Use Set

      ...
      Set<String> list = new HashSet<>();
      while (s.hasNext()){
         list.add(s.next());
      }
      ...

Mohammad Changani

ggorlen · Answer 9 · 2020-01-04T22:10:28.557

Solution #1: `HashSet`

A good solution to the immediate problem of reading a file into an ArrayList with a uniqueness constraint is to simply keep a HashSet of seen items. Before processing a line, we check that its key is not already in the set. If it isn't, we add the key to the set to mark it as finished, then add the line data to the result ArrayList.

import java.util.*;
import java.io.*;

public class Main {
    public static void main(String[] args) 
        throws FileNotFoundException, IOException {

        String file = "prova.txt";
        ArrayList<String[]> data = new ArrayList<>();
        HashSet<String> seen = new HashSet<>();

        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            for (String line; (line = br.readLine()) != null;) {
                String[] split = line.split("\\s+");
                String key = split[0] + " " + split[1];

                if (!seen.contains(key)) {
                    data.add(Arrays.copyOfRange(split, 2, split.length));
                    seen.add(key);
                }
            }
        }

        for (String[] row : data) {
            System.out.println(Arrays.toString(row));
        }
    }
}

Solution #2: `LinkedHashMap`/`LinkedHashSet`

Since we have key-value pairs in this particular dataset, we could roll everything into a LinkedHashMap<String, ArrayList<String>>, which preserves insertion order but can't be indexed into. Whether that matters is a use-case-driven decision; otherwise it amounts to the same strategy as above. The choice of ArrayList<String> over String[] is arbitrary here; it could be any data value. Note that this version makes it easy to keep the most recently seen values for a key rather than the oldest (remove the !data.containsKey(key) test).

import java.util.*;
import java.io.*;

public class Main {
    public static void main(String[] args) 
        throws FileNotFoundException, IOException {

        String file = "prova.txt";
        LinkedHashMap<String, ArrayList<String>> data = new LinkedHashMap<>();

        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            for (String line; (line = br.readLine()) != null;) {
                String[] split = line.split("\\s+");
                String key = split[0] + " " + split[1];

                if (!data.containsKey(key)) {
                    ArrayList<String> val = new ArrayList<>();
                    String[] sub = Arrays.copyOfRange(split, 2, split.length); 
                    Collections.addAll(val, sub);
                    data.put(key, val);
                }
            }
        }

        for (Map.Entry<String, ArrayList<String>> e : data.entrySet()) {
            System.out.println(e.getKey() + " => " + e.getValue());
        }
    }
}

Solution #3: `ArrayListSet`

The above examples represent pretty narrow use cases. Here's a sketch for a general ArrayListSet class, which maintains the usual list behavior (add/set/remove etc) while preserving uniqueness.

Basically, the class is an abstraction of solution #1 in this post (HashSet combined with ArrayList), but with a slightly different flavor (the data itself is used to determine uniqueness rather than a key, but it's a truer "ArrayList" structure).

This class solves three problems: efficiency (ArrayList#contains is linear, so we should reject that solution except in trivial cases), lack of ordering (storing everything directly in a HashSet doesn't help us), and lack of ArrayList operations (LinkedHashSet is otherwise the best solution, but we can't index into it, so it's not a true replacement for an ArrayList).

Using a HashMap<E, index> instead of a HashSet would speed up remove(Object o) and indexOf(Object o) functions (but slow down sort). A linear remove(Object o) is the main drawback over a plain HashSet.

import java.util.*;

public class ArrayListSet<E> implements Iterable<E>, Set<E> {
    private ArrayList<E> list;
    private HashSet<E> set;

    public ArrayListSet() {
        list = new ArrayList<>();
        set = new HashSet<>();
    }

    public boolean add(E e) {
        return set.add(e) && list.add(e);
    }

    public boolean add(int i, E e) {
        if (!set.add(e)) return false;
        list.add(i, e);
        return true;
    }

    public void clear() {
        list.clear();
        set.clear();
    }

    public boolean contains(Object o) {
        return set.contains(o);
    }

    public E get(int i) {
        return list.get(i);
    }

    public boolean isEmpty() {
        return list.isEmpty();
    }

    public E remove(int i) {        
        E e = list.remove(i);
        set.remove(e);
        return e;
    }

    public boolean remove(Object o) {        
        if (set.remove(o)) {
            list.remove(o);
            return true;
        }

        return false;
    }

    public boolean set(int i, E e) {
        if (set.contains(e)) return false;

        set.add(e);
        set.remove(list.set(i, e));
        return true;
    }

    public int size() {
        return list.size();
    }

    public void sort(Comparator<? super E> c) {
        Collections.sort(list, c);
    }

    public Iterator<E> iterator() {
        return list.iterator();
    }

    public boolean addAll(Collection<? extends E> c) {
        int before = size();
        for (E e : c) add(e);
        return size() != before; // true only if at least one element was added
    }

    public boolean containsAll(Collection<?> c) {
        return set.containsAll(c);
    }

    public boolean removeAll(Collection<?> c) {
        return set.removeAll(c) && list.removeAll(c);
    }

    public boolean retainAll(Collection<?> c) {
         return set.retainAll(c) && list.retainAll(c);
    }

    public Object[] toArray() {
        return list.toArray();
    }

    public <T> T[] toArray(T[] a) {
        return list.toArray(a);
    }
}

Example usage:

public class ArrayListSetDriver {
    public static void main(String[] args) {
        ArrayListSet<String> fruit = new ArrayListSet<>();
        fruit.add("apple");
        fruit.add("banana");
        fruit.add("kiwi");
        fruit.add("strawberry");
        fruit.add("apple");
        fruit.add("strawberry");

        for (String item : fruit) {
            System.out.print(item + " "); // => apple banana kiwi strawberry
        }

        fruit.remove("kiwi");
        fruit.remove(1);
        fruit.add(0, "banana");
        fruit.set(2, "cranberry");
        fruit.set(0, "cranberry");
        System.out.println();

        for (int i = 0; i < fruit.size(); i++) {
            System.out.print(fruit.get(i) + " "); // => banana apple cranberry
        }

        System.out.println();
    }
}

Solution #4: `ArrayListMap`

This class solves a drawback of ArrayListSet: the data we want to store and its associated key may not be the same. It provides a put method that enforces uniqueness on a different object than the data stored in the underlying ArrayList, which is just what we need to solve the original problem posed in this thread. This gives us the ordering and iteration of an ArrayList together with the fast lookups and uniqueness properties of a HashMap. The HashMap maps each unique key to the index of its value in the ArrayList, and the ArrayList enforces ordering and provides iteration.

This approach solves the scalability problems of using a HashSet in solution #1. That approach works fine for a quick file read, but without an abstraction, we'd have to handle all consistency operations by hand and pass around multiple raw data structures if we needed to enforce that contract across multiple functions and over time.

As with ArrayListSet, this can be considered a proof of concept rather than a full implementation.

import java.util.*;

public class ArrayListMap<K, V> implements Iterable<V>, Map<K, V> {
    private ArrayList<V> list;
    private HashMap<K, Integer> map;

    public ArrayListMap() {
        list = new ArrayList<>();
        map = new HashMap<>();
    }

    public void clear() {
        list.clear();
        map.clear();
    }

    public boolean containsKey(Object key) {
        return map.containsKey(key);
    }

    public boolean containsValue(Object value) {
        return list.contains(value);
    }

    public V get(int i) {
        return list.get(i);
    }

    public boolean isEmpty() {
        return map.isEmpty();
    }

    public V get(Object key) {
        Integer i = map.get(key);
        return i == null ? null : list.get(i); // absent key: return null instead of throwing
    }

    public V put(K key, V value) {
        if (map.containsKey(key)) {
            int i = map.get(key);
            V v = list.get(i);
            list.set(i, value);
            return v;
        }

        list.add(value);
        map.put(key, list.size() - 1);
        return null;
    }

    public V putIfAbsent(K key, V value) {
        if (map.containsKey(key)) {
            if (list.get(map.get(key)) == null) {
                list.set(map.get(key), value);
                return null;
            }

            return list.get(map.get(key));
        }

        return put(key, value);
    }

    public V remove(int i) {
        V v = list.remove(i);

        for (Map.Entry<K, Integer> entry : map.entrySet()) {
            if (entry.getValue() == i) {
                map.remove(entry.getKey());
                break;
            }
        }

        decrementMapIndices(i);
        return v;
    }

    public V remove(Object key) {
        if (map.containsKey(key)) {
            int i = map.remove(key);
            V v = list.get(i);
            list.remove(i);
            decrementMapIndices(i);
            return v;
        }

        return null;
    }

    private void decrementMapIndices(int start) {
        for (Map.Entry<K, Integer> entry : map.entrySet()) {
            int i = entry.getValue();

            if (i > start) {
                map.put(entry.getKey(), i - 1);
            }
        }
    }

    public int size() {
        return list.size();
    }

    public void putAll(Map<? extends K, ? extends V> m) {
        for (Map.Entry<? extends K, ? extends V> entry : m.entrySet()) {
            put(entry.getKey(), entry.getValue());
        }
    }

    public Set<Map.Entry<K, V>> entrySet() {
        Set<Map.Entry<K, V>> es = new HashSet<>();

        for (Map.Entry<K, Integer> entry : map.entrySet()) {
            es.add(new AbstractMap.SimpleEntry<>(
                entry.getKey(), list.get(entry.getValue())
            ));
        }

        return es;
    }

    public Set<K> keySet() {
        return map.keySet();
    }

    public Collection<V> values() {
        return list;
    }

    public Iterator<V> iterator() {
        return list.iterator();
    }

    public Object[] toArray() {
        return list.toArray();
    }

    public <T> T[] toArray(T[] a) {
        return list.toArray(a);
    }
}

Here's the class in action on the original problem:

import java.io.*;
import java.util.*;

public class Main {
    public static void main(String[] args) 
        throws FileNotFoundException, IOException {

        String file = "prova.txt";
        ArrayListMap<String, String[]> data = new ArrayListMap<>();

        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            for (String line; (line = br.readLine()) != null;) {
                String[] split = line.split("\\s+");
                String key = split[0] + " " + split[1];
                String[] sub = Arrays.copyOfRange(split, 2, split.length); 
                data.putIfAbsent(key, sub); 
            }
        }

        for (Map.Entry<String, String[]> e : data.entrySet()) {
            System.out.println(e.getKey() + " => " + 
                java.util.Arrays.toString(e.getValue()));
        }

        for (String[] a : data) {
            System.out.println(java.util.Arrays.toString(a));
        }
    }
}

score 3 · Answer 10 · answered Jul 09 '13 at 11:47

You could use a Set. It is a collection which doesn't accept duplicates.

Sorin

score 2 · Answer 11 · answered Jan 22 '15 at 09:49

Just override the equals() method of your custom object. Say you have an ArrayList of custom objects with fields f1, f2, ...; override it like this:

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof CustomObject)) return false;

    CustomObject object = (CustomObject) o;

    if (!f1.equals(object.f1)) return false;
    if (!f2.equals(object.f2)) return false;
    ...
    return true;
}

and check using ArrayList instance's contains() method. That's it.
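
A compact, runnable sketch of this approach is shown below. The Person class, its two fields and the addUnique helper are hypothetical stand-ins for the custom object. The sketch also overrides hashCode, which ArrayList.contains() does not need but which should stay consistent with equals in case the objects later go into a HashSet or HashMap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class EqualsDemo {
    // Hypothetical value class; two instances are equal when both fields match.
    public static final class Person {
        private final String dob;
        private final String fullName;

        public Person(String dob, String fullName) {
            this.dob = dob;
            this.fullName = fullName;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return Objects.equals(dob, other.dob)
                && Objects.equals(fullName, other.fullName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(dob, fullName);
        }
    }

    // Adds p only if no equal element is already in the list.
    public static boolean addUnique(List<Person> list, Person p) {
        if (list.contains(p)) return false;
        return list.add(p);
    }

    public static void main(String[] args) {
        List<Person> people = new ArrayList<>();
        System.out.println(addUnique(people, new Person("1990-01-01", "Ada"))); // true
        System.out.println(addUnique(people, new Person("1990-01-01", "Ada"))); // false
        System.out.println(people.size()); // 1
    }
}
```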

score 0 · Answer 12 · answered Jul 09 '13 at 11:46

If you need unique values, you should use an implementation of the Set interface.

Sashi Kant

Set is not accessed by INDEX. Not the same! – marcolopes Oct 24 '18 at 01:46

score 0 · Answer 13 · answered Jul 09 '13 at 11:57

You can read from the file into a map where the key is the date, and skip the whole row if the date is already in the map:

        Map<String, List<String>> map = new HashMap<String, List<String>>();

        int i = 0;
        String lastData = null;
        while (s.hasNext()) {
            String str = s.next();
            if (i % 14 == 0) {
                // Assumes 14 whitespace-separated tokens per row, as in the sample;
                // the first two (date and time) together form the key
                if (s.hasNext()) {
                    str = str + " " + s.next();
                    i++;
                }
                if (map.containsKey(str)) {
                    //skip the whole row
                    lastData = null;
                } else {
                    lastData = str;
                    map.put(lastData, new ArrayList<String>());
                }
            } else if (lastData != null) {
                map.get(lastData).add(str);
            }

            i++;
        }

lapkritinis · Answer 14 · 2019-01-28T16:53:11.940

I use a helper class. I'm not sure whether it's good or bad.

public class ListHelper<T> {
    private final T[] t;

    public ListHelper(T[] t) {
        this.t = t;
    }

    public List<T> unique(List<T> list) {
        Set<T> set = new HashSet<>(list);
        return Arrays.asList(set.toArray(t));
    }
}

Usage and test:

import static org.assertj.core.api.Assertions.assertThat;


public class ListHelperTest {

    @Test
    public void unique() {
        List<String> s = Arrays.asList("abc", "cde", "dfg", "abc");
        List<String> unique = new ListHelper<>(new String[0]).unique(s);
        assertThat(unique).hasSize(3);
    }
}

Or Java8 version:

public class ListHelper<T> {
    public Function<List<T>, List<T>> unique() {
        return l -> l.stream().distinct().collect(Collectors.toList());
    }
}

public class ListHelperTest {
    @Test
    public void unique() {
        List<String> s = Arrays.asList("abc", "cde", "dfg", "abc");
        assertThat(new ListHelper<String>().unique().apply(s)).hasSize(3);
    }
}
