
I have an `Iterable<Text>` object called `values`, and I want to add its elements to a list of distinct elements.

for (Text val : values) {
    if (!mylist.contains(val)) {
        mylist.add(val);
    }
}

It only adds one element to this list. If I remove the distinctness check, I see that all the elements are repeated.

I have tried many things. I thought maybe I should use a `.get()` method, like this:

for (Text val : values) {
    if (!mylist.contains(val.get())) {
        mylist.add(val.get());
    }
}

but then the compiler reports that it cannot find the method `get()` in class `Text`:

>editorPairs.java:67: cannot find symbol
>symbol  : method get()
>location: class org.apache.hadoop.io.Text
>                    mylist.add(val.get());
>                                  ^
>1 error

The full code is below:

public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {

        List<Text> mylist = new ArrayList<Text>();

        for (Text val : values) {
            if (!mylist.contains(val)) {
                mylist.add(val);
            }
        }

        if(mylist.size() > 1) {
            int size = mylist.size();
            for (int i=0; i<size; ++i) {
                Text t1 = mylist.get(i);
                context.write(t1, t1);
            }
        }
}
fabian
Vahid Mirjalili

2 Answers


We need to use a [Set][1] to get the distinct values, because a [Set][1] does not add a value that it already contains (hence there is no need to check with contains()). For the set to recognize duplicates, the element class (Text in our case) must implement equals() and hashCode() consistently.

This example explains what needs to be done.

Darshan Mehta
  • Based on your suggestion, I guess maybe this class Text (defined in Hadoop) does not inherit from a Comparable class. I will convert the Text values to strings and see what happens. – Vahid Mirjalili Mar 05 '16 at 01:34
  • In that case, if we know the contents of the Text class, we can define our own `comparator` and use a `TreeSet` to store the values, as explained here: http://stackoverflow.com/questions/14880450/java-hashset-with-a-custom-equality-criteria – Darshan Mehta Mar 05 '16 at 01:47

The better approach is to use a set.

Instantiate a HashSet, which uses your object's equals() and hashCode() methods to add a value only if it is distinct.
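
A minimal sketch of the set-based approach, under one assumption that is not stated in the thread: Hadoop's reduce-side iterator is generally understood to reuse a single Text instance across iterations, so storing the reference itself would leave every stored element pointing at the same object. The sketch therefore copies each value to an immutable String before adding it to a LinkedHashSet. `MutableText`, `reusingIterable` and `distinct` are hypothetical names used only so that the example runs without Hadoop on the classpath; with the real class, `val.toString()` or `new Text(val)` would play the same role.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.NoSuchElementException;
import java.util.Set;

class DistinctValuesSketch {

    // Hypothetical stand-in for org.apache.hadoop.io.Text:
    // mutable, with content-based equals() and hashCode().
    static final class MutableText {
        private String value = "";
        void set(String v) { value = v; }
        @Override public String toString() { return value; }
        @Override public boolean equals(Object o) {
            return o instanceof MutableText && ((MutableText) o).value.equals(value);
        }
        @Override public int hashCode() { return value.hashCode(); }
    }

    // Hypothetical iterable that returns the SAME instance on every step,
    // mimicking the object reuse assumed for Hadoop's reducer values.
    static Iterable<MutableText> reusingIterable(String... contents) {
        final MutableText shared = new MutableText();
        return () -> new Iterator<MutableText>() {
            private int i = 0;
            @Override public boolean hasNext() { return i < contents.length; }
            @Override public MutableText next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.set(contents[i++]);
                return shared;
            }
        };
    }

    // Collect distinct values, copying each one so that later mutation
    // of the shared instance cannot change what is already stored.
    static Set<String> distinct(Iterable<MutableText> values) {
        Set<String> seen = new LinkedHashSet<>(); // keeps first-seen order
        for (MutableText val : values) {
            seen.add(val.toString()); // the copy is the important step
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(distinct(reusingIterable("a", "b", "a", "c"))); // prints [a, b, c]
    }
}
```

A LinkedHashSet is used here only to keep the output order predictable; a plain HashSet would deduplicate just as well if order does not matter.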