24

I can't figure out why the JCF (Java Collections Framework) doesn't have a Bag implementation (a collection that allows duplicates and does not maintain order). Bag performance would be much better than that of the current Collection implementations in the JCF.

  • I know how to implement Bag in Java.
  • I know Bag is available in Apache commons.
  • I know other implementations can be used as a Bag, but they do far more work than a Bag needs to.

Why has the Java Collections framework not provided direct implementations like this?

Koray Tugay
Morteza Adi

4 Answers

15

Posting my comment as an answer since it answers this question best.

From the bug report filed for this:

There isn't a lot of enthusiasm among the maintainers of the Collection framework to design and implement these interfaces/classes. I personally can't recall having needed one. It would be more likely that a popular package developed outside the JDK would be imported into the JDK after having proved its worth in the real world.

The need for having support for Bags is valid today.

Guava supports it (as Multiset), and so does GS-Collections.

Iulian Popescu
Ajay George
  • 5
    I just had a real need for a Bag when I had to generate a hash of a collection that disregards order but takes repeated items into account. List.hashCode() considers the order of elements. Set.hashCode() correctly disregards order, but it disregards repeated elements as well. The only correct hashCode() implementation in that case would be that of a Bag. It was a canonical JSON signature thing. – DLopes Jul 06 '16 at 20:35
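To illustrate the hashing problem described in the comment above, here is a minimal sketch using only the JDK. It counts occurrences in a HashMap and relies on Map.hashCode(), which is defined as the sum of the entries' hash codes and therefore does not depend on iteration order. The names BagHash, countsOf and bagHash are hypothetical, and this is only one way to approximate what a Bag's hashCode() would give you.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the JDK or any library.
final class BagHash {
    /** Counts how many times each element occurs. */
    static <E> Map<E, Integer> countsOf(Iterable<E> items) {
        Map<E, Integer> counts = new HashMap<>();
        for (E item : items) {
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    /** Hash that ignores order but is computed from element multiplicities. */
    static <E> int bagHash(Iterable<E> items) {
        return countsOf(items).hashCode();
    }

    public static void main(String[] args) {
        // Same elements with the same multiplicities, in a different order.
        System.out.println(bagHash(List.of("a", "a", "b")) == bagHash(List.of("b", "a", "a")));
        // A different multiplicity of "a" gives a different count map.
        System.out.println(countsOf(List.of("a", "a", "b")).equals(countsOf(List.of("a", "b"))));
    }
}
```

The first line printed is true and the second is false. List.of assumes Java 9 or later.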
3

Currently, the Bag interface (as defined in Apache Commons Collections) violates the Collection contract. Several of its methods conflict with the behavior that Collection specifies. From its Javadoc:

"Bag is a Collection that counts the number of times an object appears in the collection. Suppose you have a Bag that contains {a, a, b, c}. Calling getCount(Object) on a would return 2, while calling uniqueSet() would return {a, b, c}.

Note that this interface violates the Collection contract. The behavior specified in many of these methods is not the same as the behavior specified by Collection. The noncompliant methods are clearly marked with "(Violation)" in their summary line. A future version of this class will specify the same behavior as Collection, which unfortunately will break backwards compatibility with this version."

 boolean add(java.lang.Object o)
      (Violation) Add the given object to the bag and keep a count.

 boolean removeAll(java.util.Collection c)
      (Violation) Remove all elements represented in the given collection, respecting cardinality.

Please see the Apache Commons Collections Bag Javadoc for more information.
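To make the removeAll difference concrete, here is a small sketch using only JDK classes. Collection.removeAll removes every occurrence of each element found in the argument, whereas a cardinality-respecting removal takes out one occurrence per element of the argument. The helper removeRespectingCardinality is a hypothetical name that only imitates the documented Bag behavior on a plain ArrayList; it is not the Apache Commons code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class RemoveAllDemo {
    /** Hypothetical helper: removes one occurrence for each element of toRemove. */
    static <E> boolean removeRespectingCardinality(Collection<E> target,
                                                   Collection<? extends E> toRemove) {
        boolean changed = false;
        for (E e : toRemove) {
            // Collection.remove(Object) removes a single matching instance.
            changed |= target.remove(e);
        }
        return changed;
    }

    public static void main(String[] args) {
        // Standard Collection contract: every "a" is removed.
        List<String> standard = new ArrayList<>(List.of("a", "a", "b", "c"));
        standard.removeAll(List.of("a"));
        System.out.println(standard); // [b, c]

        // Bag-like behavior: only one "a" is removed.
        List<String> bagLike = new ArrayList<>(List.of("a", "a", "b", "c"));
        removeRespectingCardinality(bagLike, List.of("a"));
        System.out.println(bagLike); // [a, b, c]
    }
}
```

List.of assumes Java 9 or later.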

reevesy
Rahul
  • 1
    Include the text here from the Link. – deW1 Jul 01 '14 at 09:25
  • 3
    what is the collection contract? – Junchen Liu Oct 08 '15 at 09:55
  • The part of the documentation you've cited is not a violation. – Jimmy T. May 15 '16 at 10:41
  • @JimmyT. It does violate it, as Bag keeps a count. A Collection does not keep a count as it inserts data into a structure. The documentation of Bag itself states that it violates the rules of Collection, and so do the methods 'add' and 'removeAll'. – Rahul May 16 '16 at 11:22
  • 2
    The violation is only the return value. The count is just a different representation of duplicates. That alone doesn't violate the interface. – Jimmy T. May 16 '16 at 11:29
0

The JDK tries to give you implementations of common data structures and lets you implement anything else yourself if the common structures won't serve your purpose. The maintainers may have thought that a Bag is not a common data structure. Practically speaking, they cannot implement every data structure out there or satisfy everybody's requirements. What you consider common may not be common for the majority.

Adisesha
  • 1
    Bag is very common in the collections area! I often see people use other collection implementations instead of a Bag, just because there is no direct implementation of Bag in the JCF, so they have to pay the extra overhead of those other collections. I believe Bag is the most common collection. – Morteza Adi Mar 25 '13 at 11:30
  • 'Bag is very common in Collection area': again, that's your experience. Throughout my career I have had to use a Bag/Multiset only once. I guess it depends on what you are working on. – Adisesha Mar 25 '13 at 13:10
  • The issue link provided by Ajay George can answer your question. – Adisesha Mar 25 '13 at 13:12
0

One can just use a Map<Object, Integer> bag instead.

The add method would look like

bag.merge(obj, 1, Integer::sum);

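Removal is analogous, except that the entry should disappear once its count reaches zero (and an absent element should stay absent):

bag.computeIfPresent(obj, (k, n) -> n > 1 ? n - 1 : null);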

That is, at least, the basic idea behind HashBag in Apache Commons Collections 4.
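Putting the two operations together, a minimal self-contained sketch might look like the following. The class name CountingBag and its methods are made up for illustration and are not the Apache Commons API. Removal here uses computeIfPresent, so an element's entry is dropped when its count hits zero and removing an absent element is a no-op.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class: a sketch of a map-backed bag, not a library implementation.
final class CountingBag<E> {
    private final Map<E, Integer> counts = new HashMap<>();
    private int size; // total number of occurrences, duplicates included

    /** Adds one occurrence of the element. */
    void add(E element) {
        counts.merge(element, 1, Integer::sum);
        size++;
    }

    /** Removes one occurrence; returns false if the element is absent. */
    boolean remove(E element) {
        if (!counts.containsKey(element)) {
            return false;
        }
        // Returning null from the remapping function removes the entry.
        counts.computeIfPresent(element, (k, n) -> n > 1 ? n - 1 : null);
        size--;
        return true;
    }

    /** Number of occurrences of the element, 0 if absent. */
    int count(E element) {
        return counts.getOrDefault(element, 0);
    }

    int size() {
        return size;
    }

    /** Number of distinct elements. */
    int uniqueSize() {
        return counts.size();
    }

    public static void main(String[] args) {
        CountingBag<String> bag = new CountingBag<>();
        bag.add("a");
        bag.add("a");
        bag.add("b");
        bag.remove("a");
        System.out.println(bag.count("a") + " " + bag.size()); // 1 2
    }
}
```

Keeping a separate size field avoids summing all counts on every size() call; whether that trade-off matters depends on the use case.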

Valerij Dobler