
I am having trouble calculating Jaccard similarity to find similar books, using the transaction IDs from a MySQL database of sales transactions:

t1 = Java, Ruby, C

t2 = Java, C#, Python

t3 = C#, VB, C

... etc.

Size of Java intersection = 2 (how can we find this?)

Size of union = 3 (how can we find this?)

Jaccard similarity = (intersection/union) = 2/3

But I don't understand how to find the "intersection" and "union" of the two vectors, or how to implement this in Java/JSP.

Please help me and thanks a lot!

Kimberly
  • What does the data look like in the MySQL database? What is the definition of union? What is the definition of intersection? – Gordon Linoff Mar 05 '13 at 16:31
  • The data in the MySQL database is the transaction ID, book name, and customer ID/name. I want to find the most frequently bought books (intersection) in each transaction out of all the sales transactions (union). – Kimberly Mar 05 '13 at 16:36
  • This might be useful; it is the same approach using ArrayList: http://stackoverflow.com/questions/5283047/intersection-union-of-arraylists-in-java – Ravindra Gullapalli Mar 05 '13 at 19:39

1 Answer


You need to use one of the standard Set classes, such as HashSet. Sets let you compute the intersection, the union, and the size of each.

Konstantin V. Salikhov
  • Hi Konstantin, thanks a lot. I'm still quite confused about how to use the Set class. Is there an example implementation you know of that would give me a clearer picture? Thanks again! – Kimberly Mar 05 '13 at 16:38
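
As a minimal sketch of the Set-based approach suggested in the answer (not code from the original thread), the intersection and union can be built with `retainAll` and `addAll` on copies of the two sets. The class name `JaccardExample` and the method `jaccard` are hypothetical names chosen for illustration, and the sketch assumes each transaction's books have already been read from the database into a `Set<String>`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, for illustration only.
class JaccardExample {

    // Jaccard similarity = |A intersect B| / |A union B|
    static <T> double jaccard(Set<T> a, Set<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0; // assumption: treat two empty sets as dissimilar to avoid 0/0
        }
        // Copy before modifying so the caller's sets are left untouched.
        Set<T> intersection = new HashSet<>(a);
        intersection.retainAll(b); // keeps only elements also present in b

        Set<T> union = new HashSet<>(a);
        union.addAll(b); // duplicates are ignored by the set

        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // t1 and t2 from the question
        Set<String> t1 = new HashSet<>(Arrays.asList("Java", "Ruby", "C"));
        Set<String> t2 = new HashSet<>(Arrays.asList("Java", "C#", "Python"));
        System.out.println(jaccard(t1, t2));
    }
}
```

For t1 and t2 from the question, the only shared item is Java and there are five distinct items in total, so this prints 0.2. Both set operations modify the set they are called on, which is why each one works on a copy.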