I have the following input for my problem:

 ID  -> List of Words
(101 -> Array("a1","b2","c4","d2"))
(102 -> Array("a6","b1","c5","d3"))
(103 -> Array("a1","b4","c4","d2"))
(104 -> Array("a2","b2","c3","d2"))
(105 -> Array("a7","b6","c1","d3"))

Now I want to measure how similar these entries are to one another.

Example:

(101 -> Array("a1","b2","c4","d2"))
(103 -> Array("a1","b4","c4","d2"))
(104 -> Array("a2","b2","c3","d2"))

The entries in this example output are very similar to each other.
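
To make "similar" concrete, here is a minimal sketch of one possible measure. It assumes Jaccard similarity on the sets of words (shared words divided by distinct words), which is only my guess at a suitable metric, and it uses plain Scala collections rather than Spark. The names `JaccardSketch`, `jaccard`, and `rankedPairs` are made up for illustration.

```scala
object JaccardSketch {
  // Sample data copied from the input above: ID -> list of words.
  val docs: Map[Int, Array[String]] = Map(
    101 -> Array("a1", "b2", "c4", "d2"),
    102 -> Array("a6", "b1", "c5", "d3"),
    103 -> Array("a1", "b4", "c4", "d2"),
    104 -> Array("a2", "b2", "c3", "d2"),
    105 -> Array("a7", "b6", "c1", "d3")
  )

  // Jaccard similarity (an assumed metric): |A intersect B| / |A union B|.
  def jaccard(a: Array[String], b: Array[String]): Double = {
    val (sa, sb) = (a.toSet, b.toSet)
    val union = sa union sb
    if (union.isEmpty) 0.0
    else (sa intersect sb).size.toDouble / union.size
  }

  // Score every unordered pair of IDs, most similar pair first.
  def rankedPairs: Seq[((Int, Int), Double)] = {
    val ids = docs.keys.toSeq.sorted
    val pairs = for {
      i <- ids
      j <- ids if i < j
    } yield ((i, j), jaccard(docs(i), docs(j)))
    pairs.sortBy(-_._2)
  }

  def main(args: Array[String]): Unit =
    rankedPairs.foreach { case ((i, j), s) => println(f"$i%d - $j%d: $s%.2f") }
}
```

With this measure, 101 and 103 share three of five distinct words (0.6), while 101 and 102 share none (0.0). This compares every pair, which is fine for five rows but would not scale to a large dataset.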

How can I achieve this using Spark? I am open to either a rule-based approach or a machine learning algorithm.

Thanks

Charmy Garg
