2

In the log data (specific to automotive industry) I have two files

  • Inspection log: This data has Vehicle test data before sending to market with attributes part-name(vehicle part), Repairer Comments

  • Warranty data log: This data is claim data after selling it to customer with attributes part-name, dealer comments about defect

Now we need to find similarity between inspection logs and warranty logs so that we are able to say if a vehicle has specific problem in inspection stage there is fair chance it will have other problem from warranty.

Also, Part name inspection data may not be written same in warranty data like if i have frontwheelbreakdown it may be referred as FWB in short.

How do I go ahead with the problem?

Kartheek Palepu
  • 972
  • 8
  • 29
  • Possible duplicate of [Algorithm to detect similar documents in python script](http://stackoverflow.com/questions/101569/algorithm-to-detect-similar-documents-in-python-script) – Ic3fr0g Jun 23 '16 at 06:19
  • Look at cosine similarity. Also [this](http://stackoverflow.com/questions/101569/algorithm-to-detect-similar-documents-in-python-script) and [this](http://stackoverflow.com/questions/8897593/similarity-between-two-text-documents) – Ic3fr0g Jun 23 '16 at 06:20

0 Answers0