
See this older Question

I was wondering whether there are any new features (preferably native to Django) that can find DISTINCT entries in my Item model within a certain tolerance.

A simple example: I have these five item names:

  • Item1 Linen Shirt
  • Item2 Linen Shirt
  • ItemB Linen Shirt1
  • Item Linen Skirt
  • ItemC Linen Skirt2

I would like to do something like:

item_set = Item.objects.distinct_special('name', tolerance=95)

where the first argument is the field to search and the second is the tolerance as a percentage.

Community

1 Answer


You can do it in pure Python with difflib.

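import difflib

values = """Item1 Linen Shirt
Item2 Linen Shirt
ItemB Linen Shirt1
Item Linen Skirt
ItemC Linen Skirt2"""

# One item name per line
data = values.split('\n')

# Find the names in data that are most similar to the first name
print(difflib.get_close_matches(data[0], data))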

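Check the documentation for get_close_matches for additional parameters: cutoff sets the similarity threshold (a float between 0 and 1, default 0.6) and n limits the number of matches returned (default 3).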
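To get closer to the "distinct with a tolerance" behaviour asked for in the question, one possible approach is to compare each name against the names already kept and drop it when it is too similar. The following is only a rough sketch: distinct_with_tolerance is a hypothetical helper (it is not part of Django or difflib), and it assumes that the ratio from difflib.SequenceMatcher, scaled to a percentage, is an acceptable measure of similarity.

```python
from difflib import SequenceMatcher


def distinct_with_tolerance(names, tolerance=95):
    """Hypothetical helper: keep a name only if it is less than
    `tolerance` percent similar to every name kept so far."""
    cutoff = tolerance / 100.0  # SequenceMatcher.ratio() is in [0, 1]
    kept = []
    for name in names:
        # A name counts as a duplicate if it is close enough to any kept name
        is_duplicate = any(
            SequenceMatcher(None, name, seen).ratio() >= cutoff
            for seen in kept
        )
        if not is_duplicate:
            kept.append(name)
    return kept


names = [
    "Item1 Linen Shirt",
    "Item2 Linen Shirt",
    "ItemB Linen Shirt1",
    "Item Linen Skirt",
    "ItemC Linen Skirt2",
]

print(distinct_with_tolerance(names, tolerance=90))
```

The result depends on the order of the input, because the first name of each group of similar names is the one that is kept. Every name is compared against every kept name, so the cost grows quickly with the number of distinct names and may be too slow for very large tables.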

Matthias
  • Thank you, Matthias. I hope that fetching all the names and IDs of millions of records first will not hurt performance too much :P –  Jul 15 '13 at 19:17
  • Getting all the data from the database server to the client will have an impact on performance, but at least the task can be done. Good luck. – Matthias Jul 16 '13 at 06:55