I have a online shop application and a database of around 1000 ITEMS.
ITEM{
categories / up to 5 out of 60
types / up to 2 out of 10
styles / up to 2 out of 10
rating / 0-5
}
Now I wont to create a comparison item-to-item with predefined conditions:
- At least one common category += 25points
- At least one common type += 25p.
- If first item has no styles += 0p.
- If no styles in common -= 10p.
- For each point in rating difference -= 5p.
And store the result in a table. as item_to_item_similarity.score
.
Now I made the whole thing with a nice and shiny PHP functions and classes ..
And a function to calculate and update all the relations.
In the test withs 20 items .. all went well. But when increased the test data to 1000 items .. resulting in 1000x1000 relations The server started complaining about script_time_out .. and out of memory :)
Indexes, transaction and pre-loading some of the data .. helped me half the way.
Is there a smarter way to compare and evaluate this type of data?
I was thinking to represent the related categories, styles etc.
as a set of IDs, possibly in some binary mask .. so that they can be easily compared
(even in the SQL ?) with out the need to create classes, and loops trough arrays millions of times.