I'm working on a sort of video archive, wfsu.org/dimensions and am trying to come up with a good query/algorithm combination for related videos using mysql and php. The database has title, keywords, description, category and another normalized 1:m table for producers. I have a simple algorithm but if anyone with any wit looks at it they'll see that it produces a very poor set of 'related' videos. Any ideas or help would be appreciated!!
Per request here's the simplistic algorithm that I'm using:
//if the segment isnt a generic dimensions use this query that makes sure they're in the same category
if($segType != 2)
{
$query = "SELECT `title`, `description`, `air_date`, `keywords`, `post_id`, `img_filename`
FROM `archive_post`
WHERE `segment_type` = $segType
AND `post_id` != $id
AND NOW() > ADDTIME(`air_date`, '20:0:0')
ORDER BY `air_date` DESC LIMIT 5";
}
else //otherwise we want a query that checks to see if there are any similar keywords
{
$query = "SELECT `title`, `description`, `air_date`, `keywords`, `post_id`, `img_filename`
FROM `archive_post`
WHERE (";
$kwArray = preg_split("/[\s,-]+/", mysql_real_escape_string($keywords));
foreach($kwArray as $kw)
{
$query .= "`keywords` LIKE '%$kw%' OR";
}
$query = substr($query, 0, -3);
$query .= ")
AND `post_id` != $id
AND NOW() > ADDTIME(`air_date`, '20:0:0')
ORDER BY `air_date` DESC LIMIT 5";
}
$result = $dbConnection->runQuery($query);
if(mysql_num_rows($result) == 0) //if we can't find any 'related' videos what do?
{
}
else
{
while($row = mysql_fetch_array($result))
{
$moreTitle = $row['title'];
$moreID = $row['post_id'];
$moreDescription = cleanDescription($row['description']);
$moreDescription = substr($moreDescription, 0, 50).'...';
$moreDate = strtotime( $row['air_date'], time() );
$moreDate = date( "F j, Y" , $moreDate );
$relatedVideos .= "<li> <a href='viewvideo.php?num=$moreID'></a><h3>$moreTitle</h3>
<div class='featuredStory'><span class='featuredDate'>$moreDate</span> ⋅ $moreDescription</div></li>";
}
}