-1

I am using ContainsTable in my T-SQL statement to get most relevant content as per some given keywords in below query.

Select @Values = '"Keyword 1","Keyword 2,"Keyword 3"', 
Select @FirstLevelValues = 'Isabout ('+@Values+' Weight(0.7))'
Select @SecondLevelValues = 'Isabout ('+@Values+' Weight(0.3))'

;with cte as (
     (Select [Key], [Rank] from ContainsTable(PostsTable, Title, @FirstLevelValues))
     union
     (Select [Key], [Rank] from ContainsTable(PostsTable, Description, @SecondLevelValues))
) 
Select Top 100 P.* from PostsTable P
Left Join cte on cte.[Key] = P.ID where
P.PostDate between DATEADD(DAY, -5, GETUTCDATE()) and GETUTCDATE() 
Order by cte.[Rank] DESC, PostDate DESC

Now my problem is that I have [Rank] column, which rates a Post from PostsTable on the basis of most relevant (most keyword matches has higher ranking). But, this should work in conjunction with PostDate column. Means the most relevant and the most recent Post should be rated higher than only the most relevant post. I do understand the concept of Order by, where one resultset is ordered basis one column first followed by the second column and so on. But in my case, I want that the PostDate should also be considered a criteria, probably with a Weight as in IsAbout clause, so it can also decide the ranking of the post.

Cyberpks
  • 1,401
  • 6
  • 21
  • 51
  • What would be your criteria for weighting the `PostDate` column? Do you just want to order by that column first, then by `Rank`? – Charlieface Sep 05 '22 at 11:16
  • So what would your weighting be? Is a 5 day old post with a `Rank` of `100` higer rated than a 2 day old with a `Rank` of `80`? If so why? – Charlieface Sep 05 '22 at 11:20
  • @Charlieface, the issue here is, a 5 day old post might have the higher Rank as compared to a day old post. What I want is, like I have Weights for 'Title as 0.7' and for 'Description as 0.3', if we can add weightage for Dates too, so dates are also part of `Rank` calculation. Flipping the order, shows the most recent post which isn't really relevant. – Cyberpks Sep 05 '22 at 11:20
  • Like I said: what *do* you want? How do we know how to "weight" it? A value doesn't have a "weight" unless you are doing full-text search, so you need some kind of logic to decide that. Perhaps `ORDER BY LOG(DATEDIFF(minute, p.PostDate, GETUTCDATE()))` or similar? – Charlieface Sep 05 '22 at 11:23
  • What I want is, if we give 20% weightage to date too for `Rank` calculation (make Title 55%, Description 25%), a post with 90% relevance will be rated lower than post with 80% relevance but is more recent so it gets 20% score for the more recent date. – Cyberpks Sep 05 '22 at 11:25

1 Answers1

0

You can just flip the order of the ORDER BY columns to sort by PostDate first. But you need to change to a JOIN otherwise you will just get all Posts sorted by date.

SET @Values = '"Keyword 1","Keyword 2,"Keyword 3"';
SET @FirstLevelValues = 'Isabout ('+@Values+' Weight(0.7))';
SET @SecondLevelValues = 'Isabout ('+@Values+' Weight(0.3))';

with cte as
(
    Select [Key], [Rank]
    from ContainsTable(PostsTable, Title, @FirstLevelValues)
    union
    Select [Key], [Rank]
    from ContainsTable(PostsTable, Description, @SecondLevelValues)
) 
Select Top (100) P.*
from PostsTable P
Join cte on cte.[Key] = P.ID
where P.PostDate between DATEADD(DAY, -5, GETUTCDATE()) and GETUTCDATE() 
Order by
  p.PostDate DESC,
  cte.[Rank] DESC;

I note though that you may get duplicate results if a Key is returned in both CONTAINSTABLE functions. You then need to also to take into account the higher of the ranks.

Instead you need either a FULL JOIN:

with cte as
(
    Select
      [Key] = ISNULL(t.[Key], d.[Key]),
      Rank = CASE WHEN t.Rank < d.Rank THEN d.Rank ELSE t.Rank END  -- be careful of NULL
    from ContainsTable(PostsTable, Title, @FirstLevelValues) t
    full join ContainsTable(PostsTable, Description, @SecondLevelValues) d
      on d.[Key] = t.[Key]
) 
Select Top (100) P.*
from PostsTable P
Join cte on cte.[Key] = P.ID
where P.PostDate between DATEADD(DAY, -5, GETUTCDATE()) and GETUTCDATE() 
Order by
  p.PostDate DESC,
  cte.[Rank] DESC;

Or you can use two LEFT JOINs.

Select Top (100) P.*
from PostsTable P
Left Join ContainsTable(PostsTable, Title, @FirstLevelValues) t on t.[Key] = P.ID
left join ContainsTable(PostsTable, Description, @SecondLevelValues) d on d.[Key] = P.ID
where P.PostDate between DATEADD(DAY, -5, GETUTCDATE()) and GETUTCDATE()
  and (d.Rank is not null or t.Rank is not null)
Order by
  p.PostDate DESC,
  ISNULL(d.Rank, t.Rank) DESC;

The best option is probably to just use multiple columns in the CONTAINSTABLE, although this won't work if you want multiple weightings.

Select Top (100) P.*
from PostsTable P
Join ContainsTable(PostsTable, (Title, Description), @FirstLevelValues) ct on t.[Key] = P.ID
where P.PostDate between DATEADD(DAY, -5, GETUTCDATE()) and GETUTCDATE() 
Order by
  p.PostDate DESC,
  ct.Rank DESC;
Charlieface
  • 52,284
  • 6
  • 19
  • 43
  • Hello @charlieface can you please suggest the below ContainsTable(PostsTable, (Title, Description). the fields mentioned in brackets are column names? – Bhushan Sep 06 '22 at 03:15
  • Yes, that is the syntax `CONTAINSTABLE(table, { column_name | ( column_list ) | * } , '' [ , LANGUAGE language_term] [ , top_n_by_rank] )` see https://learn.microsoft.com/en-us/sql/relational-databases/system-functions/containstable-transact-sql?view=sql-server-ver16 – Charlieface Sep 06 '22 at 09:46