-1

SQL Vertica Question (but any SQL query answer is appreciated) I have a table "Base" currently in "day-url" level, and I try to get 'score' for each url in each day by joining a "day-url" level reference table "Score Update". The problem is the reference table is not in per day but specific dates level. I need to join the score from "Score Update" with "Update Date" right before the "Date" in "Base". Example below: (Please see pictures at bottom)

What I have: (Base)

Date          URL   Score                      
2019-08-01    A                        
2019-08-01    B                                      
2019-08-02    A                                 
2019-08-03    A                                  
2019-08-09    B         
2019-08-10    A         
2019-08-11    A          
2019-08-11    B      
2019-08-13    B

(Score Update)

Update_Date      URL    Score
 2019-07-25      A       x
 2019-07-25      B       y
 2019-08-10      A      alpha
 2019-08-10      B      beta

What I want is: What I have: (Base)

Date            URL   Score             
2019-08-01      A      x                    
2019-08-01      B      y                      
2019-08-02      A      x                           
2019-08-03      A      x                             
2019-08-09      B      y   
2019-08-10      A      alpha   
2019-08-11      A      alpha    
2019-0B-11      B      beta 
2019-08-13      B      beta

enter image description here

And both tables will keep updating for future date data. Can anyone help me how to join these 2 tables?

Thanks!

marcothesane
  • 6,192
  • 1
  • 11
  • 21
Virgil
  • 9
  • 3
  • I would left outer join `on base.url = score.url and base.date>= score.date`. I would use that as a subquery and group the results by base.url/base.date and select base.url,base.date, max(score.date). This will provide the pairing of base.url/base.date with the most recent score.url/score.date. From there, you join base on base.url/base.date and score on score.url/score.date to finally get the score.score you want. It seems like there might be a clever window function to do it, though. – avery_larry Oct 09 '19 at 15:57
  • Please [use text, not images/links, for text--including tables & ERDs.](https://meta.stackoverflow.com/q/285551/3404097) Paraphrase or quote from other text. Give just what you need & relate it to your problem. Use images only for what cannot be expressed as text or to augment text. Images cannot be searched for or cut & pasted. Include a legend/key & explanation with an image. Make your post self-contained. – philipxy Oct 10 '19 at 03:02
  • Please in code questions give a [mre]--cut & paste & runnable code; example input (as initialization code) with desired & actual output (including verbatim error messages); tags & versions; clear specification & explanation. That includes the least code you can give that is code that you show is OK extended by code that you show is not OK. (Debugging fundamental.) For SQL that includes DBMS/product & DDL, which includes constraints & indexes & tabular formatted initialization. Show any relevant parts you can do. – philipxy Oct 10 '19 at 03:03
  • This is not clear. Use enough words, sentences & references to parts of examples to clearly & fully say what you mean. To describe a result: Say enough that someone could go away & come back with a solution. When giving a business relation(ship)/association or table (base or query result), say what a row in it states about the business situation in terms of its column values. – philipxy Oct 10 '19 at 03:05

1 Answers1

0

Do you mean this?

WITH
-- your input ...
base(dt,url,score) AS (
          SELECT DATE '2019-08-01','A',NULL
UNION ALL SELECT DATE '2019-08-01','B',NULL
UNION ALL SELECT DATE '2019-08-02','A',NULL
UNION ALL SELECT DATE '2019-08-03','A',NULL
UNION ALL SELECT DATE '2019-08-04','B',NULL
UNION ALL SELECT DATE '2019-08-05','A',NULL
UNION ALL SELECT DATE '2019-08-06','A',NULL
UNION ALL SELECT DATE '2019-08-10','A',NULL
UNION ALL SELECT DATE '2019-08-11','A',NULL
UNION ALL SELECT DATE '2019-08-11','B',NULL
UNION ALL SELECT DATE '2019-08-12','A',NULL
UNION ALL SELECT DATE '2019-08-13','A',NULL
)
,
-- your input ...
score_upd(upd_dt,url,score) AS (
          SELECT DATE '2019-07-25','A','x'
UNION ALL SELECT DATE '2019-07-25','B','alpha'
UNION ALL SELECT DATE '2019-08-10','A','y'
UNION ALL SELECT DATE '2019-08-10','B','beta'
)
-- real select starts here ..                                     
SELECT
  b.dt
, b.url
, u.score
FROM base b
LEFT JOIN score_upd u
  ON b.url=u.url
 AND b.dt INTERPOLATE PREVIOUS VALUE u.upd_dt
ORDER BY dt,url
;
-- out      dt     | url | score 
-- out ------------+-----+-------
-- out  2019-08-01 | A   | x
-- out  2019-08-01 | B   | alpha
-- out  2019-08-02 | A   | x
-- out  2019-08-03 | A   | x
-- out  2019-08-04 | B   | alpha
-- out  2019-08-05 | A   | x
-- out  2019-08-06 | A   | x
-- out  2019-08-10 | A   | y
-- out  2019-08-11 | A   | y
-- out  2019-08-11 | B   | beta
-- out  2019-08-12 | A   | y
-- out  2019-08-13 | A   | y
-- out (12 rows)
marcothesane
  • 6,192
  • 1
  • 11
  • 21
  • Only Vertica has the INTERPOLATE PREVIOUS VALUE predicate on an outer join. It was built for that. – marcothesane Oct 10 '19 at 21:58
  • Sure, took some time to see the "check" for answer. I have one more question, what if in the Base table, I only have the consecutive dates from '7-25-2019' till Current_date, and I need join them based on same logic (means I wanna get a daily url score table based on most recent previous date with score) Details see here: https://stackoverflow.com/questions/58379960/sql-vertica-how-to-generate-daily-rows-with-most-previous-date-data – Virgil Oct 14 '19 at 20:43