-1

I have a Select Statement with a lot of data, over 23K, it's joined with over 5 tables one of which is causing the duplicates when I looked at the table I see that the join I am doing have 2 rows matching,

I tried doing a Group by but that didn't work, I also tried to Select DISTINCT but that also didn't work, how would I go about this?

SELECT DISTINCT FirstName, LastName, F.Interview  FROM tbData D
LEFT JOIN tbInterview F on D.UserID = F.UserID
 where CreatedDate is between '' and '' 

That returned Duplicates because the tbInterview has more than one Interviews linking the USerID, Then I tried This

SELECT DISTINCT FirstName, LastName, F.Interview  FROM tbData D
LEFT JOIN (Select UserID from tbInterview GROUP BY UserID) as InterviewID ON D.UserID = InterviewID.UserID
LEFT JOIN tbInterview F on InterviewID.UserID = F.UserID
 where CreatedDate is between '' and ''

This didn't work either.

Here is an example of data in the table tbInterview ╔═════════════╤════════╤═════════════╤═════════════════════════════╗ ║ InterViewID │ UserID │ DateCreated │ Interview ║ ╠═════════════╪════════╪═════════════╪═════════════════════════════╣ ║ 1 │ 120 │ 2015/05/10 │ Inter View Done ║ ╟─────────────┼────────┼─────────────┼─────────────────────────────╢ ║ 2 │ 120 │ 2015/05/15 │ 2nd Interview was requested ║ ╚═════════════╧════════╧═════════════╧═════════════════════════════╝

Now when I do a Select with the Join to the tbInterview it shows the output as follows: ╔═══════════╤══════════╤═════════════════════════════╗ ║ FirstName │ LastName │ Interview ║ ╠═══════════╪══════════╪═════════════════════════════╣ ║ James │ Smith │ Inter View Done ║ ╟───────────┼──────────┼─────────────────────────────╢ ║ James │ Smith │ 2nd Interview was requested ║ ╚═══════════╧══════════╧═════════════════════════════╝

Ahmad
  • 315
  • 4
  • 14
  • 2
    That's expected behavior. What row from Interview would you like to use to have only one row per user? – Luis Cazares Apr 22 '19 at 19:54
  • https://stackoverflow.com/help/how-to-ask – Eric Apr 22 '19 at 19:55
  • @LuisCazares I Understand and that's part of the issue I need over 20 Rows from the table with dups – Ahmad Apr 22 '19 at 19:56
  • 1
    Please include some sample data from both tables and what your desired results would be. Right now it sounsd like you need duplicates, but are complaining about duplicates. It's not terribly clear what you are actually after here. – JNevill Apr 22 '19 at 20:13
  • 1
    Does tbInterview have any sort of column that you can order the table by? In other words how would you order the data if you did SELECT * FROM tbInterview WHERE UserId = someUserId ORDER BY whatColumn? – BillRuhl Apr 22 '19 at 20:17
  • @BillRuhl I have a column there names DateCreated I could go by that, the latest date created – Ahmad Apr 22 '19 at 20:29
  • 1
    Hi @Ahmad! If you could provide a bit more info (namely an example of your current output, and your desired output) we'll be much more able to help you :). – Cowthulhu Apr 22 '19 at 20:34
  • I have Added the example of what I have in the tbInterview and what the Output looks like – Ahmad Apr 22 '19 at 20:53
  • 1
    We can't see the data so it's difficult to give you an exact solution, but it sounds like you should not be using DISTINCT. Instead partition the table by userid. Somthing like "ROW_NUMBER() OVER(PARTITION BY UserId ORDER BY DateCreated DESC) rowNo" then select where rowNo = 1 – BillRuhl Apr 22 '19 at 20:54
  • @BillRuhl I added the data example – Ahmad Apr 22 '19 at 20:56
  • Please in code questions give a [mcve]--cut & paste & runnable code & desired output & clear specification & explanation. That includes the least code you can give that is code that you show is OK extended by code that you show is not OK. (Debugging fundamental.) PS Yet again: *What data do you want with a left table row from the many right table rows that match it?* PS Clarify via edits, not comments. – philipxy Apr 22 '19 at 20:56
  • This is (going to be when clarified) a faq. Before considering posting please always google your error message or many clear, concise & precise phrasings of your question/problem/goal, with & without your particular strings/names, & read many answers. If you post a question, use one phrasing as title. See [ask] & the voting arrow mouseover texts. PS There is an entire tag for these questions: [tag:greatest-n-per-group]. PS I just googled 'latest date site:stackoverflow.com join' to a trillion answers. To duplicate questions, due to all the people who don't google for even a few relevant words. – philipxy Apr 22 '19 at 21:06
  • Possible duplicate of [SQL join: selecting the last records in a one-to-many relationship](https://stackoverflow.com/q/2111384/3404097) – philipxy Apr 22 '19 at 21:08

1 Answers1

0

Assuming that you are looking for the latest date created row in tbInterview, then you have several possible solutions. Here is one that works with the ROWNUMBER() function, based off of your first query:

This assumes that you actually only want information where the CreatedDate value(Assumed to be on the tbData table, since DateCreated is named as the equivalent on tbInterview) is within a range, as specified in your sample query.

SELECT 
    FirstName, 
    LastName, 
    F.Interview
FROM tbData D
LEFT JOIN (
    SELECT 
        UserID,
        CreatedDate,
        ROW_NUMBER() OVER (Partition By UserId ORDER BY DateCreated Desc) as RowNumber
    FROM tbInterview F
    WHERE DateCreated > '' --- Earliest date created for range on tbData
    ) as F
    on D.UserID = F.UserID
WHERE CreatedDate is between '' and '' 
    AND RowNumber = 1

This will associate the latest interview row (based on DateCreated) with the tbData user.

Laughing Vergil
  • 3,706
  • 1
  • 14
  • 28