0

I have this dataset with 4 tables. I am trying to write the SQL query as following:

WITH test AS
(
    SELECT 
        (f.name), f.id, f.domain, s.link,
        (SELECT
             name,
             CASE
                WHEN name IN (1, 3, 8) THEN 1
                WHEN name IN (2, 6, 7) THEN 2
             END AS [group] 
         FROM tags 
         WHERE corporate_statement_link_id = s.id 
         FOR JSON PATH) AS tags
    FROM 
        fortune1000_companies f 
    LEFT JOIN 
        search_results s ON f.id = s.company_id
    LEFT JOIN 
        corporate_statements c ON s.id = c.corporate_statement_link_id 
    WHERE 
        c.corporate_statement = 1 
        AND s.domain LIKE CONCAT('%', f.domain, '%')
)
SELECT name, link, tags 
FROM test

but this produces the result where company names are duplicated because of the differences in link. For e.g., UnitedHeath Group (rows 4 & 5) is in two rows because the link is different. I want the result in such a way that the company name is shown just once, and tags are in the same group together. I don't need link to be shown; only included for this SO.

enter image description here

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
user1828605
  • 1,723
  • 1
  • 24
  • 63
  • one of your left joins are making it duplicate (most probably in search_results) , you need to provide sample data in each table and desired output, so we can help you – eshirvana Apr 04 '21 at 02:53
  • 1
    You’re right. It is making duplicate rows due to search_results. I tried to share sample data, but it was bit difficult to get the data especially since there’re some sensitive data to share. Is there a better way to share some data? How can I get a copy of the data from SQL server to share with the structure? – user1828605 Apr 05 '21 at 21:24
  • 1
    use https://dbfiddle.uk/ and prepare everything there and paste link into your question – eshirvana Apr 05 '21 at 21:36

1 Answers1

1

I think I figured it out.

This is what I did, and it gave me the answer I was looking for.

select name
    , (STUFF((SELECT t.[name] from tags t 
        inner join search_results s 
        on s.id = t.corporate_statement_link_id
        where f.id = s.company_id
        FOR JSON PATH),1,2,'[{')) as ts
from [fortune1000_companies] f
where f.id between 1 and 101

I got the help from here

user1828605
  • 1,723
  • 1
  • 24
  • 63