0

Trying to extract data from multiple SQL tables. I have a main table and a couple of sub-tables. I want to get all the rows from the main table but this creates a duplicate record.

Edit:

Lets say , I want to display all the topics along with its status(ch_status). If two topics marked as inprogress and complete then complete will show. Here is the flow , I have Topics table which contains several topics. Now there is also a table named used_topics_chapters which will contain topic along with chapters. When lets say user start chapter , a row will be inserted into complete table which will contain progress status along with user id.

let say there are 5 rows as follows .

1  Topic1  InProgress 1
2  Topic1  Complete   0
3  Topic2  AnotherStatus 1
4 Topic3   NoStatus   1
5  Topic2  complete  0

Then its should display as follows,

1  Topic1  Complete   0
2  Topic2  Complete   0
3 Topic3   NoStatus   1

Here is a example fiddle with sample data and what I have tried so far:

Create table topics
(
  id int,
  name varchar(10)
 
)

insert into topics values(1,'Topic 1'),(2,'Topic 2'),(3,'Topic 3')

--
create table used_topics_chapters(id int,topic_id int,name varchar(10))

insert into used_topics_chapters values(1,1,'Chap1'),(2,3,'Chap 2'),(3,3,'Chap3')

-- 
create table completed
(
  id int,
  chp_id int,
  ch_status varchar(20),
  user_id int
)

insert into completed values(1,3,'complete',100),(2,2,'In-Progress',101)

select t.id,t.name,c.ch_status,
case c.ch_status when 'complete' then 0 else 1 end as can_modify
from topics t
left join used_topics_chapters as utc on t.id=utc.topic_id
left join completed as c on c.chp_id=utc.id

http://sqlfiddle.com/#!18/da7d17/2

On the output You could see that Topic 3 is showing 2 times because status is different .

What I want is , if any of the status is complete then only show that.

Ouput what I am getting ,

+----+---------+-------------+------------+
| id | name    | ch_status   | can_modify |
+----+---------+-------------+------------+
| 1  | Topic 1 | (null)      | 1          |
+----+---------+-------------+------------+
| 2  | Topic 2 | (null)      | 1          |
+----+---------+-------------+------------+
| 3  | Topic 3 | In-Progress | 1          |
+----+---------+-------------+------------+
| 3  | Topic 3 | complete    | 0          |
+----+---------+-------------+------------+

Output what I am expecting ,

+----+---------+-----------+------------+
| id | name    | ch_status | can_modify |
+----+---------+-----------+------------+
| 1  | Topic 1 | (null)    | 1          |
+----+---------+-----------+------------+
| 2  | Topic 2 | (null)    | 1          |
+----+---------+-----------+------------+
| 3  | Topic 3 | complete  | 0          |
+----+---------+-----------+------------+
Jerome Taylor
  • 281
  • 2
  • 12
  • 1
    What if it has 2 statuses that *aren't* complete? – Thom A Feb 23 '23 at 09:39
  • If null / Inprogress then can_modify=1 – Jerome Taylor Feb 23 '23 at 09:40
  • 1
    That doesn't answer my question. If there are 2 rows with a status that isn't complete are you expecting both of those rows? If there are 3 are you expecting all 3, just 2? None? What is there are 2+ rows with complete? – Thom A Feb 23 '23 at 09:41
  • That doesn't answer my question. If there are 2 rows with a status that isn't complete are you expecting both of those rows? Yes with can_modify 1 – Jerome Taylor Feb 23 '23 at 09:44
  • 1
    So what is the logic here? You need to elaborate. YOu've told us what you want for your one example, but we don't have enough scenarios to guess what the rest are. I *initially* thought this might be a [Get top 1 row of each group](https://stackoverflow.com/questions/6841605/get-top-1-row-of-each-group), but as you expect 2+ rows for some (unknown) scenarios, that isn't the case. Perhaps, however, it is a "top 1 in each group" but you should be using `RANK` rather than `ROW_NUMBER`? – Thom A Feb 23 '23 at 09:45
  • I agree with Larnu that information is missing. Your data model suggests that one topic can have many chapters and each chapter can have many rows in the completed table. This indicates that not a topic but a user's work on a chapter gets completed. What to do, if we find a topic with five chapters and a total of twenty completed rows of which seven have the status 'complete' for instance? What rows with what content are to be shown then? Explain precisely what rule you want to apply to get the result you are after. – Thorsten Kettner Feb 23 '23 at 10:33
  • " Your data model suggests that one topic can have many chapters and each chapter can have many rows in the completed table." – Jerome Taylor Feb 23 '23 at 10:38
  • "What to do, if we find a topic with five chapters and a total of twenty completed rows of which seven have the status 'complete' for instance? What rows with what content are to be shown then?" If there is any complete then its complete , so each row will show as completed. Now as my example 2 chapters have started but one user has completed and one not so it should mark as complete. Thanks Sorry for my bad english. – Jerome Taylor Feb 23 '23 at 10:46
  • Okay. If we find at least one 'complete' row for a topic, we show one row for the topic with ch_status = 'complete' and can_modify = 0. If we don't find any 'complete' row for a topic, but twenty rows in the completed table with other statuses, what to show then? Only one row for the topic with ch_status = null and can_modify = 1? – Thorsten Kettner Feb 23 '23 at 10:57
  • Yes if ch_status='complete' then only 0 else in any other condition , 0. Thanks – Jerome Taylor Feb 23 '23 at 11:01
  • Please use DISTINCT. – Bogdan Sahlean Feb 23 '23 at 11:18
  • SELECT * FROM (SELECT ... ROW_NUMBER() OVER(...) RRNN) x WHERE RRNN = 1. If not. – Bogdan Sahlean Feb 23 '23 at 11:19
  • @JeromeTaylor you aren't answering the question still. We aren't asking about your column we are asking about the **rows**. – Thom A Feb 23 '23 at 12:51

2 Answers2

0

There are no duplicates in the question's results. It seems the actual problem is how to get the "latest" row per topic. Rows have no implicit order or rank, so the query itself will have to specify one.

One way is to use ranking functions like ROW_NUMBER() to order result by descending status order and select the first row per topic. This means that the status values need to have some logical order. Either the values themselves need to be in logical order, or the table where they're stored in needs to have an order imposed through eg the ID or an ordering column.

In this case Complete comes before In-Progress' lexically so we can use an ascending ORDER BY` to get the "last" logical status. In a more complex example though, a proper table and order would be needed.

with x as (
    select t.id,t.name,c.ch_status,
           IIF(c.ch_status ='complete',0 ,1) as can_modify,
           ROW_NUMBER(PARTITION BY t.ID ORDER BY status) as RN
    from topics t
        left join used_topics_chapters as utc on t.id=utc.topic_id
        left join completed as c on c.chp_id=utc.id
)
select ID,Name,ch_status,can_modify
FROM x
WHERE RN=1

A CTE is needed because ranking functions can't be used in the WHERE clause.

A Table value constructor could be used to create a "table" of statuses right inside the query :

with x as (
    select t.id,t.name,c.ch_status,
           s.can_modify,
           ROW_NUMBER(PARTITION BY t.ID ORDER BY s.Priority DESC) as RN
    from topics t
        left join used_topics_chapters as utc on t.id=utc.topic_id
        left join completed as c on c.chp_id=utc.id
        left join (VALUES (1,'In-Progress',1),
                          (2,'complete',0)
                  ) as s(Priority,status,can_modify)
            ON ch_status=s.status

)
select ID,Name,ch_status,can_modify
FROM x
WHERE RN=1

This allows specifying more than one status values and extra attributes per status

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
0

You want to show one result row per topic, so select from the topics table. You want to know whether a 'complete' row exists for the topic, so use EXISTS. As both result columns can_modify and ch_status contain the same information, namely whether a 'complete' row exists or not, I am using a subquery here in order not to have to write the same condition twice:

select
  id, name, can_modify,
  case when can_modify = 0 then 'Complete' end as ch_status
from
(
  select
    t.id, t.name,
    case when exists
    (
      select null
      from used_topics_chapters as utc
      join completed as c on c.chp_id = utc.id
      where utc.topic_id = t.id
      and c.ch_status = 'complete'
    ) then 0 else 1 end as can_modify
  from topics t
) evaluated
order by id;
Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
  • Could You please explain why You are doing case when exists () .. What is the purpose? Sorry I could not understand thats why asking – Jerome Taylor Feb 23 '23 at 13:55
  • I am creating the can_modify column. `CASE WHEN .. THEN .. ELSE .. END` is if/then/else to evaluate a value in SQL. When a 'complete' row exists then can_modify is 0 else it is 1. – Thorsten Kettner Feb 23 '23 at 14:12