0

ANSWERED! CTE is the best method, using descriptive dummyvariable names

I have two relevant tables concept which describes all of the database wide concepts (concept_code) from different vocabularies (vocabulary_id), and relates them to a unique, standardized ID concept_id, and concept_relationship which only has three columns, concept_id_1, concept_id_2, and relationship_id. Relationship ID describes how ID one relates to ID 2. The database team attempts to standardize concepts into one ID as much as possible using the OMOP standards, so many IDs that are at source from one vocabulary get remapped to their corresponding parent standardized ID using the "Maps to" relationship. I have sets of codes which correspond to IDs that were defined by subject matter experts in specific vocabularies that don't necessarily include what the database team chose as their standardized concept. The end product I want is a table representing both the codes/IDs defined by the SMEs, and the corresponding IDs that they "map to".

########################################################################

The new code that seems to be behaving as intended uses a WITH statement:


WITH query1 AS (
  SELECT 
    * 
  FROM 
    concept 
  WHERE 
    concept_code IN ("398.4", "I45")
    AND (
      vocabulary_id IN ("ICD9","ICD10")
    )
)
SELECT *
FROM concept
WHERE concept_id IN (
    SELECT concept_id_2
    FROM concept_relationship cr
    INNER JOIN query1 q1 ON cr.concept_id_1 = q1.concept_id
    WHERE relationship_id = "Maps to"
)
UNION 
SELECT * FROM query1

########################################################################

I have tried the query below:

SELECT 
  foo.* 
FROM 
  (
    SELECT 
      * 
    FROM 
      concept 
    WHERE 
      concept_code IN ("398.4", "I45") 
      AND (
        vocabulary_id IN ("ICD9", "ICD10")
      )
  ) AS foo 
UNION 
SELECT 
  * 
FROM 
  concept 
WHERE 
  concept_id IN (
    SELECT 
      concept_id_2 
    FROM 
      concept_relationship cr 
      INNER JOIN foo ON cr.concept_id_1 = foo.concept_id 
    WHERE 
      relationship_id = "Maps to"
  )


But it returns an "INVALID ARGUMENT" error, I'm guessing because I can't pass the subquery foo above below the UNION to the other query. Is there a smoother way to do this? I've included some dummy tables below to attempt reproducibility, which seems to work database-side, but all columns are not included for conciseness.

concept :

| concept_id  | concept_code | vocabulary_id |
| ----------- | ------------ |---------------|
| 123         | 398.4        | ICD9          |
| 111         | I45          | ICD10         |
| 145         | 45155841     | SNOMED        |

concept_relationship:

| concept_id_1 | concept_id_2 | relationship_id|
| -----------  | ------------ |--------------- |
| 123          | 145          | Maps to        |
| 111          | 145          | Maps to        |
| 145          | 145          | Maps to        |
| 145          | 111          | Maps from      |
| 145          | 123          | Maps from      |
| 145          | 145          | Maps from      |
CREATE TABLE `concept` (
  `concept_id` VARCHAR NOT NULL, 
  `concept_code` VARCHAR NOT NULL, 
  `vocabulary_id` VARCHAR NOT NULL, 
  PRIMARY KEY (`concept_id`)
);

INSERT INTO concept (
  concept_id, concept_code, vocabulary_id
) 
VALUES 
  ("123", "388.4", "ICD9"), 
  ("111", "I45", "ICD10"), 
  ("145", "45155841 ", "SNOMED");


CREATE TABLE `concept` (
  `concept_id_1` VARCHAR NOT NULL, 
  `concept_id_2` VARCHAR NOT NULL, 
  `relationship_id` VARCHAR NOT NULL, 
  PRIMARY KEY (`concept_id_1`)
);

INSERT INTO concept_relationship (
  concept_id_1, concept_id_2, relationship_id
) 
VALUES 
  ("123", "145", "Maps to"), 
  ("111", "145", "Maps to"), 
  ("145", "145", "Maps to"), 
  ("145", "111", "Maps from"), 
  ("145", "123", "Maps from"), 
  ("145", "145", "Maps from");

  • CTE or you have to use the query as an inline view twice. CTE is cleaner I suppose the CTE could be a view instead and you could just call the view too... – xQbert Jul 06 '22 at 20:41

1 Answers1

0

The problem is you use foo on the other side of the union and it can't see it. but a CTE can help like this

WITH foo AS (
  SELECT * 
  FROM concept 
  WHERE concept_code IN ("398.4", "I45") AND
        vocabulary_id IN ("ICD9", "ICD10")
)

SELECT *
FROM foo

UNION 

SELECT * 
FROM concept 
WHERE concept_id IN (
    SELECT concept_id_2 
    FROM concept_relationship cr 
    JOIN foo ON cr.concept_id_1 = foo.concept_id 
    WHERE relationship_id = "Maps to"
)

note the 2nd query after the UNION can be re-written as

SELECT * 
FROM concept 
JOIN concept_relationship cr ON cr.concept_id_2 = concept.conecpt_id
                            AND cr.relationship_id = "Maps to"
JOIN foo on cr.concept_id_1 = foo.concept_id

using joins.

Hogan
  • 69,564
  • 10
  • 76
  • 117
  • Thanks! The CTE seems to be working as noted above, I stumbled into WITH and it seems to be working out. Does the syntax of the statement I made seem like it will produce any problems? It seems to run and give the intended result, but the layout is slightly different than yours. – Moss_and_Bones Jul 06 '22 at 20:48
  • @Moss_and_Bones -- it seems fine. I would not agree with the style of naming something "query1" as it is a terrible name. Imagine having to look at this 5 years from now and understand what it is doing. Query1 conveys no information. A better name helps to self document -- eg PriorityItems, NewUsers, actionItems, etc – Hogan Jul 06 '22 at 20:54
  • That's very fair, I'll rename it to a more descriptive name in the script itself to make sure it makes sense! Thank you for your help – Moss_and_Bones Jul 06 '22 at 21:33