0

I am new to SQL and the extraction of data from databases, so please bear with me. I only have experience with coding in statistical programs, including Stata, SAS, and R. I currently have a SELECT clause that extracts a table from an Oracle database. To simplify the question, I make use of a illustrative example:

I am interested in CREATING a new variable, which is not included in the database and must be defined based on the other variables, that contains the weight of their mother. Since I am new to SQL, I do not know if this is possible to do in the SELECT clause or if there exist more efficient options

Note that,

  1. Mother and Mother_number are referring to the "same numbers", meaning that mothers and daughters are represented in the model.

  2. AA (number 1) and CC (number 3) have the same mother (BB) (number 2)

  3. I need to do some conversion of the date, e.g. to_char(a.from_date, 'dd-mm-yyyy') as fromdat since SQL confuses the year with the day-of-the month

The SQL code:

 select to_char(a.from_date, 'dd-mm-yyyy') as fromdate, a.Name, a.Weight, a.Number, a.Mother_number 
    from table1 a,  table2 b
    where 1=1
    and  a.family_ref=b.family_ref 
    and  .. (other conditions)

What I currently obtain:

| fromdate   | Name | Weight | Number | Mother_number |
|------------|------|--------|--------|---------------|
| 06-07-2021 | AA   | 100    | 1      | 2             |
| 06-07-2021 | BB   | 200    | 2      | 3             |
| 06-07-2021 | CC   | 300    | 3      | 2             |
| 06-07-2021 | DD   | 400    | 4      | 5             |
| 06-07-2021 | EE   | 500    | 5      | 6             |
| ...        | ...  | ...    | ...    | ...           |

What I wish to obtain:

| fromdate   | Name | Weight | Number | Mother_number | Mother_weight |
|------------|------|--------|--------|---------------|---------------|
| 06-07-2021 | AA   | 100    | 1      | 2             | 200           |
| 06-07-2021 | BB   | 200    | 2      | 3             | 300           |
| 06-07-2021 | CC   | 300    | 3      | 2             | 200           |
| 06-07-2021 | DD   | 400    | 4      | 5             | 500           |
| 06-07-2021 | EE   | 500    | 5      | 6             | …             |
|            | …    | …      | …      | …             | …             
MT0
  • 143,790
  • 11
  • 59
  • 117
Cecilie S. K
  • 49
  • 2
  • 8
  • Your data doesn't have a `mother` column which makes the explanation not fit the data. – Gordon Linoff Jul 06 '21 at 11:18
  • So you need to do a self join to find the weight for the id value listed in the mother_number variable. – Tom Jul 06 '21 at 12:43
  • As t the wording: You want to select an additional *column* showing the mother's weight. As to the syntax: You are using a join syntax that got out of fashion in the early 1990s (Oracle was some yers late to adopt it, though). Please use proper explicit joins: `from table1 a inner join table2 b on b.family_ref = a.family_ref`. As to table2: You are not selecting from it, so why the join? – Thorsten Kettner Jul 06 '21 at 12:57
  • As to "SQL confuses the year with the day-of-the month": No, it doesn't. If `select from_date` results in a format you don't like, then this is probably due to an undesired setting. See https://stackoverflow.com/questions/8134493/how-can-i-set-a-custom-date-time-format-in-oracle-sql-developer – Thorsten Kettner Jul 06 '21 at 13:08

3 Answers3

1

Assuming the MOTHER_NUMBER value is referencing the same value as the NUMBER variable just join the table with itself.

select a.fromdate 
     , a.name 
     , a.weight 
     , a.number 
     , a.mother_number
     , b.weight as mother_weight
from HAVE a
left join HAVE b
on a.mother_number = b.number
Tom
  • 47,574
  • 2
  • 16
  • 29
  • I'd even alias the tables `daughter` and `mother` to enhance the query's readability. `a` and `b` are poor choices for table aliases (unless the table names are, say, **a**rticle and **b**ranch). – Thorsten Kettner Jul 06 '21 at 13:00
0

Although I'm not sure I'm following the "mother" logic, the way you need to implement the last column in your SELECT statement is to add b.weight as Mother_Weight in the end of the first line, before the for keyword.

Since the b table references "Mothers", you can add the column just by taking the weight of the person in table b.


If instead you wish to add the data of a person's mother's weight, you can do that by adding a column to the relevant table and then updating each row in your table by executing the statements below:

ALTER TABLE table1 ADD Mother_weight FLOAT;
UPDATE table1 SET Mother_weight=(SELECT (Weight) FROM table2 WHERE table1.family_ref=table2.familyref);

Then you add the a.Mother_weight clause in your SELECT statement.

  • I am sorry, but I do not think I understand your answer. I want to create a variable, Mother_weight, which is not contained in any table, but must be defined based on the information about the number of the mother (Mother_number) and the number each person has. – Cecilie S. K Jul 06 '21 at 10:26
  • So, is the weight of the mother a new piece of information to be stored in the table, or is it information based on the relations between each person? – Themistoklis Gkasios Jul 06 '21 at 10:31
  • Hi. It is a new variable that should be generated using the information 1) what is the number of the mother, X, and 2) what weight does X have. I want this information to be stored as a column/variable, "Mother_weight", in the table, yes. I hope, I understood your question correctly. If not please do not hesitate to comment again. I appreciate your help. :-) – Cecilie S. K Jul 06 '21 at 10:43
  • Thing is, in SQL you don't usually work with "variables", unless it is singular data, because from the nature of your requirement, you imply that each person that has a mother must also have a `Mother_weight` related to them. When you work with many rows, you either store data by adding it to the table as a row, or, if it is stored relatively somewhere else, you access it via a `SELECT` statement, as you can see in the solution above. – Themistoklis Gkasios Jul 06 '21 at 11:08
0

Use a hierarchical query:

SELECT to_char(a.fromdate, 'dd-mm-yyyy') as fromdate,
       a.Name,
       a.Weight,
       a."NUMBER",
       a.Mother_number,
       PRIOR weight AS mother_weight
FROM   table1 a
       INNER JOIN table2 b
       ON (a.family_ref=b.family_ref)
WHERE  LEVEL = 2
OR     (   LEVEL = 1
       AND NOT EXISTS(
             SELECT 1
             FROM   table1 x
             WHERE a.mother_number = x."NUMBER"
           )
       )
CONNECT BY NOCYCLE
       PRIOR "NUMBER" = mother_number
AND    PRIOR a.family_ref = a.family_ref
ORDER BY a."NUMBER"

Or, a sub-query factoring clause and a self-join:

WITH data (fromdate, name, weight, "NUMBER", mother_number) AS (
  SELECT to_char(a.fromdate, 'dd-mm-yyyy'),
         a.Name,
         a.Weight,
         a."NUMBER",
         a.Mother_number
  FROM   table1 a
         INNER JOIN table2 b
         ON (a.family_ref=b.family_ref)
)
SELECT d.*,
       m.weight AS mother_weight
FROM   data d
       LEFT OUTER JOIN data m
       ON (d.mother_number = m."NUMBER")
ORDER BY d."NUMBER"

Which, for the sample data:

CREATE TABLE table1 (family_ref, fromdate, Name, Weight, "NUMBER", Mother_number) AS
SELECT 1, DATE '2021-07-06', 'AA', 100, 1, 2 FROM DUAL UNION ALL
SELECT 1, DATE '2021-07-06', 'BB', 200, 2, 3 FROM DUAL UNION ALL
SELECT 1, DATE '2021-07-06', 'CC', 300, 3, 2 FROM DUAL UNION ALL
SELECT 1, DATE '2021-07-06', 'DD', 400, 4, 5 FROM DUAL UNION ALL
SELECT 1, DATE '2021-07-06', 'EE', 500, 5, 6 FROM DUAL;

CREATE TABLE table2 (family_ref) AS
SELECT 1 FROM DUAL;

Both output:

FROMDATE NAME WEIGHT NUMBER MOTHER_NUMBER MOTHER_WEIGHT
06-07-2021 AA 100 1 2 200
06-07-2021 BB 200 2 3 300
06-07-2021 CC 300 3 2 200
06-07-2021 DD 400 4 5 500
06-07-2021 EE 500 5 6

db<>fiddle here

MT0
  • 143,790
  • 11
  • 59
  • 117