-2

Q. Create a new column DaysTakenForDelivery that contains the date difference between Order_Date and Ship_Date.

Tables available are: orders and shipping

CREATE TABLE orders (
    Order_ID int DEFAULT NULL,
    Order_Date text,
    Order_Priority text,
    Ord_id text
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

CREATE TABLE shipping (
    Order_ID int DEFAULT NULL,
    Ship_Mode text,
    Ship_Date text,
    Ship_id text,
    DaysTakenForDelivery` int DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Row counts:

  • shipping: 7701
  • orders: 5506

Please note I changed the datatyes for date columns properly.

Things I did:

I added the required column in the table 'shipping', since nothing was specified with respect to that (which table it should be added to or otherwise). Query for the same:

ALTER TABLE shipping ADD DaysTakenForDelivery INT;

Next, I tried to update the column using various queries but nothing worked. A few of them are listed below:

FAILED ATTEMPTS:

UPDATE shipping SET DaysTakenForDelivery = (
    select datediff(b.ship_date, a.order_date) AS DaysTakenForDelivery 
    from orders a
    JOIN shipping b ON a.Order_ID = b.Order_ID
);

NOTE: this query led to the following error:

Error Code: 1093. You can't specify target table 'shipping_dimen' for update in FROM clause

Next query I tried:

UPDATE shipping b SET DaysTakenForDelivery = (
    select datediff(b.ship_date, a.order_date) AS DaysTakenForDelivery 
    from orders a  
    WHERE a.Order_ID = b.Order_ID
);

NOTE: this query led to the following error:

Error Code: 1242. Subquery returns more than 1 row

How am I supposed to achieve the desired result?

Please note I am using MySQL and answers for the same RDBMS would be appreciated for better understanding.

Version I am using: 8.0.31

user1191247
  • 10,808
  • 2
  • 22
  • 32
RUBINA
  • 23
  • 5
  • Please ask 1 specific researched non-duplicate question. Please either ask re 1 bad query/function with obligatory [mre], including why you think it should return something else or are unsure at the 1st subexpression that it doesn't give what you expect or are stuck, justified by reference to authoritative documentation, or ask about your overall goal giving working parts you can do with justification & ideally a [mre]. Then misunderstood code doesn't belong. But please ask about unexpected behaviour 1st because misconceptions get in the way of your goal. [ask] [Help] Basic questions are faqs. – philipxy Mar 01 '23 at 08:32
  • Please look at the formatted version of your post before you publish. See edit help re block & inline formats for code & quotes, etc. – philipxy Mar 01 '23 at 08:34
  • 1
    _"Please note I changed the datatyes for date columns properly"_ - there is nothing proper about storing `DATE`s in `TEXT` fields! They should be in one of the [supported DATE/TIME formats](https://dev.mysql.com/doc/refman/8.0/en/datetime.html). You have not defined a PRIMARY KEY for either table and there is no index supporting the `Order_ID` foreign key on the `shipping` table. – user1191247 Mar 01 '23 at 20:37

1 Answers1

0

It would be better to not store the redundant data, as there is always the risk that it becomes inconsistent, and it is just unnecessary use of storage.

It is just a normal multi-table update:

UPDATE shipping s
JOIN orders o ON s.Order_ID = o.Order_ID
SET s.DaysTakenForDelivery = DATEDIFF(s.Ship_Date, o.Order_Date);

As you appear to be having performance issues while trying to update the table, you could try updating in batches:

UPDATE shipping s
JOIN (
    SELECT s.ship_id, DATEDIFF(s.ship_date, o.order_date) AS diff
    FROM orders o
    JOIN shipping s ON o.order_id = s.order_id
    WHERE s.DaysTakenForDelivery IS NULL
    ORDER BY o.order_id, s. ship_id
    LIMIT 1000 -- batch size
) sd ON s.ship_id = sd.ship_id
SET s.DaysTakenForDelivery = sd.diff;

If that still does not work you can try reducing the batch size further.


Now you have added the DDL for your tables, we can see where some of your issues are coming from. The lack of a keyed relationship between the two tables is an issue and the number of rows returned by the join tells us that the Order_ID is not unique in your orders table. You need to deal with the duplicate Order_IDs before you can move forward.

To find the duplicates you can use the following queries:

SELECT *
FROM (
    SELECT *, COUNT(*) OVER (PARTITION BY Order_ID) AS num
    FROM orders
) t
WHERE num > 1;

/* And assuming Ship_id is intended to be the PK (unique identifier) */
SELECT *
FROM (
    SELECT *, COUNT(*) OVER (PARTITION BY Ship_id) AS num
    FROM shipping
) t
WHERE num > 1;

After you have dealt with the duplicates, and making sure the two date fields contain valid date strings (yyyy-mm-dd), you can run something like the following to add the primary keys, the foreign key for Order_ID in shipping, and change the datatypes of the DATE columns:

ALTER TABLE orders
    MODIFY COLUMN Order_ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    MODIFY COLUMN Order_Date DATE NOT NULL;

ALTER TABLE shipping
    CHANGE COLUMN Ship_id Ship_ID INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST,
    MODIFY COLUMN Ship_Date DATE,
    MODIFY COLUMN Order_ID INT UNSIGNED NOT NULL,
    ADD FOREIGN KEY (Order_ID) REFERENCES orders (Order_ID);

You will still need to address the other datatype issues but without seeing sample data it is impossible to say what the other TEXT columns should be changed to.

You might want to do some reading about:

  1. Normalization
  2. Data Types
user1191247
  • 10,808
  • 2
  • 22
  • 32
  • I have tried this query also. But it generates the following error: Error Code: 2013. Lost connection to MySQL server during query. There are 7701 rows in the shipping table. – RUBINA Mar 01 '23 at 18:17
  • The values are from `COUNT(*)`. SELECT QUERY: returns 7725 rows in 0.016sec/0.015sec. `select datediff(b.ship_date, a.order_date) AS DaysTakenForDelivery from orders a JOIN shipping b ON a.order_id = b.order_id;` – RUBINA Mar 01 '23 at 19:59
  • Please add the first 5 `orders` and the associated `shipping` records as sample data to your question. Then we will deal with sorting out your table structure, which is lacking keys. – user1191247 Mar 01 '23 at 20:05