1

I have two tables like these:

  1. files table
id customer bucket path size createdAt
4043973 40 foo account-40/upload/26f36a10-4aeb-11ed-8a5a-49f4b61eae34_result.png 2238 2022-10-13 11:35:35
4043972 40 foo account-40/upload/26e6bfe0-4aeb-11ed-8e31-571f8b74e4db_result.png 1985 2022-10-13 11:35:35
4043971 40 foo account-40/upload/26dfe210-4aeb-11ed-8e31-571f8b74e4db_result.png 1985 2022-10-13 11:35:35
  1. summary table
id fileId month year downloads
1 585086 9 2022 1
2 584821 9 2022 1
3 584872 9 2022 1

There are action logs of the files processed in the system. These logs remain uncategorized. Each uploaded file is added to the files table.

Downloads made on the files should be saved to the summary table every day. If the file is processed for the first time in that month, a new row should be added. Operations after the first should increase the downloads column by +1.

I can do this with 3 queries. Like this:

Checking whether the file exists:

SELECT 
    id 
FROM 
    summary
WHERE
    fileId=(SELECT id FROM cloud_storage_files WHERE path=':filePath') AND
    month=:month AND
    year=:year

Insert if exists:

INSERT INTO 
    summary (fileId, month, year, downloads)
SELECT id, :month, :year, 1 FROM files WHERE path=':filePath'

Update if not:

UPDATE
    summary,
    files
SET
    summary.downloads = summary.downloads + 1
WHERE
    files.path=':filePath' AND files.id=summary.fileId

But since many files are processed daily, there is a lot of data and I want to do all operations in a single sql query.

So far I tried to do it like this:

IF(SELECT fileId FROM summary 
    WHERE fileId = :fileId
    AND `month` = :month
    AND `year` = :year) 

THEN    
    INSERT INTO 
    summary (fileId, month, year, downloads) VALUES (:fileId, :month, :year, 1)
ELSE
    UPDATE
        summary,
        files
    SET
        summary.downloads = summary.downloads + 1
    WHERE
        files.path='foo-bar-bar' AND files.id=summary.fileId

But no dice yet. I would love to hear from anyone who has a solution or suggestion to the problem.

Thanks!

Murat Colyaran
  • 2,075
  • 2
  • 8
  • 27

1 Answers1

2

Create unique index by file ID and year-month:

CREATE INDEX idx_uni_fileId_month_year
ON summary (fileId, month, year);

then insert with duplicate auto-check:

INSERT INTO summary (fileId, month, year, downloads)
SELECT id, :month, :year, 1 
FROM files 
WHERE path = ':filePath' 
-- or WHERE path IN ( {filepaths list} )
ON DUPLICATE KEY 
UPDATE downloads = downloads + 1;

If a rows with the same (fileId, month, year) already exists then it will be updated and downloads amount will be incremented, otherwise new row will be inserted.

Of course, SELECT may return more than one row, and each row will be processed independently.

Akina
  • 39,301
  • 5
  • 14
  • 25