
Every user has a column that will show how many people viewed his profile.

I don't just need to show the total number of profile views; I also want to break the views down by time period.

I did it like this, but once I reach 50 million rows and above, I'll suffer with this query because it will be slow, and I may run into unexpected problems.

Views table

| User_Id | Viewed_Date |
|---------|-------------|
| 3       | ...         |
| 2       | ...         |
| 2       | ...         |
| 3       | ...         |
| 2       | ...         |
| 2       | ...         |
| 3       | ...         |
| 1       | ...         |

I tried another approach: I deleted the Views table and replaced it with a Views column in the Users table.

Users table

| Id | Name  | Views |
|----|-------|-------|
| 1  | User1 | { "Last 1 day": 0, "Last 7 days": 0, "Last 30 days": 0, "Last 365 days": 0, "All time": 0 } |
| 2  | User2 | { "Last 1 day": 25, "Last 7 days": 0, "Last 30 days": 0, "Last 365 days": 0, "All time": 25 } |
| 3  | User3 | { "Last 1 day": 31, "Last 7 days": 162, "Last 30 days": 0, "Last 365 days": 0, "All time": 193 } |

The JSON looks like this

{
  "Last 1 day": 0,
  "Last 7 days": 0,
  "Last 30 days": 0,
  "Last 365 days": 0,
  "All time": 0
}

I want to ask: what is the best way to update the views inside the JSON? Any ideas, please.
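For context, I imagine the update would look something like this sketch (using MySQL 5.7+ JSON functions; I'm not sure this is the right way to do it):

-- Increment two of the JSON counters for one user:
UPDATE Users
    SET Views = JSON_SET(Views,
            '$."Last 1 day"', JSON_EXTRACT(Views, '$."Last 1 day"') + 1,
            '$."All time"',   JSON_EXTRACT(Views, '$."All time"') + 1)
    WHERE Id = 2;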

(From the comments below; the table was formerly called VIEWS.)

CREATE TABLE VISITS (
    USER_ID int(11) NOT NULL, 
    VISITED_IN datetime NOT NULL
           DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
  • Generally speaking, holding multiple items of information in a single column (within a relational database) is a very bad idea - read up about normalisation, specifically the 3rd Normal Form, if you want to know why. Presumably you have some sort of query that provides the information necessary to populate this table? If you do, why can't this query be run when you need the information - why do you need to pre-calculate it and store it for every user? – NickW May 25 '22 at 12:43
  • @NickW My second method looks stupid; I prefer the first method. But I tried putting 17 million rows inside the Views table, randomly distributed among 25 users. User number 5 has 756,821 rows. I tried to get the view count for user number 5, but the query took approximately 6-8 seconds. After that, I put a new user (user number 26) inside the Users table and gave him only 3 views inside the Views table as a test. The result was the same: the query took approximately 6-8 seconds to get just 3 views for user number 26. Did I miss anything? – Taha Sami May 25 '22 at 13:11
  • Are you saying that querying the source for the views table is taking 6-8 seconds for a single user - or is this for querying the Views table? Can you provide the query/explain plan for the query you are running? If this query is against source tables can you provide the DDL for the table(s) being queried? – NickW May 25 '22 at 13:45
  • @NickW Yes, it is taking 6-8 seconds for a single user, even if he has only 3 rows. – Taha Sami May 25 '22 at 13:50
  • Hi - you haven't answered my question about what you were querying and you haven't provided any of the additional information I asked for. – NickW May 25 '22 at 18:02
  • @NickW Could you look at [my question](https://stackoverflow.com/questions/72389408/why-do-the-queries-take-the-same-time-to-get-data-even-though-the-number-of-rows) for more details. Specifically the suggested scenario. – Taha Sami May 26 '22 at 16:28
  • 6-8 seconds for user 26 -- sounds like the table needs an index. Please provide `SHOW CREATE TABLE`. I can probably help a lot with user 26 (but not much with user 25). – Rick James May 26 '22 at 17:06
  • @RickJames I changed the table name from views to visits -> CREATE TABLE `VISITS` ( `USER_ID` int(11) NOT NULL, `VISITED_IN` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci – Taha Sami May 26 '22 at 17:21
  • That needs `INDEX(user_id, visited_in)` -- I assume there could be 2 visits for one user in one second. – Rick James May 26 '22 at 17:54
  • @RickJames Could you explain more about what I should do? – Taha Sami May 26 '22 at 18:03
  • What to do... That was my answer. (A slight difference: the uniqueness of the DATETIME.) – Rick James May 26 '22 at 18:55
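Concretely, the index fix suggested in the comments would look something like this (a sketch against the VISITS table above; the index name is mine):

ALTER TABLE VISITS
    ADD INDEX idx_user_visited (USER_ID, VISITED_IN);

-- With that index, a per-user count reads only that user's index entries
-- instead of scanning the whole table:
SELECT COUNT(*)
    FROM VISITS
    WHERE USER_ID = 26;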

1 Answer


(You have been chastised in the Comments for the JSON approach; I won't repeat.) "50M rows" may not be an issue.

-- If you need to keep the exact time of every view, have this:
CREATE TABLE view_details (
    user_id MEDIUMINT UNSIGNED NOT NULL,  -- pick suitable INT size
    view_date DATETIME NOT NULL,
    PRIMARY KEY(user_id, view_date),
    INDEX(view_date, user_id)
) ENGINE=InnoDB;

-- This is a "Summary table" of views by day by user:
CREATE TABLE daily_views (
    user_id MEDIUMINT UNSIGNED NOT NULL,
    view_day DATE NOT NULL,
    ct SMALLINT UNSIGNED NOT NULL,  -- pick suitable INT size
    PRIMARY KEY(user_id, view_day),
    INDEX(view_day, user_id)
) ENGINE=InnoDB;

When you insert into view_details, also do an "IODKU" ("INSERT ... ON DUPLICATE KEY UPDATE", an "upsert"):

INSERT INTO daily_views (user_id, view_day, ct)
    VALUES ($user_id, CURDATE(), 1)
    ON DUPLICATE KEY UPDATE
        ct = ct + 1;  -- a bare column name here refers to the row already in the table
-- In MySQL 8.0.20+, VALUES() is deprecated in this clause; with a row alias
-- (... VALUES ($user_id, CURDATE(), 1) AS new ...) the increment is ct = ct + new.ct.

That gives you the daily counts for each user. The weekly/monthly/etc. counts are then sums over daily_views. E.g., for the week ending yesterday:

SELECT SUM(ct)
    FROM daily_views
    WHERE user_id = $user_id
      AND view_day >= CURDATE() - INTERVAL 7 DAY
      AND view_day  < CURDATE();

(There are many variations on that depending on which users you want to report on and what date range is desired.)
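For instance, the "break the views down by time period" part of the question becomes a GROUP BY over the same table. A sketch (the 30-day window and the LIMIT are arbitrary choices of mine):

-- Top 10 users by views over the last 30 days:
SELECT user_id, SUM(ct) AS views
    FROM daily_views
    WHERE view_day >= CURDATE() - INTERVAL 30 DAY
      AND view_day  < CURDATE()
    GROUP BY user_id
    ORDER BY views DESC
    LIMIT 10;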

Resist the temptation to have weekly, monthly, etc., summary tables. Usually having just one summary table is sufficient, and fast "enough", even for "all time". If you find it to be not fast enough, please provide some more statistics (typical views/day, number of users, total view count, etc.); we can discuss alternatives.

If you expect hundreds of views per second, we may need to tweak things some. Meanwhile, what I showed here should scale well for most applications.
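One such tweak (my assumption; the answer doesn't spell it out) is to buffer views and apply them to the summary table in batches, so each statement touches many users at once. Here VALUES(ct) is appropriate because every row carries its own count:

-- Apply a batch of buffered per-user view counts in one IODKU:
INSERT INTO daily_views (user_id, view_day, ct)
    VALUES (5, CURDATE(), 3),
           (26, CURDATE(), 1)
    ON DUPLICATE KEY UPDATE
        ct = ct + VALUES(ct);  -- ct = ct + new.ct with the 8.0.20+ row alias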

More discussion: Summary Tables

But, what about JSON?

What needs to happen every midnight? Every number in the JSON for every user needs to be recomputed. (Oh, except for "All time" -- that won't change just because the clock ticked.) Sure, there is a set of SELECT ... GROUP BYs that will compute all the data. Then you have to go to each user and update his row.
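To make that concrete, the nightly rebuild would look something like this (a sketch only; it assumes the VISITS table from the comments still exists, and uses MySQL 5.7+'s JSON_OBJECT):

-- Recompute every user's JSON buckets from scratch (run at midnight):
UPDATE Users AS u
JOIN (
    SELECT USER_ID,
           SUM(VISITED_IN >= CURDATE() - INTERVAL 1 DAY)   AS d1,
           SUM(VISITED_IN >= CURDATE() - INTERVAL 7 DAY)   AS d7,
           SUM(VISITED_IN >= CURDATE() - INTERVAL 30 DAY)  AS d30,
           SUM(VISITED_IN >= CURDATE() - INTERVAL 365 DAY) AS d365,
           COUNT(*)                                        AS all_time
        FROM VISITS
        GROUP BY USER_ID
) AS v ON v.USER_ID = u.Id
SET u.Views = JSON_OBJECT(
        'Last 1 day',    v.d1,
        'Last 7 days',   v.d7,
        'Last 30 days',  v.d30,
        'Last 365 days', v.d365,
        'All time',      v.all_time);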

  • Could you look at [my question](https://stackoverflow.com/questions/72389408/why-do-the-queries-take-the-same-time-to-get-data-even-though-the-number-of-rows) for more details. Specifically the suggested scenario. – Taha Sami May 26 '22 at 16:28
  • Sorry, I failed to point out how cumbersome the necessary recomputation of every JSON is every midnight. I added that. Sure, my answer is less cumbersome, but the effort is spread out, making it "better". – Rick James May 26 '22 at 17:03