0

I'm currently working on a cakephp 3.0 project to create an individual social network. I'm dealing with a problem for my "timeline". There I have a query which first gets a list of all my friends inside this community and then collecting all the user_activities they made. (see query below).

So at the end I have a list with all my friends activities ordered by created... In the FrontEnd I use a foreach to loop through it and have for every "activity.type" a seperate view template (just a text-post, an image-post or a video-post)

For a number of 50-100 Friends its ok... the timeline needs < 1sec. to display properly ... but the community has users with up to 5000 friends (I tested with ~400 friends which had a query runtime of 4-5 seconds).

Do somebody has an idea how I could improve the query or could I do something else to improve the perfomance (only from SQL or PHP side,... I already set up some indexes for certain columns and thought about a limitation by date, e.x. collecting only activities from - 1 year).

I am not very experienced with this topic, so I would appreciate any ideas :)

-------- EDIT

gather friendlist

SELECT (CASE WHEN `Friends`.`user_id` = '6' THEN `FriendUsers`.`id` ELSE `Friends`.`user_id` END) AS `id` FROM `friends` `Friends` LEFT JOIN `users` `FriendUsers` ON `FriendUsers`.`id` = (`Friends`.`friend_id`) WHERE (`status` >= 0 AND (`Friends`.`user_id` = 6 OR `Friends`.`friend_id` = 6))

the query for collecting the user-activities

SELECT  `Wallposts`.`text` AS `Wallposts__text`, `Activities`.`name` AS `Activities__name`,
        `Users`.`firstname` AS `Users__firstname`, `Users`.`lastname` AS `Users__lastname`,
        `Users`.`username` AS `Users__username`, `Users`.`id` AS `Users__id`,
        `UsersActivities`.`created` AS `UsersActivities__created`,
        `UsersActivities`.`id` AS `UsersActivities__id`, `UsersActivities`.`video_id` AS `UsersActivities__video_id`,
        `UsersActivities`.`picture_id` AS `UsersActivities__picture_id`,
        `UsersActivities`.`visible_for_group` AS `UsersActivities__visible_for_group`,
        `Pictures`.`id` AS `Pictures__id`, `Pictures`.`user_id` AS `Pictures__user_id`,
        `Pictures`.`description` AS `Pictures__description`, `Pictures`.`fileName` AS `Pictures__fileName`,
        `Pictures`.`album_id` AS `Pictures__album_id`, `Pictures`.`isChosen` AS `Pictures__isChosen`,
        `Albums`.`id` AS `Albums__id`, `Albums`.`name` AS `Albums__name`,
        `Videos`.`id` AS `Videos__id`, `Videos`.`hoster` AS `Videos__hoster`,
        `Videos`.`video_key` AS `Videos__video_key`, `Videos`.`name` AS `Videos__name`,
        `Videos`.`description` AS `Videos__description`
    FROM  `users_activities` `UsersActivities`
    left JOIN  `users` `Users`  ON UsersActivities.user_id = Users.id
    left JOIN  `activities` `Activities`  ON UsersActivities.activity_id = Activities.id
    left JOIN  `wallposts` `Wallposts`  ON UsersActivities.wallpost_id = Wallposts.id
    LEFT JOIN  `videos` `Videos`  ON `Videos`.`id` = (`UsersActivities`.`video_id`)
    LEFT JOIN  `pictures` `Pictures`  ON `Pictures`.`id` = (`UsersActivities`.`picture_id`)
    LEFT JOIN  `albums` `Albums`  ON `Albums`.`id` = (`Pictures`.`album_id`)
    WHERE  (`UsersActivities`.`member_id` = 0
              AND  `Users`.`active` not in ('2')
              AND  `UsersActivities`.`user_activity_id` = 0
              AND  (
                    (`UsersActivities`.`user_id` = 390900002
                              AND  `UsersActivities`.`activity_id` not in (5,6)
                    )
                      OR  (`UsersActivities`.`user_id` in 
                                (391407850,391511765,
                                        391511432,491511714,391512398,391204138,391407984,391000522,
                                        391408687,391511708,391305910,391511812,391511681,491512107,
                                        391408047,391408494, -- and hundreds more 
                                )
                              AND  `UsersActivities`.`visible_for_group` in (0,3)
                          )
                   )
           )
    ORDER BY  `UsersActivities`.`created` desc 

enter image description here

EXPLAIN JSON FORMAT

{
  "query_block": {
    "select_id": 1,
    "ordering_operation": {
      "using_temporary_table": true,
      "using_filesort": true,
      "nested_loop": [
        {
          "table": {
            "table_name": "UsersActivities",
            "access_type": "range",
            "possible_keys": [
              "user_id",
              "activity_id",
              "member_id",
              "user_activity_id",
              "visible_for_group_idx"
            ],
            "key": "user_id",
            "used_key_parts": [
              "user_id"
            ],
            "key_length": "8",
            "rows": 25623,
            "filtered": 100,
            "index_condition": "((`project`.`usersactivities`.`user_id` = 390900002) or (`project`.`usersactivities`.`user_id` in ( *[HUNDREDS OF FRIEND IDS]* )))",
            "attached_condition": "((`project`.`usersactivities`.`user_activity_id` = 0) and (`project`.`usersactivities`.`member_id` = 0) and (((`project`.`usersactivities`.`user_id` = 390900002) and (`project`.`usersactivities`.`activity_id` not in (5,6))) or ((`project`.`usersactivities`.`user_id` in ( *[HUNDRED OF FRIEND IDS]* )) and (`project`.`usersactivities`.`visible_for_group` in (0,3)))))"
          }
        },
        {
          "table": {
            "table_name": "Users",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "8",
            "ref": [
              "project.UsersActivities.user_id"
            ],
            "rows": 1,
            "filtered": 100,
            "attached_condition": "(`project`.`users`.`active` <> '2')"
          }
        },
        {
          "table": {
            "table_name": "Activities",
            "access_type": "ALL",
            "possible_keys": [
              "PRIMARY"
            ],
            "rows": 6,
            "filtered": 83.333,
            "using_join_buffer": "Block Nested Loop",
            "attached_condition": "<if>(is_not_null_compl(Activities), (`project`.`activities`.`id` = `project`.`usersactivities`.`activity_id`), true)"
          }
        },
        {
          "table": {
            "table_name": "Wallposts",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "project.UsersActivities.wallpost_id"
            ],
            "rows": 1,
            "filtered": 100
          }
        },
        {
          "table": {
            "table_name": "Videos",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "project.UsersActivities.video_id"
            ],
            "rows": 1,
            "filtered": 100
          }
        },
        {
          "table": {
            "table_name": "Pictures",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "project.UsersActivities.picture_id"
            ],
            "rows": 1,
            "filtered": 100
          }
        },
        {
          "table": {
            "table_name": "Albums",
            "access_type": "eq_ref",
            "possible_keys": [
              "PRIMARY"
            ],
            "key": "PRIMARY",
            "used_key_parts": [
              "id"
            ],
            "key_length": "4",
            "ref": [
              "project.Pictures.album_id"
            ],
            "rows": 1,
            "filtered": 100
          }
        }
      ]
    }
  }
}
T_Ix
  • 66
  • 8
  • What have you tried to enhance the performance? What does `EXPLAIN` print? – Nico Haase Feb 18 '20 at 14:35
  • i've deleted some unused tables / left joins to prevent unnecessary loading of information. Further I've setup some indexes in the different tables for columns with id references. I'm not familiar with the explain function, so could you tell me on which points I should focus on? i'm guessing the user-activities table play the main part with >25.000 rows where all other just have 1 up to 6 rows... – T_Ix Feb 18 '20 at 14:56
  • Well, you could start by providing all information, like the output of `EXPLAIN` for one of the slow-running queries – Nico Haase Feb 18 '20 at 15:25
  • edited the post. see the screenshot at the end – T_Ix Feb 18 '20 at 15:36
  • Please don't add text information in images. Additionally, you should share the proper query leading to that `EXPLAIN` output. The given query would throw a syntax error – Nico Haase Feb 18 '20 at 15:41
  • updated the query in my post. – T_Ix Feb 18 '20 at 15:49
  • ....and now, please add the according `EXPLAIN` output again..... – Nico Haase Feb 18 '20 at 15:49
  • sorry, here it is... – T_Ix Feb 18 '20 at 15:52
  • I'm amazed that it used the `user_id` index; I would expect the complexity of the `OR` to prevent such. Please provide `EXPLAIN FORMAT=JSON SELECT ...` and tell us what version of MySQL you are using. – Rick James Feb 18 '20 at 18:53
  • @RickJames here you go. MySQL Version 5.6.38 – T_Ix Feb 19 '20 at 09:22

1 Answers1

2

In the FrontEnd I use a foreach to loop

I stopped reading there. You should be able to make a single SQL query that locates all your friends, JOIN to their activities, and produces the entire list of activities, sorted by date.

Going back and forth to the databases is costly. Gathering, digesting, sorting, etc. all the desired data is usually faster when done in the database server and done in a single query.

For more help, we need to see both the "find friends" query and the "gather activities" query so we can help you put them together.

Some notes on the query you provided:

  • OR prevents the use of indexes, at least for that part.
  • NOT IN -- ditto.
  • Don't use LEFT unless you are expecting the 'right' table to be messy. It makes it harder for us to understand what is going on, and to figure out what optimizations are / are not available.
  • Will there be a LIMIT applied? It seems like this might produce a huge amount of data. Will there be pagination? Is CakePHP gathering a zillion rows, then parceling out a few? Bad news.

Suggested index:

UsersActivities:  (user_activity_id, member_id, created)

If you add that index, please provide the EXPLAIN so we can compare it to the current EXPLAIN.

(It looks like most JOINs are on id, which I assume is the PRIMARY KEY of their respective tables.)

Another possible improvement: Currently you have

SELECT ...
    `Activities`.`name` AS `Activities__name`,
    ...
  left JOIN  `activities` `Activities`
         ON UsersActivities.activity_id = Activities.id
    ...

and that seems to be the only reference to Activities. It seems that it is a 1:many mapping, so perhaps would be an improvement, both in speed and user-friendliness:

SELECT ...
    ( SELECT GROUP_CONCAT(DISTINCT `name`) FROM activities
        WHERE id = UsersActivities.activity_id
    )  AS `Activities__name`,
    ...
  -- And leave out left JOIN  `activities` 
    ...
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • thank you, I'll try your suggestions. I've updated my post with the query for getting all the friend ids. So I'll prevent using `OR` and `NOT IN`... and yes, there is a limit and page applied for the view. currently there is no paginator... for usability reasons implemented an ajax call which executes the query again, when the user hits the bottom of the page :/ – T_Ix Feb 19 '20 at 09:41
  • @T_Ix - "executes the query again" -- (1) the time taken to execute the query, (2) the time taken to send far more data than will actually be used. Which of those two performance issues should we focus on first? – Rick James Feb 19 '20 at 16:01
  • I quite dont understand why the conditions NOT IN and OR would prevent using indexes. I wrote a new query and reduced the WHERE conditions. Also I've setup new Indexes as you suggested and currently its working fine ... I made some tests with active friends (~5000 Friends) and the query needed just 100-200 ms for execution. Thanks for the hints. – T_Ix Feb 21 '20 at 10:53
  • @T_Ix - I don't have a simple explanation for `NOT IN` and `OR`, but here are a couple of things to read: https://stackoverflow.com/a/60344360/1766831 , http://mysql.rjweb.org/doc.php/index_cookbook_mysql – Rick James Feb 21 '20 at 18:50