4

I have a very bad performance in most of my queries. I've read a lot on stackoverflow, but still have some questions, maybe anyone could help or give me any hints?

Basically, i am working on a booking website, having among others the following tables:

objects

+----+---------+--------+---------+------------+-------------+----------+----------+-------------+------------+-------+-------------+------+-----------+----------+-----+-----+
| id | user_id | status | type_id | privacy_id | location_id | address1 | address2 | object_name | short_name | price | currency_id | size | no_people | min_stay | lat | lng |
+----+---------+--------+---------+------------+-------------+----------+----------+-------------+------------+-------+-------------+------+-----------+----------+-----+-----+

OR in MySQL:

CREATE TABLE IF NOT EXISTS `objects` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'object_id',
  `user_id` int(11) unsigned DEFAULT NULL,
  `status` tinyint(2) unsigned NOT NULL,
  `type_id` tinyint(3) unsigned DEFAULT NULL COMMENT 'type of object, from object_type id',
  `privacy_id` tinyint(11) unsigned NOT NULL COMMENT 'id from privacy',
  `location_id` int(11) unsigned DEFAULT NULL,
  `address1` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
  `address2` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
  `object_name` varchar(35) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'given name by user',
  `short_name` varchar(12) COLLATE utf8_unicode_ci DEFAULT NULL COMMENT 'short name, selected by user',
  `price` int(6) unsigned DEFAULT NULL,
  `currency_id` tinyint(3) unsigned DEFAULT NULL,
  `size` int(4) unsigned DEFAULT NULL COMMENT 'size rounded and in m2',
  `no_people` tinyint(3) unsigned DEFAULT NULL COMMENT 'number of people',
  `min_stay` tinyint(2) unsigned DEFAULT NULL COMMENT '0=no min stay;else # nights',
  `lat` varchar(32) COLLATE utf8_unicode_ci DEFAULT NULL,
  `lng` varchar(32) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1451046 ;


reservations

+----+------------+-----------+-----------+---------+--------+
| id | by_user_id | object_id | from_date | to_date | status |
+----+------------+-----------+-----------+---------+--------+

OR in MySQL:

CREATE TABLE IF NOT EXISTS `reservations` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `by_user_id` int(11) NOT NULL COMMENT 'user_id of guest',
  `object_id` int(11) NOT NULL COMMENT 'id of object',
  `from_date` date NOT NULL COMMENT 'start date of reservation',
  `to_date` date NOT NULL COMMENT 'end date of reservation',
  `status` int(1) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=890729 ;


There are a few questions:

1 - I have not set any additional key (except primary) - where should I set and which key should I set?

2 - I have read about MyISAM vs InnoDB, the conclusion for me was that MyISAM is faster when it comes to read-only, whereas InnoDB is designed for tables that get UPDATED or INSERTs more frequently. So, currently objects uses MyISAM and reservations InnoDB. Is this a good idea to mix? Is there a better choice?

3 - I need to query those objects that are available in a certain period (between from_date and end_date). I have read (among others) this post on stackoverflow: MySQL select rows where date not between date

However, when I use the suggested solution the query times out before returning any results (so it is really slow):

SELECT DISTINCT o.id FROM objects o LEFT JOIN reservations r ON(r.object_id=o.id) WHERE

COALESCE('2012-04-05' NOT BETWEEN r.from_date AND r.to_date, TRUE)
AND COALESCE('2012-04-08' NOT BETWEEN r.from_date AND r.to_date, TRUE)
AND o.location_id=201

LIMIT 20

What am I doing wrong? What is the best solution for doing such a query? How do other sites do it? Is my database structure not the best for this or is it only the query?

I would have some more questions, but I would be really grateful for getting any help on this! Thank you very much in advance for any hint or suggestion!

Community
  • 1
  • 1
Chris
  • 3,756
  • 7
  • 35
  • 54

3 Answers3

6

It appears you are looking for any "objects" that do NOT have a reservation conflict based on the from/to dates provided. Doing a coalesce() to always include those that are not ever found in reservations is an ok choice, however, being a left-join, I would try left joining where the IS a date found, and ignoring any objects FOUND. Something like

SELECT DISTINCT 
      o.id 
   FROM 
      objects o 
         LEFT JOIN reservations r 
            ON o.id = r.object_id
           AND (  r.from_date between '2012-04-05' and '2012-04-08'
               OR r.to_date between '2012-04-05' and '2012-04-08' )
   WHERE
          o.location_id = 201
      AND r.object_id IS NULL
  LIMIT 20

I would ensure an index on the reservations table by (object_id, from_date ) and another (object_id, to_date). By explicitly using the from_date between range, (and to date also), you are specifically looking FOR a reservation occupying this time period. If they ARE found, then don't allow, hence the WHERE clause looking for "r.object_id IS NULL" (ie: nothing is found in conflict within the date range you've provided)

Expanding from my previous answer, and by having two distinct indexes on (id, from date) and (id, to date), you MIGHT get better performance by joining on reservations for each index respectively and expecting NULL in BOTH reservation sets...

SELECT DISTINCT 
      o.id 
   FROM 
      objects o 
         LEFT JOIN reservations r 
            ON o.id = r.object_id
           AND r.from_date between '2012-04-05' and '2012-04-08'
         LEFT JOIN reservations r2 
            ON o.id = r2.object_id
           AND r2.to_date between '2012-04-05' and '2012-04-08'
   WHERE
          o.location_id = 201
      AND r.object_id IS NULL
      AND r2.object_id IS NULL
  LIMIT 20
DRapp
  • 47,638
  • 12
  • 72
  • 142
  • wow, thank you very much DRapp! I will test both queries, and let you know asap. Thanks very much for your help! – Chris Apr 26 '12 at 22:47
  • Hi DRapp! Thank you again for your genius queries, this the awesome result: they are both immense fast (depending on the query, but the average is about 0.00x sec!) this is gigantic! the first query is slightly faster, but the difference is very subtle! Thx again for your great help! – Chris Apr 27 '12 at 23:57
  • You are welcome... I love helping solve querying problems and have quite a history of VERY COMPLEX queries throughout my answer history :) – DRapp Apr 28 '12 at 00:01
  • amazing, and you are not even specialized in this field only, as your profile says you are actually focusing on c#, foxprox, ... :) – Chris Apr 28 '12 at 00:14
2

I wouldn't mix InnoDB and MyISAM tables, but I would define all the tables as InnoDB (for foreing keys support). Generally all the columns with the _id suffix should be foreign keys refering to appropriate table (object_id => objects etc).

You don't have to define index on foreign key as it is defined automatically (since MySQL 4.1.2), but you can define additional indexes on reservations.from_date and reservations.to_date columns for faster comparison.

jpesout
  • 688
  • 5
  • 12
  • hey, thank you very much for this! :) do you know how to solve the query issue (if date between...) or do you think your optimization suggestions will solve this problem? – Chris Apr 26 '12 at 11:42
  • You are welcome. I use regular comparison for this, but it should have the same effect as your code, I guess. Indexes will definitely help you. – jpesout Apr 26 '12 at 12:13
  • Btw, be careful about implicit string to date conversions, as it can work differently on various MySQL versions. I had a lot of troubles with this after MySQL upgrade, so I get used to convert to dates explicitly (such as DATE('2012-04-08')). – jpesout Apr 26 '12 at 12:23
  • oh, thank you! the tip with DATE('2012-04-08') is certainly a very great one! I'm already converting the tables from MyISAM to InnoDB to set the indexes, I'll let you know how it worked :) – Chris Apr 26 '12 at 12:26
  • actually, I just read on another post, that any function used on a field with an index would make that index useless. In oder words, if you use DATE() on the column field which has an index assigned, the index would not be used! just as an information... – Chris Apr 26 '12 at 22:30
  • and, another information I found out: foreign keys do not really boost performance. Their main goal is redundancy, but not performance. They more likely lead to errors, so there are some suggestions to not use foreign keys at all. – Chris Apr 26 '12 at 22:31
  • PS: as to not use foreign keys, see e.g. http://www.mysqlperformanceblog.com/2012/01/02/hijacking-innodb-foreign-keys/) – Chris Apr 26 '12 at 22:46
  • Yes, the main concern of foreign keys is to keep referential integrity. There are some good reasons to have them and some for not to, I don't use them everywhere neither. Indexes are for optimalisation, my point was if you define FK, than you don't have to define index on this column, since it is added automatically. http://programmers.stackexchange.com/questions/45447/foreign-key-restrictions-yes-or-no – jpesout Apr 28 '12 at 05:20
1

I know this is a year old, but if you've tried that solution above, the logic isn't complete. It misses reservations that start before the query start AND end after the query end. Also between doesn't cope with reservations that start and end at the same time.

This worked better for me:

SELECT venues.id
FROM venues LEFT JOIN reservations r
       ON venues.id = r.venue_id && (r.date_end >':start' and  r.date_start <':end')
WHERE r.venue_id IS NULL
ORDER BY venues.id
Leigh
  • 28,765
  • 10
  • 55
  • 103
zoneblue
  • 124
  • 7