2

I'm attempting to create a report for an accommodation service with the following information:

Number of Bookings (Easy, use the COUNT function)
Revenue Amount (Kind of easy).
Number of Room nights. (Rather Hard it seems)

Broken down into each month of the year.

Limitations - I'm currently using PHP/MySQL to create this report. I'm pulling the data out of the booking system 1 month at a time, then using an ETL process to put it into MySQL.
Because of this, I have duplicate records, when a booking splits across the end of the Month. (eg BookingID = 9216 below - This is because for Revenue purposes we need to split the percentage of the revenue into the corresponding month).

The Question.

How do I write some SQL that will: Calculate the number of room nights that was booked into a Property and Group it by the month. Taking into account that if a booking spans across the end of the month, that the room nights that are inside of the same month, as the checkin are counted towards that month, and room nights which the same month as checkout are in the same month as checkout.

At first I used this: DATEDIFF(Checkout, Checkin). But that lead to one month having 48 room nights in a 31 day month. (because a) it counted 1 booking as 11 nights, even through it was split across the 2 months, and b) because it appears twice).

Then once I have the statement I need to integrate it back into my CrossTab SQL for the entire year.

Some resources that I have found, but can't seem to make work (MySql Query- Date Range within a Date Range & php mysql double date range)

Here is a Sample of the Table: (There are ~100,000 rows of similar data).

CREATE TABLE IF NOT EXISTS `bookingdata` (
`idBookingData` int(11) NOT NULL AUTO_INCREMENT,
`PropertyID` int(10) NOT NULL,
`Checkin` date DEFAULT NULL,
`Checkout` date DEFAULT NULL,
`Rent` decimal(10,2) DEFAULT NULL,
`BookingID` int(11) DEFAULT NULL,
PRIMARY KEY (`idBookingData`),
UNIQUE KEY `idBookingData_UNIQUE` (`idBookingData`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=10472 ;

INSERT INTO `bookingdata` (`idBookingData`, `PropertyID`, `Checkin`, `Checkout`, `Rent`, `BookingID`) VALUES
(5148, 2, '2011-07-02', '2011-07-05', 1105.00, 10612),
(5149, 2, '2011-07-05', '2011-07-13', 2155.00, 10184),
(5151, 2, '2011-07-14', '2011-07-17', 1105.00, 11102),
(5153, 2, '2011-07-22', '2011-07-24', 930.00, 14256),
(5154, 2, '2011-07-24', '2011-08-04', 1832.73, 9216),
(5907, 2, '2011-07-24', '2011-08-04', 687.27, 9216),
(5910, 2, '2011-08-11', '2011-08-14', 1140.00, 13633),
(5911, 2, '2011-08-15', '2011-08-16', 380.00, 17770),
(5915, 2, '2011-08-25', '2011-08-29', 1350.00, 17719),
(5916, 2, '2011-08-30', '2011-09-01', 740.00, 16813);
Community
  • 1
  • 1
Charlie
  • 70
  • 6

1 Answers1

3

You're on the right lines. You need to join your query with a table of the months for which you want data, which can either be permanent or (as shown in my example below) created dynamically in a UNION subquery:

SELECT   YEAR(month.d),
         MONTHNAME(month.d),
         SUM(1 + DATEDIFF( -- add 1 because start&finish on same day is still 1 day
           LEAST(Checkout, LAST_DAY(month.d)), GREATEST(Checkin, month.d)
         )) AS days
FROM     bookingdata
  RIGHT JOIN (
                   SELECT 20110101 AS d
         UNION ALL SELECT 20110201 UNION ALL SELECT 20110301
         UNION ALL SELECT 20110401 UNION ALL SELECT 20110501
         UNION ALL SELECT 20110601 UNION ALL SELECT 20110701
         UNION ALL SELECT 20110801 UNION ALL SELECT 20110901
         UNION ALL SELECT 20111001 UNION ALL SELECT 20111101
         UNION ALL SELECT 20111201
  ) AS month ON
             Checkin <= LAST_DAY(month.d)
         AND month.d <= Checkout
GROUP BY month.d

See it on sqlfiddle.

eggyal
  • 122,705
  • 18
  • 212
  • 237
  • The joining condition can be simplified to: `Checkin <= LAST_DAY(month.d) AND month.d <= Checkout` – ypercubeᵀᴹ May 10 '12 at 06:39
  • Thank you so much, that's the closest that I have got all week. Since we have this really annoying doubling of Rows, is it possible that it will use some sort of SELECT DISTINCT on either the BookingID column or the Checkin/CheckOut Column. What is happening is that the Row is being added in twice. So for example in the month of July there is the 11 night booking from 2011-07-24 to 2011-08-04 The first 7 days of the booking in July are being added correctly, but twice. Or should I clean the data in the ETL process? (the Rent column is used for each month). – Charlie May 10 '12 at 07:26
  • @Charlie: It sounds to me as though you should try to work with cleaned data. Alternatively, if every record has been inserted twice, you could merely half the resulting days? FYI, you should [accept answers](http://meta.stackexchange.com/a/5235) if they solve your problem and [upvote](http://stackoverflow.com/privileges/vote-up) them if they're useful or helpful. – eggyal May 10 '12 at 07:42
  • Thanks, it's only the bookings that fall across the month that double up. The data is extracted on a month, by month basis, using a "Statement Report" - It was the only report with the required data from the software. (Propriety database as well). So if an booking falls across a month, the owner is paid only the percentage of the booking that falls inside that month. Then the remainder is paid in the next month. Hence the doubling. I might look at joining the rows together in my ETL process. – Charlie May 11 '12 at 00:07
  • @Charlie: But the above accounts for that by only taking the number of days of those bookings which fall in the given month (using `LEAST` and `GREATEST`). – eggyal May 11 '12 at 05:41