0

I'm currently taking the online class on databases, If you could help me solve this sql problem I would greatly appreciate it. Sorry I'm a complete noob.

I need to list out all the drivers’ name, age, position and their average rating for a particular month. Listing out name,age and position is ok but i have problem with the average rating(as there are few entries for a 'booking',need to calculate average from those entries) and also group them into each driver from a particular month

below are my codes

  table booking
  BookID NUMBER(4) PRIMARY KEY,
  PersonID NUMBER(4),
  driverID NUMBER(4),    
  PickLoc VARCHAR2(13),
  DropLoc VARCHAR2(13),
  TDate DATE,
  TTime NUMBER(4),
  RideFare CHAR(6),
  TollOther CHAR(6),
  cancelDate DATE,
  TripRating NUMBER(1), 
  PayMethod CHAR(15),



  table driver
  driverID NUMBER(4) PRIMARY KEY, 
  Dname VARCHAR2(20),
  DNRIC CHAR(14),
  Dgender CHAR(1),
  DDOB DATE,
  Dmobile CHAR(11),
  DcarNo CHAR(6),
  Dstart DATE,
  Dgrade CHAR(6),
  DLicence CHAR(4),
  );
Shu Yi
  • 11
  • 3
  • 3
    Some effort please, https://stackoverflow.com/help/minimal-reproducible-example – jarlh Jun 14 '19 at 07:30
  • 2
    Are you looking for answer here, then probably not the right place. Please put what you tried and where are you facing issue. – Raka Jun 14 '19 at 07:40
  • how do you want to output the `position` in a monthly overview? How could an aggregate function for this look like? – vlad_tepesch Jun 14 '19 at 07:50
  • which version of SQL are you using? and what engine? For example, it could be PostgreSQL 8.5, or MySQL 5.6 or something like that. The version number may not be much significant here but let me know the data base engine – Rohan Kumar Jun 16 '19 at 08:16

3 Answers3

1

This is a simple group by and apply the avg (average) function.

select Dname, timestampdiff(year,Ddob,Now()) as age, 
       Dgrade, avg(TripRating) as rate
from booking
     inner join driver on driver.driverID = booking.driverID
where extract(year from TDate) = 2019 and
      extract(month from TDate) = 1
group by Dname, timestampdiff(year,Ddob,Now()), Dgrade

It will return your averages for January 2019

PS: As @RohanKumar made me notice, my answer won't return any result for Drivers without a Booking on that period. If you want the result set to have all the drivers, even the ones without bookings, then we change the order of joining, using Driver as the main table, and changing the join for an outer join, so we return Drivers without bookings.

select Dname, timestampdiff(year,Ddob,Now()) as age, 
       Dgrade, avg(TripRating) as rate
from driver
     left join booking on booking.driverID = driver.driverID
where extract(year from TDate) = 2019 and
      extract(month from TDate) = 1
group by Dname, timestampdiff(year,Ddob,Now()), Dgrade
Marc Guillot
  • 6,090
  • 1
  • 15
  • 42
  • but it stated year and month is an invalid identifier – Shu Yi Jun 15 '19 at 07:22
  • I have replaced the year() and mont() functions by the extract() function. Please, try again. – Marc Guillot Jun 15 '19 at 20:24
  • @ShuYi You must have a very old MySQL version, please tell us which it is. (and take a look at the updated answer). – Marc Guillot Jun 15 '19 at 20:30
  • This is a good answer and should be accepted. My only qualm with this is I'd like to understand the reason for using an inner join when by OPs words they want all drivers' – Rohan Kumar Jun 16 '19 at 08:30
  • information, I just wanted to understand if there are performance benefits over mine because of the way your query is written – Rohan Kumar Jun 16 '19 at 08:30
  • 1
    @RohanKumar, you are right, your answer returns all the drivers, even the ones without bookings, while my answer only returns the drivers with bookings in the period. I have updated my answer to include a second query that returns all the drivers, so the OP can chose. This second query behaves exactly the same as your answer, but I prefer it because I think that the syntax is a bit simpler and easier to understand for the OP. Regards. – Marc Guillot Jun 16 '19 at 10:50
0

That isn't so hard at all! You just need to join the tables, group by driver ID, then just use AVG() function method!

SQL:

SELECT Dname as driver, 
       TIMESTAMPDIFF(YEAR, DDOB, CUR_DATE()) as age,
       Dgrade as position,
       rating
FROM driver 
LEFT JOIN 
(SELECT driverID, AVG(TripRating) as rating
 FROM booking group by driverID, MONTH(TDate)
) booking
on driver.driverID = booking.driverID

This will list out all Driver Ratings for all months, for any given month simply change the inner SQL of booking table.

That change would be so for January (Month 1):

SELECT driverID, avg(TripRating) as rating
 FROM booking group by driverID
where MONTH(TDate) = 1

helpful references

Rohan Kumar
  • 726
  • 10
  • 17
  • SELECT Dname as drivername, Dgrade as position FROM driver left outer join (SELECT driverID, avg(TripRating) as rating FROM booking group by driverID,Mon(TDate) = 4) booking on driver.driverID = booking.BookID – Shu Yi Jun 15 '19 at 05:00
  • ERROR at line 6: ORA-00907: missing right parenthesis – Shu Yi Jun 15 '19 at 05:00
  • Hi Shu Yi! Sorry! I hadn't tested this. But if you can notice the error is in : `on driver.driverID = table.bookingID` that shouldn't be `table.bookingID` but `booking.bookingID` – Rohan Kumar Jun 16 '19 at 08:14
  • @Shu Yi, the SQL should work now! Please have a look and let me know so I can help you with the errors you might be facing, also don't forget to mention the data base engine, and version as others have suggested you might be using an older version. – Rohan Kumar Jun 16 '19 at 08:25
  • It is still wrong - why would you join on the primary ids of two tables ;). should be `on driver.driverID = booking.driverID` – vlad_tepesch Jun 17 '19 at 08:17
  • haaha so obvioius!!! I should really be testing this like a monkey! :DD – Rohan Kumar Jun 17 '19 at 11:16
0

first you have to join your tables and select the columns you need for your calculations:

SELECT Dname, Triprating, DDOB, TDate 
FROM  driver, booking 
WHERE booking.driverID = driver.driverID;

the drivers age is a bit tricky. According to https://stackoverflow.com/a/23824981/2331592 this could look like (assuming that you want the age at the time of the booking)

TIMESTAMPDIFF(YEAR, TDate , DDOB)

also the booking months: I would suggest to also take the year into account, maybe by formating it into '2019-06' by

DATE_FORMAT(TDate ,'%Y-%m')

so the query would now lokk like

SELECT Dname, TIMESTAMPDIFF(YEAR, TDate , DDOB) AS Dage, DATE_FORMAT(TDate ,'%Y-%m') AS Tmonth,  Triprating 
FROM  driver, booking 
WHERE booking.driverID = driver.driverID;

Now this is a table with all information and we want to group some things.
For this we need aggregate functions for the fields we do not want to group on.

I use the previous query as a sub query here. But also could be done in a single one but so it is more clear.

SELECT  Dname as driver, 
        MAX(Dage) as age, 
        Tmonth as month, 
        AVG(Triprating) as 'average rating'
FROM    (SELECT Dname, 
                TIMESTAMPDIFF(YEAR, TDate , DDOB) AS Dage, 
                DATE_FORMAT(TDate ,"%Y-%m") AS Tmonth,  
                Triprating 
         FROM   driver, booking 
         WHERE  booking.driverID = driver.driverID
        )
GROUP BY driver, month
ORDER BY driver ASC, month ASC;

with light adaptions to SQLite the above SQL works and can be tried at: https://sqliteonline.com/#fiddle-5d04d5f898b5cmzxjwxfrdws

Edit:

as Rohan Kumar in his answer suggests, the grouping of the booking data could be done before the join. this would reduce the number of rows to join and may a bit faster and less resource consuming. But in this variant the age of the driver at the booking time can not be calculated that easy.

vlad_tepesch
  • 6,681
  • 1
  • 38
  • 80
  • from keyword not found where expected :( – Shu Yi Jun 15 '19 at 07:26
  • 1
    @ShuYi i wrote it without testing becaue you have not posted an easy to reproduce example - just look for the errors and fix it (in principple should work) or post a sql script that creates a ready to use database with some entries (create table+inserts) for easy testing – vlad_tepesch Jun 15 '19 at 10:47
  • very thought-through answer! btw, Isn't that a cross join? @vlad Also, I don't see why age of driver at booking time wouldn't be easy to compute using my SQL.. Thank you! – Rohan Kumar Jun 16 '19 at 08:33
  • @RohanKumar it is not a "cross join" but an "Equijoin" a cross join is a join without a condition, as far as i know. with "It is not that easy" i meant that you have to so a few additinal steps, like output the booking date and (more important) also select an aggregate function to be able to use that result in the age calculation. – vlad_tepesch Jun 17 '19 at 08:12