1

I am building a mysql based chat application.

My database schema has the following tables,

       Users               Messages
   =================   =================
        id                 id
        screen_name        message
                           from
                           to
                           timestamp

The from and to fields on the messages table contain the id's of the users that sent and received each message.

I am trying to display all messages between a user ($id) and one of their friends ($friend). My query is the following:

$query = "SELECT messages.* , users.screen_name FROM users CROSS JOIN messages ";
$query .= "ON ( messages.to = $id AND messages.from = $friend ) ";
$query .= "OR ( messages.to = $friend AND messages.from = $id )";

The problem is that every message is twice in the result table.

I tried using DISTINCT but it either doesn't work in this scenario or I used it wrong.

What should my query be in order to have each message between the two users only once?

IT ppl
  • 2,626
  • 1
  • 39
  • 56
ppp
  • 2,035
  • 9
  • 32
  • 52
  • Seems to be doing the trick. Could you please explain it briefly in an answer so I can award rep? – ppp Apr 10 '12 at 13:32
  • 2
    You are selecting `messages.*`. Using `SELECT DISTINCT`, if any one of those fields are different (regardless of whether or not the message is the same), then you will get both rows. I would suggest that you don't use `SELECT *`, but instead explicitly select each individual field that you actually want. See [this post](http://stackoverflow.com/questions/321299/what-is-the-reason-not-to-use-select) for more info. – Travesty3 Apr 10 '12 at 13:34

2 Answers2

3

Something like this should do the trick:

SELECT
  messages.*,
  users_from.screen_name AS from_screen_name,
  users_to.screen_name AS to_screen_name
FROM
  messages
    JOIN users AS users_from ON messages.from = users_from.id
    JOIN users AS users_to ON messages.to = users_to.id
WHERE
  (messages.to = $id AND messages.from = $friend)
  OR ( messages.to = $friend AND messages.from = $id)

What this does is joing the "users" table twice, once on the "to" column and the second time on the "from" column.

Jan Hančič
  • 53,269
  • 16
  • 95
  • 99
1

@Travesty3 has already suggested that the DISTINCT keyword will only exclude duplicate rows where all fields are equal to another row. Therefore, the DISTINCT keyword is not the way to go here.

What you can do, however, is to simply GROUP BY messages.id in order to get only one row per message ID (there is no guarantee, however, as to which of the two rows will be excluded).

jensgram
  • 31,109
  • 6
  • 81
  • 98
  • This seem to work fine, but so does the double join proposed by Jan. Is there a difference in performance between the two? – ppp Apr 10 '12 at 13:54
  • @AnPel Given that the fields used for the `JOIN` s and `GROUP BY` are indexed, I suspect both to perform more than satisfactory. What you could do is to benchmark using example queries on realistic data sets. – jensgram Apr 10 '12 at 13:59