3

I'm very new in SQL, please help me to solve the problem.

I have a table for site's private messages, that i got after migration between engines (phpbb to drupal)

Lets say, it has 3 columns:

  • mid - for message id
  • thread - for thread id
  • recipient - user id, who will get the message.

Every message is displayed in recipient's and in autor's mailbox, so i have 2 strings for every message.

The problem is that the thread id's for the messages are incorrect (basically, they are equal to mid), so correspondence between 2 users displays as a hundred of seperate threads.

For example:

user 100 wrote message to user 101 
user 101 replied       to user 100
user 102 wrote message to user 100

The table looks like:

 ________________________________
| mid    | thread    | recipient |
|        |           |           |
| 1      | 1         | 101       |  ← message 1 in recipient's mailbox    
| 1      | 1         | 100       |  ← message 1 in author's mailbox
| 2      | 2         | 100       |    ...
| 2      | 2         | 101       |
| 3      | 3         | 100       |
| 3      | 3         | 102       |
|________|___________|___________|

My goal is to give the same thread id to every group of pair lines with the same pair of recipients:

 ________________________________
| mid    | thread    | recipient |
|        |           |           |
| 1      | 1         | 101       |  }
| 1      | 1         | 100       |  }    Correspondence between 100 and 101 must
| 2      | 1         | 100       |  }    have the same thread id (1)
| 2      | 1         | 101       |  }
| 3      | 2         | 100       |
| 3      | 2         | 102       |
|________|___________|___________|

UPDATE:

Thread id must be the same for all present conversations between the same recipient's pairs. (Don't worry about further conversations, users will have an option to open new thread when writing message, that will generate new thread id, or reply in existing thread, that will mark message with existing thread id. All i want is to gather tonnes of existing messages between each 2 users in threads).

I imagine that like some sort of cycle, that will search all the recipients id's for every mid and merge them in array after ascending sort:

  • For mid 1: (100, 101)
  • For mid 2: (100, 101)
  • For mid 3: (100, 102)

Then give the same arrays same id that will be my desired thread id:

  • (100, 101) = 1
  • (100, 101) = 1
  • (100, 102) = 2

Not sure, if my algorithm is possible with only SQL query

  • is the recipient column name really named as `recipient`? i'm guessing it's a field to have user ids (both recipient and author). Also are u asking u for an update query here hence? – nawfal Mar 21 '13 at 20:28
  • 1
    nawfal, Yes, it's really named as 'recipient'. The trick is that if I write message to somebody, i will also become a recipient so that my message will be displayed on my 'messages' page. I'd be happy to get SQL query that updates thread column (if it is possible) – Veronica Abazova Mar 21 '13 at 20:52
  • 1
    What does your insert query look like? It seems to me that the problem is in your programming language, not your DB. – invertedSpear Mar 21 '13 at 21:37
  • invertedSpear, It doesn't matter, how does insert query look. I just need to fix present table, all the later queries will go fine. If you still interested, i used phpbb2drupal module to migrate phpbb db to drupal db with the migrate api. – Veronica Abazova Mar 22 '13 at 01:04
  • what is the significance of records `2, 2, 100` and `2, 2, 101` here? also on what basis should the thread id be the same? what if there is a conversation btween 101 and 100 at some point later, which would have a thread id of 25, say? do you mean to say club those which come consecutively? – nawfal Mar 22 '13 at 19:11
  • nawfal, those records means, that user 101 wrote message (mid=2) to user 100. As this message appears both in sender and recipient's mailbox, there are 2 records, like the message was sent both to author and recipient. I've updated post to answer other questions. – Veronica Abazova Mar 23 '13 at 01:01
  • @VeronicaAbazova are u sure right now the threadid and mid are always equal? And please include my name as "@nawfal" so that I know and not just "nawfal".. – nawfal Mar 23 '13 at 07:04
  • @nawfal, yes, i'm sure. That is how script, that converted phpbb bd to drupal, works. – Veronica Abazova Mar 23 '13 at 17:05
  • @VeronicaAbazova ok let me post an answer that just "might" work. I'm not sure how good I can be.. – nawfal Mar 23 '13 at 18:07
  • @nawfal ok, i'll wait with impatience. (As i understood, I don't need to make any actions to let you do this) – Veronica Abazova Mar 23 '13 at 20:09

2 Answers2

0

This sort of correction should be better handled from your application as it's easier to handle logic there than at this functional style Sql.

Here is one way of getting it done (This assumes all the related (threaded) conversations come one after the other, as you have confirmed):

SELECT   mid, 
         thread, 
         recipient 
FROM     (  
          SELECT row, 
                 mid, 
                 recipient,
                 IF(row % 2 = 0, @evenThread := thread, NULL) AS tempThread,
                 IF(row % 2 = 1, @evenThread, thread) AS thread
          FROM   (
                  SELECT   @row := @row + 1 as row, 
                           mid, 
                           IF(@row = 1, @thread := thread, NULL) AS tempThread,                
                           IF(@row % 2 = 1, IF(@row = 1, @rec1 := recipient, @rec1 := @rec3), NULL) AS tempOddRow,
                           IF(@row % 2 = 1, @rec3 := recipient, recipient) AS recipient,         
                           IF(@row % 2 = 0 AND @row > 2, IF((recipient != @rec1 OR @rec3 != @rec2) AND (recipient != @rec2 OR @rec3 != @rec1), @isDifferent := true, @isDifferent := false), NULL) AS isDifferent,
                           IF(@row % 2 = 0, @rec2 := recipient, NULL) AS tempEvenRow,
                           IF(@row % 2 = 0, IF(@isDifferent, @thread := @thread + 1, @thread), @thread) AS thread
                  FROM     (SELECT @row := 0) AS r, 
                           (SELECT @thread := 0) as t,
                           (SELECT @isDifferent := false) as d,
                           (SELECT @rec1 := 0) AS r1,
                           (SELECT @rec2 := 0) AS r2, 
                           (SELECT @rec3 := 0) AS r3, 
                           messages --your table name
                  ORDER BY row DESC
                 ) AS temp, 
                 (SELECT @evenThread := 0) AS t
         ) AS corrected 
ORDER BY row

This is just the select query. Test it here http://sqlfiddle.com/#!9/61247/2. You can copy that into a temporary table, and first confirm if all data has been proper rather than updating the original table which will make debugging harder. Something along the lines of:

CREATE TABLE messages_new LIKE messages; --new table name given
INSERT INTO  messages_new (mid, thread, recipient)
SELECT       mid, 
             thread, 
             recipient 
FROM         (  
              SELECT row, 
                     mid, 
                     recipient,
                     IF(row % 2 = 0, @evenThread := thread, NULL) AS tempThread,
                     IF(row % 2 = 1, @evenThread, thread) AS thread
              FROM   (
                      SELECT   @row := @row + 1 as row, 
                               mid, 
                               IF(@row = 1, @thread := thread, NULL) AS tempThread,                
                               IF(@row % 2 = 1, IF(@row = 1, @rec1 := recipient, @rec1 := @rec3), NULL) AS tempOddRow,
                               IF(@row % 2 = 1, @rec3 := recipient, recipient) AS recipient,         
                               IF(@row % 2 = 0 AND @row > 2, IF((recipient != @rec1 OR @rec3 != @rec2) AND (recipient != @rec2 OR @rec3 != @rec1), @isDifferent := true, @isDifferent := false), NULL) AS isDifferent,
                               IF(@row % 2 = 0, @rec2 := recipient, NULL) AS tempEvenRow,
                               IF(@row % 2 = 0, IF(@isDifferent, @thread := @thread + 1, @thread), @thread) AS thread
                      FROM     (SELECT @row := 0) AS r, 
                               (SELECT @thread := 0) as t,
                               (SELECT @isDifferent := false) as d,
                               (SELECT @rec1 := 0) AS r1,
                               (SELECT @rec2 := 0) AS r2, 
                               (SELECT @rec3 := 0) AS r3, 
                               messages
                      ORDER BY row DESC
                     ) AS temp, 
                     (SELECT @evenThread := 0) AS t
             ) AS corrected 
ORDER BY     row;
DROP TABLE   messages;
ALTER TABLE  messages_new RENAME TO messages; --back to old name
nawfal
  • 70,104
  • 56
  • 326
  • 368
  • It does'n work :-( I've exported query results as php array: array('row'=>'1','mid'=>'1','thread'=>'1','recipient'=>'100'), array('row'=>'2','mid'=>'1','thread'=>'1','recipient'=>'101'), array('row'=>'3','mid'=>'2','thread'=>'2','recipient'=>'100'), array('row'=>'4','mid'=>'2','thread'=>'2','recipient'=>'102'), array('row'=>'5','mid'=>'3','thread'=>'3','recipient'=>'100'), array('row'=>'6','mid'=>'3','thread'=>'3','recipient'=>'101') I still got 3 different thread id's :-( – Veronica Abazova Mar 23 '13 at 22:16
  • Neil Mussett posted a working solution, but still thank you for trying to help! Cheers! – Veronica Abazova Mar 24 '13 at 01:18
  • @VeronicaAbazova I had made a small mistake there. Try now it works. Neil's solution is great, but works only when assuming you will have only at most two pairs of author and recipient for a single thread. See this sqlfiddle to check it http://sqlfiddle.com/#!9/61247/3 – nawfal Mar 26 '13 at 12:11
0

I put this into a SQL Fiddle here: http://sqlfiddle.com/#!2/83bdc/25

Assuming your table name is messages then here is what I would do:

Select messages.*, FirstThread 
From messages

Inner Join

(
-- Find a pair of dudes for each message and show the earliest thread of each
Select ThreadsForPairs.*, FirstThread From

(

Select mid
, Min(recipient) AS FirstDude
, Max(recipient) AS SecondDude
, thread

From messages

Group By mid, thread

) ThreadsForPairs

Left Join

(
-- Find the earliest thread for every pair of dudes
Select FirstDude, SecondDude, Min(thread) AS FirstThread

From

-- For every message, get the dudes involved
(
Select mid
, Min(recipient) AS FirstDude
, Max(recipient) AS SecondDude
, thread

From messages

Group By mid, thread
) PairsForMessages

Group By FirstDude, SecondDude

) FirstThreadForPairs

ON ThreadsForPairs.FirstDude = FirstThreadForPairs.FirstDude
AND ThreadsForPairs.SecondDude = FirstThreadForPairs.SecondDude

) FirstThreadForEveryMessage

On messages.mid = FirstThreadForEveryMessage.mid

As you will see, I do not match your output exactly because message ID 3 would have thread 3 instead of 2, but I think it is the same idea...

Neil Mussett
  • 710
  • 6
  • 8