0

Okay so I have a table called posts and I am trying to remove the duplicates on the condition that if one of the post_id is equal to the post_dat_id and only remove it if post_data_id == post_id has a post_type of 2.

Posts Structure

post_id || post_data_id || post_type || post_user_id
==================================
1 || NULL || 0 || 10210

2 || 1 || 2 || 201020 <-- it shouldn't show this one because it meets the condition that post_type == 2 && the post_data_id is equal to one of the post_id. EDIT: also notice how this id = 2 but the post_data_id = 1. It is impossible for a row ot have the same post_id & post_data_id

2 || 1 || 0 || 202020

3 || 6 || 2 || 202020

My mysql:

 SELECT p.*
 FROM posts p
 LEFT JOIN following f ON f.user_id =1
 AND p.post_user_id = f.follower_id 
 WHERE post_user_id =1
 OR f.user_id IS NOT NULL 
 ORDER BY  `p`.`post_time` DESC 
 LIMIT 0 , 10

EDIT: I do not want to delete the code all I want to do is not show that result if the meet the criteria in my select. Also I am already using a left join because my sql code needs to check the following table to get the user id's

EDIT 2: I also changed the post_data to post_data_id so its a pure int now

Strawberry
  • 33,750
  • 13
  • 40
  • 57
arberb
  • 960
  • 3
  • 14
  • 25
  • [This previous question](http://stackoverflow.com/questions/18932/how-can-i-remove-duplicate-rows) is about the same and should provide some answers. – J Delaney Aug 07 '13 at 18:42

2 Answers2

0

Haven't tested this code but I would do something like this:

select from posts p where p.id not in
(select q.id from posts q
where q.post_data = q.post_id and q.post_type = 2);
Joe Minichino
  • 2,793
  • 20
  • 20
0

I am not sure that I have fully understood your question but it sounds like you need to use a LEFT JOIN to remove the rows matching the given criteria -

SELECT p1.*
FROM posts p1
LEFT JOIN posts p2
    ON p1.post_data_id = p2.post_id
    AND p1.post_type = 2
LEFT JOIN following f
    ON f.follower_id = 1
    AND p1.post_user_id = f.user_id 
WHERE p2.post_id IS NULL
AND (p1.post_user_id = 1 OR f.user_id IS NOT NULL)
ORDER BY  p1.post_time DESC 
LIMIT 0, 10

This will have quite a performance hit as your dataset grows and it looks like you have some issues with the structure of the data. You might want to look at moving the self referencing foreign key to its own field.

user1191247
  • 10,808
  • 2
  • 22
  • 32
  • the problem with this is that it removes all the data with post_type = 2 because the join looks at every single value in posts when it should be only looking at the ones that meet the second joins condition – arberb Aug 07 '13 at 20:53
  • No. It joins both on both criteria. I have just run it based on your data above and it only drops the second row. – user1191247 Aug 07 '13 at 21:19
  • THanks for the help but it still seems not not be showing. Notice how I am following user #3 but it doesn't show any of his posts with a post_type 2 (I did some modification to the sql) [1]: [SQLFiddle](http://sqlfiddle.com/#!2/89017/1) – arberb Aug 07 '13 at 22:58
  • The only one that should show is post #38 and that was not being displayed because of your limit clause. – user1191247 Aug 07 '13 at 23:41
  • #165 is supposed to show aswell since I am not following user 2 – arberb Aug 08 '13 at 00:01
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/35044/discussion-between-nnichols-and-arberb) – user1191247 Aug 08 '13 at 00:06