
I've spent over a week trying to figure out how to prevent duplicate MySQL entries in my table. How can I prevent this from happening? I don't want a user to update his/her location if I already have an entry from the last 5 seconds, down to the fractional second. I just want the very first entry, and a way to ignore the subsequent ones.

The type of my column is timestamp(6)

|         40 |       4 |           5 | 2016-03-05 22:19:57.572771 |
|         41 |       4 |           5 | 2016-03-05 22:19:57.574012 |
|         42 |       4 |           5 | 2016-03-05 22:19:57.589436 |
|         43 |       4 |           5 | 2016-03-05 22:19:57.598959 |
|         44 |       4 |           5 | 2016-03-05 22:19:57.605516 |

I've tried things like:

"SELECT * FROM checkins WHERE user_id = " + req.body.userID + " AND TIMESTAMPDIFF(SECOND, entered_at, current_timestamp(6)) < 5;"

AND

"SELECT * FROM checkins WHERE user_id = " + req.body.userID + " AND entered_at <= now() AND entered_at >= date_sub(now(),  interval 1 SECOND);"

Neither of which worked, unfortunately. Thank you in advance for any guidance.

EDIT

For clarity: I'm trying to see whether a particular user has made an entry within the last 5 seconds, down to the fractional second. If he has, then rows.length > 0 will be true when rows comes back on the server side. If rows.length IS greater than zero, I set my proceedToUpdateUsersLocation variable to false. I hope that's a clear explanation: I'm trying to prevent multiple entries by checking that there were no duplicates in the table within the past X seconds.
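The gate described above can be sketched as follows (proceedToUpdateUsersLocation is from the post; decideProceed is an illustrative name, not code from the question):

```javascript
// Sketch of the server-side gate: proceed with the location update
// only when the duplicate-check query returned no rows.
function decideProceed(rows) {
  let proceedToUpdateUsersLocation = true;
  if (rows.length > 0) {
    proceedToUpdateUsersLocation = false; // a check-in already exists within the window
  }
  return proceedToUpdateUsersLocation;
}
```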

Mihado
  • Because I want all of the duplicate entries from the table, so I use select – Mihado Mar 06 '16 at 14:40
  • Ok - but first you talked about preventing duplicates. A select isn't going to cause duplicates – Strawberry Mar 06 '16 at 15:31
  • Oh I see what you're missing. If I return rows.length > 0 on the server it means that I have duplicates, so I set a variable to false so it will not update the users location again. This logic is being done server side. Here I'm just seeing if there was a previous entry, and I check that by seeing what the connection.query returns. Does that make sense? – Mihado Mar 06 '16 at 15:47
  • Not really. What if 'the previous entry' occurs *while* you're checking? – Strawberry Mar 06 '16 at 16:00
  • That's why I check if the table has any entries from that particular user from within the last 5 seconds, it's just not catching the ones that happen within the same second. That's why in my code example they were all entered at the 57th second – Mihado Mar 06 '16 at 16:04
  • Checking first is unnecessary and counter-productive. I'll post an example... – Strawberry Mar 06 '16 at 16:08
  • Please do, and thank you. I hope I made some sense. If not, I posted an edit trying to explain the situation as clearly as possible. – Mihado Mar 06 '16 at 16:11
  • This one looks relevant: http://stackoverflow.com/questions/913841/mysql-conditional-insert – Oleg Komarov Mar 06 '16 at 20:54

2 Answers


The solution was quite simple really; I just didn't have the necessary SQL skills. What I did was subtract an interval of 4.999999 SECOND_MICROSECOND from curtime(6) using the DATE_SUB() function. Here's my final solution:

"SELECT * FROM checkins WHERE user_id = " + req.body.userID + " AND DATE_SUB(curtime(6), INTERVAL 4.999999 SECOND_MICROSECOND) < entered_at;"

Thank you all for your assistance, and I hope this will prove useful to someone.
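As an aside, interpolating req.body.userID directly into the query string is open to SQL injection. A placeholder-based version might look like this (a sketch assuming the node mysql driver; hasRecentCheckin is an illustrative name, and the SQL mirrors the query above):

```javascript
// Sketch: the same duplicate check with a ? placeholder instead of
// string concatenation (assumes a node "mysql"-style driver whose
// connection object is created elsewhere).
function hasRecentCheckin(connection, userId, callback) {
  const sql =
    "SELECT * FROM checkins WHERE user_id = ? " +
    "AND DATE_SUB(curtime(6), INTERVAL 4.999999 SECOND_MICROSECOND) < entered_at";
  connection.query(sql, [userId], (err, rows) => {
    if (err) return callback(err);
    callback(null, rows.length > 0); // true means a check-in exists in the window
  });
}
```

The driver escapes the bound value itself, so userId never needs to be spliced into the SQL text.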

Mihado

I think that pretty much any solution that does the check outside of the same transaction that inserts the data is bound to fail due to a race condition (unless you implement some locking, which may be a bad idea; it depends).

So, one way to solve it is to make sure that the database itself prevents duplicates.

I propose to pick a canonical value that would represent all the timestamps that you would like to merge, let's call it canonical_timestamp and create a unique constraint UNIQUE (user_id, canonical_timestamp).

You can calculate canonical_timestamp before inserting your data as, e.g., int(time()/5) or 5*int(time()/5), or in any other way that fits your purpose of deduplicating the records.
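A sketch of that bucketing in JavaScript (the helper name and 5-second window are illustrative; the UNIQUE (user_id, canonical_timestamp) constraint itself would go in the table's DDL):

```javascript
// Map a Unix timestamp (in seconds) to a canonical bucket, mirroring
// int(time()/5) above. Two inserts that land in the same bucket for the
// same user then violate UNIQUE (user_id, canonical_timestamp), and the
// database rejects the second one.
function canonicalTimestamp(tsSeconds, windowSeconds = 5) {
  return Math.floor(tsSeconds / windowSeconds);
}
```

For example, timestamps 22 and 24 both fall into bucket 4, while 25 starts bucket 5.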

UPDATE

Strictly speaking, this solution doesn't completely solve the problem when we receive two events close to a bucket border:

ts = 4 -> canonical_ts = 0
ts = 5 -> canonical_ts = 1
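That border case can be checked directly (a sketch; the bucket function mirrors int(ts/5) above):

```javascript
// Two events only one second apart can still land in adjacent buckets,
// so the UNIQUE constraint will not treat them as duplicates.
function bucket(ts) {
  return Math.floor(ts / 5);
}
console.log(bucket(4), bucket(5)); // 0 1 — adjacent buckets despite a 1s gap
```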
Oleg Komarov