1

I need to create a Unique-ID (string) for each record as I am developing an application which allows users to access a unique URL like:

http://URL.com/BXD31F

The code below works to create the URLIDs:

Public Function getURLID(ID As Double) As String

Randomize
Dim rgch As String
rgch = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ"

Dim i As Long
For i = 1 To 5
    getURLID = getURLID & Mid$(rgch, Int(Rnd() * Len(rgch) + 1), 1)
Next

End Function

How can I ensure that the URLID created is unique? Do I need to query the database to ensure it has not been generated before? The table has 5 million records. A dlookup query would exceed the limitations of my MSAccess database.

I have considered using the timestring to generate the URLID:

 Format(Now, "yymmddhhmmss")

However, I only want a simple 5 character string.

Daniel Widdis
  • 8,424
  • 13
  • 41
  • 63
DoubleA
  • 736
  • 1
  • 7
  • 23
  • 3
    Access can generate an ID column that is guaranteed to be unique per table. What's wrong with using that? – Tomalak Feb 12 '20 at 19:20
  • 1
    @Tomalak presumably the idea is to prevent URL spoofing (?); i.e. if I get a link that's `url.com/12345`, I could just change the URL to `url.com/12346` to see what I get, and depending on what that's linking to, I wouldn't want this to be easy to do. A hash of the row ID could work, but wouldn't be much more secure. That said the OP's example URL is `http`, so who knows how secure it needs to be. – Mathieu Guindon Feb 12 '20 at 19:33
  • 2
    Depends on the type of the application. If the content is public, who cares if someone checks out a neighboring URL. If the content is not public, URL parameters should not take the part of authentication. If the content is half-public (whatever that means) you can encode the number in some kind of custom base-N notation to make it somewhat less predictable and look less numerical, like link shorteners do it. – Tomalak Feb 12 '20 at 19:36
  • I think *that* is what the OP is looking for – Mathieu Guindon Feb 12 '20 at 19:36
  • Does this answer your question? [Generate a UNIQUE string made out of numeric / ALPHABET](https://stackoverflow.com/questions/44934740/generate-a-unique-string-made-out-of-numeric-alphabet) – Erik A Feb 12 '20 at 20:51

3 Answers3

2

How can I ensure that the URLID created is unique?

You can't. And it won't be. Look into cryptographically secure hashing algorithms... and even those are never "secure" forever. Note, hashing is something for which VBA has absolutely zero built-in support, but you can leverage .NET for that.

Another option could be to get the OS to generate Globally Unique IDentifiers (GUID); these would be unique, ...but much longer than a handful of characters.

Good luck!

Mathieu Guindon
  • 69,817
  • 8
  • 107
  • 235
  • 1
    GUIDs are supposed to be "guaranteed unique" across all computers. (Raymond Chen has a nice post on that: https://devblogs.microsoft.com/oldnewthing/20080627-00/?p=21823) – Tomalak Feb 12 '20 at 19:21
  • 1
    @Tomalak right. 48 bits is *a lot* of computers... but not infinite – Mathieu Guindon Feb 12 '20 at 19:25
  • 2
    Well, yes. Technically the number of GUIDs is limited, but the chances are infinitesimal that ever a GUID is generated twice. They are unique for all practical purposes, literally nothing a web application needing URL slugs should be worried about. – Tomalak Feb 12 '20 at 19:30
  • So guys, never generate a GUID just for the fun of it. They are not enough for everybody :) – Vityata Feb 12 '20 at 19:34
  • 2
    @Vityata everytime a GUID is generated, a star goes supernova somewhere =) – Mathieu Guindon Feb 12 '20 at 19:36
  • As opposed to .Net, you can also leverage WinAPI to do encryption/decryption. [CNG](https://learn.microsoft.com/en-us/windows/win32/seccng/cng-portal) is relatively easy to work with from VBA, and can work faster (has a little less overhead), which is especially important if you're doing key derivation/repeat hashing to make cracking the hash harder. – Erik A Feb 12 '20 at 20:59
1

Ensuring that a string is unique with VBA could be done somehow differently. E.g., take the date time, which is unique every second and give it:

format(now, "YYMMDDHHNS")

As far as it would be too obvious, consider changing it a bit. E.g., remove a random contant number from the datetime, let's say 181387 (as it is a prime number) and pass convert it to a hex. Then it would be quite ok:

Function UniqueString() As String

    Const someNumber = 181387 'it is a prime number
    UniqueString = Hex(Format(Now, "YYMMDDHHNS") - someNumber)

End Function

The above does not seem to work for 32-bit machines. Thus, you may consider splitting the parts of the date to separate numbers and hex-ing them separately:

Function UniqueString32() As String

    Const primeNumber = 23        
    Application.Wait Now + #12:00:02 AM#    'waiting 2 seconds
    UniqueString32 = Hex(Format(Now, "YY")) _
                    & Hex(Format(Now, "MM")) _
                    & Hex(Format(Now, "DD")) _
                    & Hex(Format(Now, "HH")) _
                    & Hex(Format(Now, "NS") - primeNumber)

End Function

Just make sure there is at least 1 second before calling the function, calling it in the same time zone. Plus, it is a good idea to think about the daylight saving time in advance. In general, it is not a great idea, there would be quite a lot of problems popping up, but for and it would be ok.

Vityata
  • 42,633
  • 8
  • 55
  • 100
  • Using the time is a good idea (+1)... but then it needs sub-millisecond granularity to be reliable, and should use UTC time, otherwise DST will make a whole set of values non-unique. Not sure about removing a random value from it though - that feels like it *guarantees* non-uniqueness. – Mathieu Guindon Feb 12 '20 at 19:12
  • Is this tested? Gives me Overflow. – BigBen Feb 12 '20 at 19:19
  • @MathieuGuindon - in general, it is possibly a good idea to put the unique string to the end of the random function. E.g., something like `BXD31F` + `UniqueString()`. Or even `BXD` + `UniqueString()` + `31F`, if the function producing the `BXD31F` always return a string with 6 chars. – Vityata Feb 12 '20 at 19:19
  • Ok, but a string that's constrained to 6 chars, *by definition*, cannot possibly be unique in any way. – Mathieu Guindon Feb 12 '20 at 19:22
  • @BigBen - I get `2E9D90135E` just pasting the code. – Vityata Feb 12 '20 at 19:22
  • What does `Format(Now, "YYMMDDHHMS")` return for you? Returns something like `200212142311` for me. – BigBen Feb 12 '20 at 19:23
  • @BigBen - 20021221246. – Vityata Feb 12 '20 at 19:24
  • @Vityata - `? Hex(20021221246 - 181387)` returns Overflow. – BigBen Feb 12 '20 at 19:25
  • @BigBen - Strange. I get a normal answer in the immediate window - `4A958D2F3`. But I am with 64 bits Excel, if that matters... – Vityata Feb 12 '20 at 19:26
  • 1
    Hmmm. That might be the culprit. I am on 32 bit. – BigBen Feb 12 '20 at 19:32
  • @BigBen - if you change the constant to `181387#` would it work? – Vityata Feb 12 '20 at 19:33
  • No that doesn't do it. Eh tbh I'm not sure it's worth our time. – BigBen Feb 12 '20 at 19:35
  • @BigBen - ok. If you find a way to make it work (idk how, probably `Hex$` (but I doubt, feel free to edit. – Vityata Feb 12 '20 at 19:39
  • 1
    No I'm guessing we're dealing with `LongLong` versus `Long` or something of that sort... seems like Hex bugs out on 32-bit with a number the size of `20021221246`. – BigBen Feb 12 '20 at 19:41
0

I managed to solve my own problem. We need to check to see if the URLID already exists in the table. The challenge is that the URLID is not written into the table until the query has completely executed. Using 6 of the possible 24 characters will give us about 191 million possibilities (24 to the power of 6). As we only need to create 5 million IDs, there is a small chance for duplicate records.

This is how I did it:

Step 1 - Generate Random a URLID for the 5 million rows using the original code

Step 2 - Identify duplicates and update to null using query below

 UPDATE URLIDs SET URLIDs.URL = Null
 WHERE (((URLIDs.URL) In (SELECT [URL] FROM [URLIDs] As Tmp GROUP BY [URL] HAVING 
 Count(*)>1 )));

Step 3 - Generate new URLID for the nulls identified in Step 2. This time, checking to see if they already exist in the table. See code below:

Public Function getURLID(roll As Double) As String
Randomize
Dim rgch As String
rgch = "ABCDEFGHJKLMNPQRSTUVWXYZ"
Dim i As Long

For i = 1 To 6
        getURLID = getURLID & Mid$(rgch, Int(Rnd() * Len(rgch) + 1), 1)
Next

Do Until URLIDExists(getURLID) = False
    getURLID = ""

    For i = 1 To 6
        getURLID = getURLID & Mid$(rgch, Int(Rnd() * Len(rgch) + 1), 1)
    Next
Loop
End Function

Function below used to see if URL exists

Public Function URLIDExists(URLID As String) As Boolean
Dim RS1
Dim strQuery As String
strQuery = "SELECT * from [URLIDs] where [URL]='" & URLID & "'"
Set RS1 = CurrentDb.OpenRecordset(strQuery)
If RS1.RecordCount > 0 Then
URLIDExists = True
Else
URLIDExists = False
End If
Set RS1 = Nothing
End Function

I repeated steps 2 and 3 until there are were no more duplicates. Each time checking against the existence of the already confirmed URLID. Eventually there will be no more duplicate URLIDs.

DoubleA
  • 736
  • 1
  • 7
  • 23