Here's my approach that guarantees uniqueness and introduces some randomness.
- Use a sequence generator that is guaranteed to give a unique number. Since you're working with SQL Server, this can be an IDENTITY column's value. You could alternatively increment an application-level value within your C# code to achieve this.
- Generate a random integer to bring in some randomness to the result. This could be done with Random.Next() and any seed, even the number generated in the preceding step.
- Use a method EncodeInt32AsString to convert the integers from the previous two steps into two strings (one is the unique string, one the random string). The method returns a string composed of only the allowed characters specified in the method. The logic of this method is similar to how number conversion between different bases takes place (for example, change the allowed string to only 0-9, or only 0-9A-F to get the decimal/hex representations). Therefore, the result is a "number" composed of the "digits" in allowedList.
- Concatenate the strings returned. Keep the entire unique string as-is (to guarantee uniqueness) and append as many characters from the random string as are needed to reach the desired total length. If required, the concatenation can be fancier, for example by injecting characters from the random string at random points into the unique string.
By retaining the entire unique string, this ensures uniqueness of the final result.
By using a random string, this introduces randomness. The amount of randomness is limited when the target length is only slightly larger than the length of the unique string, since few positions are then left for random characters.
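The base-conversion idea behind the encoding step can be sketched as follows. This is a minimal illustration, not the method used in this answer: the class name BaseConversionSketch and the helper EncodeWithAlphabet are hypothetical, and the alphabet is passed in as a parameter so that the decimal and hex cases can be tried directly.

```csharp
using System;
using System.Text;

static class BaseConversionSketch
{
    // Hypothetical helper (not part of the answer's code): writes a non-negative
    // value using the characters of 'alphabet' as the digits of the target base.
    public static string EncodeWithAlphabet(int value, string alphabet)
    {
        if (alphabet == null || alphabet.Length < 2)
            throw new ArgumentException("The alphabet needs at least two characters.", nameof(alphabet));
        if (value < 0)
            throw new ArgumentOutOfRangeException(nameof(value), "Only non-negative values are supported.");
        if (value == 0)
            return alphabet[0].ToString();

        var digits = new StringBuilder();
        while (value > 0)
        {
            // Each remainder is the next digit, least significant first,
            // so it is inserted at the front of the result.
            digits.Insert(0, alphabet[value % alphabet.Length]);
            value /= alphabet.Length;
        }
        return digits.ToString();
    }
}
```

With the alphabet "0123456789" the helper reproduces the decimal representation, with "0123456789ABCDEF" it gives hex (255 becomes FF), and with 0-9 followed by A-Z it gives the 36-character encoding discussed here.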
In my testing, calling EncodeInt32AsString for Int32.MaxValue returns a unique string 6 characters long:
2147483647: ZIK0ZJ
On that basis, a target string length of 12 will be ideal, though 10 is also reasonable.
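The 6-character figure follows from the size of the alphabet. As a quick check of the arithmetic, with 36 allowed characters:

```latex
36^5 = 60\,466\,176 \;<\; 2^{31} - 1 = 2\,147\,483\,647 \;<\; 36^6 = 2\,176\,782\,336
```

So 5 characters cannot represent every positive Int32 value, while 6 always suffice. A target length of 10 then leaves 4 characters for the random part (at most 36^4 = 1,679,616 combinations), and a target length of 12 leaves 6.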
The EncodeInt32AsString Method
    // Requires: using System; using System.Text;

    /// <summary>
    /// Encodes the 'input' parameter into a string of characters defined by the allowed list (0-9, A-Z)
    /// </summary>
    /// <param name="input">Integer that is to be encoded as a string. Only positive values are
    /// encoded; zero and negative values produce no digits (an empty string, or padding only)</param>
    /// <param name="maxLength">If zero, the string is returned as-is. If non-zero, the string is
    /// left-padded with '0' or truncated (keeping the leading characters) to exactly this length</param>
    /// <returns>The encoded string</returns>
    static String EncodeInt32AsString(Int32 input, Int32 maxLength = 0)
    {
        // List of characters allowed in the target string
        Char[] allowedList = new Char[] {
            '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
            'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
            'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T',
            'U', 'V', 'W', 'X', 'Y', 'Z' };
        Int32 allowedSize = allowedList.Length;
        StringBuilder result = new StringBuilder(input.ToString().Length);
        Int32 moduloResult;

        // Repeatedly divide by the base; each remainder is the next digit, least significant first
        while (input > 0)
        {
            moduloResult = input % allowedSize;
            input /= allowedSize;
            result.Insert(0, allowedList[moduloResult]);
        }

        // Left-pad with the "zero" digit so that short values reach the requested length
        if (maxLength > result.Length)
        {
            result.Insert(0, new String(allowedList[0], maxLength - result.Length));
        }

        if (maxLength > 0)
            return result.ToString().Substring(0, maxLength);
        else
            return result.ToString();
    }
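Because the encoding is an ordinary base conversion, it can be reversed. The following is a minimal sketch of an inverse, assuming the same 0-9, A-Z alphabet; the class Base36DecodeSketch and the method DecodeStringAsInt32 are hypothetical names and are not part of the original answer.

```csharp
using System;

static class Base36DecodeSketch
{
    // Same alphabet as allowedList in EncodeInt32AsString (assumed: 0-9 followed by A-Z).
    private const string Allowed = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Hypothetical inverse of EncodeInt32AsString: turns an encoded string
    // (with or without leading '0' padding) back into the original integer.
    public static int DecodeStringAsInt32(string encoded)
    {
        if (encoded == null) throw new ArgumentNullException(nameof(encoded));

        long value = 0;
        foreach (char c in encoded)
        {
            int digit = Allowed.IndexOf(c);
            if (digit < 0)
                throw new FormatException("Character '" + c + "' is not in the allowed list.");

            // Shift the accumulated value by one "digit" and add the new one.
            value = value * Allowed.Length + digit;
            if (value > int.MaxValue)
                throw new OverflowException("The encoded value does not fit in an Int32.");
        }
        return (int)value;
    }
}
```

For instance, ZIK0ZJ decodes back to 2147483647 and 000001 to 1, which is another way of seeing why two different positive inputs can never share a unique string.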
The GetRandomizedString Method
The preceding method only takes care of encoding an integer as a string. In order to achieve the uniqueness and randomness properties, the following logic (or similar) can be used.
In the comments, Kevin pointed out the following risk with the implementation of the EncodeInt32AsString method:
The code needs to be tweaked so that it returns a fixed-length string.
Otherwise, you can never be guaranteed of the final result is unique.
If it helps, picture one value generating ABCDE (Unique) +
F8CV1 (Random)... and then later on, another value generating
ABCDEF (Unique) + 8CV1 (Random). Both values are ABCDEF8CV1
This is a very valid point, and this has been addressed in the following GetRandomizedString method, by specifying lengths for the unique and random strings. The EncodeInt32AsString method has also been modified to pad out the return value to a specified length.
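Kevin's collision can be reproduced in a few lines. This is only an illustrative sketch: the class CollisionSketch and both helper methods are hypothetical, and the input strings are the ones from his comment.

```csharp
using System;

static class CollisionSketch
{
    // Plain concatenation of variable-length parts: the boundary between the
    // unique part and the random part is lost, so collisions are possible.
    public static string Concatenate(string uniquePart, string randomPart)
    {
        return uniquePart + randomPart;
    }

    // Fixed-width variant: the unique part is left-padded with '0' to a known
    // length, so its boundary is always at the same position.
    public static string ConcatenateFixed(string uniquePart, string randomPart, int uniqueLength)
    {
        if (uniquePart.Length > uniqueLength)
            throw new ArgumentException("The unique part is longer than the fixed length.", nameof(uniquePart));
        return uniquePart.PadLeft(uniqueLength, '0') + randomPart;
    }
}
```

Concatenate("ABCDE", "F8CV1") and Concatenate("ABCDEF", "8CV1") both give ABCDEF8CV1, whereas ConcatenateFixed with a unique length of 6 gives 0ABCDEF8CV and ABCDEF8CV1 for 4-character random parts F8CV and 8CV1, so the two values stay distinct.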
    // Returns a string that is the encoded representation of the input number, and a random value
    static String GetRandomizedString(Int32 input)
    {
        Int32 uniqueLength = 6; // Length of the unique string (based on the input)
        Int32 randomLength = 4; // Length of the random string (based on the RNG)
        String uniqueString;
        String randomString;
        StringBuilder resultString = new StringBuilder(uniqueLength + randomLength);

        // This might not be the best way of seeding the RNG, so feel free to replace it with better alternatives.
        // Here, the seed is based on the ratio of the current time and the input number. The ratio is flipped
        // around (i.e. it is either M/N or N/M) so that the integer division does not always yield zero.
        // The clock is read once, and the divisor is computed as an Int64 and kept non-zero, so that
        // input values of -1 and Int32.MaxValue cannot cause a division by zero or an overflow.
        // Casting an expression with Ticks (Long) to Int32 results in truncation, which is fine since this is
        // only a seed for an RNG; 'unchecked' keeps the cast from throwing in a checked build.
        Int64 ticks = DateTime.Now.Ticks;
        Int64 divisor = (Int64)input + 1;
        if (divisor == 0) divisor = 1;
        Random randomizer = new Random(
            unchecked((Int32)(
                ticks + (ticks > input ? ticks / divisor : input / ticks)
            ))
        );

        // Get a random number and encode it as a string, limit its length to 'randomLength'
        randomString = EncodeInt32AsString(randomizer.Next(1, Int32.MaxValue), randomLength);
        // Encode the input number and limit its length to 'uniqueLength'
        uniqueString = EncodeInt32AsString(input, uniqueLength);

        // For debugging/display purposes alone: show the 2 constituent parts
        // (remove this line to return only the combined string)
        resultString.AppendFormat("{0}\t {1}\t ", uniqueString, randomString);

        // Take successive characters from the unique and random strings and
        // alternate them in the output
        for (Int32 i = 0; i < Math.Min(uniqueLength, randomLength); i++)
        {
            resultString.AppendFormat("{0}{1}", uniqueString[i], randomString[i]);
        }
        // Append whatever is left of the longer of the two strings
        resultString.Append((uniqueLength < randomLength ? randomString : uniqueString).Substring(Math.Min(uniqueLength, randomLength)));

        return resultString.ToString();
    }
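Since the code comments invite a better way of seeding the RNG, here is one possible alternative, offered as a sketch rather than a recommendation: keep a single Random instance for the lifetime of the application instead of creating a new, clock-seeded instance per call. The class SharedRandomSketch and the method NextPositive are hypothetical names. System.Random is not thread-safe, hence the lock.

```csharp
using System;

static class SharedRandomSketch
{
    // One generator for the whole application, seeded once. Calls made within the
    // same clock tick therefore still receive different values.
    private static readonly Random Rng = new Random();
    private static readonly object Gate = new object();

    // Returns a value in the range [1, Int32.MaxValue), the same range that
    // GetRandomizedString requests from its own Random instance.
    public static int NextPositive()
    {
        lock (Gate)
        {
            return Rng.Next(1, Int32.MaxValue);
        }
    }
}
```

The call randomizer.Next(1, Int32.MaxValue) in GetRandomizedString could then be swapped for SharedRandomSketch.NextPositive(). Neither variant is cryptographically secure, so this only makes values harder to guess casually.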
Sample Output
Calling the above method for a variety of input values results in:
    Input Int    Unique String   Random String   Combined String
    ----------   -------------   -------------   ---------------
    -10          000000          CRJM            0C0R0J0M00
    0            000000          33VT            03030V0T00
    1            000001          DEQK            0D0E0Q0K01
    2147         0001NN          6IU8            060I0U18NN
    21474        000GKI          VNOA            0V0N0OGAKI
    214748       004LP8          REVP            0R0E4VLPP8
    2147483      01A10B          RPUM            0R1PAU1M0B
    21474836     0CSA38          RNL5            0RCNSLA538
    214748364    3JUSWC          EP3U            3EJPU3SUWC
    2147483647   ZIK0ZJ          BM2X            ZBIMK20XZJ
    1            000001          QTAF            0Q0T0A0F01
    2            000002          GTDT            0G0T0D0T02
    3            000003          YMEA            0Y0M0E0A03
    4            000004          P2EK            0P020E0K04
    5            000005          17CT            01070C0T05
    6            000006          WH12            0W0H010206
    7            000007          SHP0            0S0H0P0007
    8            000008          DDNM            0D0D0N0M08
    9            000009          192O            0109020O09
    10           00000A          KOLD            0K0O0L0D0A
    11           00000B          YUIN            0Y0U0I0N0B
    12           00000C          D8IO            0D080I0O0C
    13           00000D          KGB7            0K0G0B070D
    14           00000E          HROI            0H0R0O0I0E
    15           00000F          AGBT            0A0G0B0T0F
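As can be seen above, the unique string is predictable for sequential numbers, since it is just the same number represented in a different base. However, the random string adds some entropy, which makes it harder for users to guess subsequent values. Moreover, interleaving the "digits" of the unique string and the random string makes it slightly more difficult for users to observe any pattern. Note that the inputs -10 and 0 both yield the unique string 000000: the encoding loop only handles positive values, so the uniqueness guarantee holds only for positive inputs (for example, IDENTITY values that start at 1).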
In the above example, the length of the unique string is set to 6 (since that allows it to represent Int32.MaxValue), but the length of the random string is set to 4 because the OP wanted a total length of 10 characters.
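Because both lengths are fixed, the unique part can always be recovered from the 10-character combined string, which is what keeps the final result unique. The following sketch shows the reverse of the interleaving step, assuming the 6 + 4 layout described above and a combined string without the debugging prefix; the class DeinterleaveSketch and the method ExtractUniquePart are hypothetical names.

```csharp
using System;
using System.Text;

static class DeinterleaveSketch
{
    // Hypothetical helper: undoes the alternation performed by GetRandomizedString
    // and returns only the characters that came from the unique string.
    public static string ExtractUniquePart(string combined, int uniqueLength = 6, int randomLength = 4)
    {
        if (combined == null) throw new ArgumentNullException(nameof(combined));
        if (combined.Length != uniqueLength + randomLength)
            throw new ArgumentException("Unexpected length for a combined string.", nameof(combined));

        int shared = Math.Min(uniqueLength, randomLength);
        var unique = new StringBuilder(uniqueLength);

        // In the interleaved section, characters at even positions come from the unique string.
        for (int i = 0; i < shared; i++)
        {
            unique.Append(combined[2 * i]);
        }

        // If the unique string is the longer one, its remaining characters form the tail.
        if (uniqueLength > randomLength)
        {
            unique.Append(combined.Substring(2 * shared));
        }
        return unique.ToString();
    }
}
```

Applied to the combined strings from the sample output, ZBIMK20XZJ yields ZIK0ZJ and 060I0U18NN yields 0001NN, matching the Unique String column.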