66

I am having a problem generating a GUID for a string - for example:

Guid g = new Guid("Mehar");

How can I compute a GUID for "Mehar"? I am getting an exception.

J0e3gan
Mehar
  • What do you mean by "generating GUID for strings"? – Jon Skeet Feb 03 '10 at 09:30
  • What are you trying to do Mehar? `Guid(string)` receives a guid formatted string, like `{00000000-0000-0000-0000-000000000000}` – Rubens Farias Feb 03 '10 at 09:31
  • My question is: I need to generate a unique ID for a general string ("Mehar"), something like "fc098275-7af6-4780-9bee-624563ec5cb0". – Mehar Feb 03 '10 at 09:48
  • 1
    What are you trying to do? Are you trying to generate a unique value based on the string, in which case you want to hash, e.g. http://stackoverflow.com/questions/2112685/how-do-one-way-hash-functions-work – Unsliced Feb 03 '10 at 09:30
  • 1
    See also here: http://stackoverflow.com/questions/2642141/how-to-create-deterministic-guids – RenniePet May 16 '17 at 21:48

9 Answers

138

This thread is quite old, but this is how we solved the problem:

Since a Guid in the .NET Framework is just an arbitrary 16 bytes (128 bits), you can compute a Guid from any string by applying a hash function that produces a 16-byte digest and passing the result to the Guid constructor.

We decided to use the MD5 hash function. Example code could look like this:

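// Requires: using System; using System.Security.Cryptography; using System.Text;
string input = "asdfasdf";
Guid result;
using (MD5 md5 = MD5.Create())
{
    // MD5 yields exactly 16 bytes, the size the Guid(byte[]) constructor requires.
    byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(input));
    result = new Guid(hash);
}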

Please note that this way of generating Guids has flaws of its own, because it depends on the quality of the hash function. If the hash function produces equal hashes for many of the strings you use, it will affect the behaviour of your software.

Here is a list of the most popular hash functions that produce a 128-bit digest, with the approximate cost of the best known collision attack:

  • RIPEMD (about 2^18 operations)
  • MD4 (collisions can be found practically at will)
  • MD5 (about 2^20.96 operations)

Note that you can also use hash functions that produce larger digests and simply truncate the result, so it may be smart to use a newer hash function. To list some:

  • SHA-1
  • SHA-2
  • SHA-3

As of August 2013, the 160-bit SHA-1 hash can be considered a good choice.
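The truncation idea mentioned above could be sketched as follows. This is only an illustration, not the answer's original code: the class name `HashGuid` and the method `FromString` are hypothetical, SHA-256 is an assumed choice of "larger digest", and no UUID version or variant bits are set.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashGuid
{
    // Hypothetical helper: builds a Guid from the first 16 bytes of SHA-256(input).
    public static Guid FromString(string input)
    {
        using (SHA256 sha256 = SHA256.Create())
        {
            // SHA-256 produces 32 bytes; a Guid only has room for 16.
            byte[] hash = sha256.ComputeHash(Encoding.UTF8.GetBytes(input));
            byte[] truncated = new byte[16];
            Array.Copy(hash, truncated, 16); // keep the leading 128 bits
            return new Guid(truncated);
        }
    }
}
```

Calling `HashGuid.FromString("Mehar")` twice returns the same Guid, while a different input such as `"mehar"` is expected to give a different one.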

Grzegorz Smulko
Nachbars Lumpi
  • 26
    If you're going to create a GUID out of MD5 hash data, you really should follow the standard, and *indicate* that this is a [**Type 3** guid](http://en.wikipedia.org/wiki/Universally_unique_identifier#Version_3_.28MD5_hash.29) - meaning the data comes from an MD5 hash. Type 3 GUIDs are of the form `xxxxxxxx-xxxx-3xxx-yxxx-xxxxxxxxxxxx`, where the **`3`** indicates `Type 3` and y is masked to `10xx`. You can also use SHA1 hashing (Type 5), where you change the 3 to a 5. – Ian Boyd Apr 18 '13 at 15:05
  • 1
    I would add Ian's comment to the answer to help clarify that you can't just transpose the hash into a GUID: certain bits in the GUID need special values. – Mihai Danila Jul 04 '13 at 14:41
  • 2
    We're talking here about a GUID not a UUID. Take note of the difference. – Nachbars Lumpi Aug 06 '13 at 07:03
  • 8
    Nachbars, it would seem that GUID is Microsoft's implementation of UUID, which would imply that the UUID specification applies -- http://stackoverflow.com/questions/246930/is-there-any-difference-between-a-guid-and-a-uuid – Mihai Danila Nov 17 '14 at 19:48
  • So long as you use the same algorithm consistently, it's ok. https://tools.ietf.org/html/rfc4122#section-4.3 – Nachbars Lumpi Jul 01 '15 at 08:50
  • 2
    outside of providing info to a 3rd-party consumer as to which hash algorithm was used to create the `Guid` , why is it important to conform certain bits , e.g. Type 3 Guid ? what role does this info play in comparisons ? – BaltoStar Apr 11 '18 at 02:31
  • The bit flags provide a way to tell which generator function was used to produce the UUID. Version 1 UUIDs are generated with a time component. Version 3 and version 5 UUIDs are generated from "namespace" and "name" components, and thus are suitable for entities. Version 4 UUIDs are based on random number generators and thus have no namespace. Knowing the generator matters because each UUID generator has its own benefits and drawbacks, and which one suits best depends on the use case. – Nachbars Lumpi Jul 17 '18 at 08:02
  • 1
    Warning: MD5 APIs are incompatible with FIPS-enabled operating systems. Use another algorithm if you potentially need to be FIPS-compliant. – Chris Gillum Nov 09 '21 at 21:00
  • 1
    Please use a stable `Encoding` (preferably `Encoding.UTF8`), not `Encoding.Default`. – CodeAngry Feb 06 '22 at 18:47
18

I'm fairly sure you've confused System.Guid with wanting a hash (say, SHA-256) of a given string.

Note that, when selecting a cryptographically-secure hashing algorithm, MD5, SHA0 and SHA1 are all generally considered dead. SHA2 and up are still usable.

Noon Silk
  • How is SHA2 usable and SHA1 not, if, according to your link, SHA2 has the same status "weakened" as SHA1, only since a later date? – Ruslan Jan 12 '15 at 10:55
  • 3
    This should be a comment. A speculation about the author and a note. Does not answer the question. – Edward Olamisan Sep 09 '20 at 22:21
6

What you are probably looking for is a version 3 or version 5 UUID, which are name-based UUIDs (version 5 is the recommended one). I don't think the .NET Framework has built-in support for them. See http://en.wikipedia.org/wiki/Universally_Unique_Identifier

I did a few Google searches to see if I could find something in the Win32 API, but nothing came up. However, I am sure that the .NET Framework has an implementation hidden somewhere because, as far as I know, when you generate a COM object in .NET without supplying an explicit GUID, the framework generates a name-based UUID to create a well-defined ClassID and InterfaceID, i.e. UUIDs that don't change every time you recompile (like VB6). But this is probably hidden, so I guess you need to implement the algorithm yourself. Luckily, .NET provides both MD5 and SHA-1 implementations, so I don't think implementing version 3 and version 5 UUIDs should be too difficult.
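A minimal sketch of what such a hand-written version 5 implementation might look like, following RFC 4122 section 4.3, is shown here. The class name `NameBasedGuid` and its members are hypothetical, and the name is assumed to be encoded as UTF-8. Because `Guid` stores its first three fields in little-endian order, the sketch swaps them to network byte order before hashing and back again afterwards.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class NameBasedGuid
{
    // DNS namespace UUID listed in RFC 4122, Appendix C.
    public static readonly Guid DnsNamespace =
        new Guid("6ba7b810-9dad-11d1-80b4-00c04fd430c8");

    // Hypothetical helper: version 5 (SHA-1, name-based) UUID.
    public static Guid CreateV5(Guid namespaceId, string name)
    {
        byte[] ns = namespaceId.ToByteArray();
        SwapFields(ns); // .NET layout -> network byte order
        byte[] nameBytes = Encoding.UTF8.GetBytes(name); // assumption: UTF-8 names

        // Hash the namespace bytes followed by the name bytes.
        byte[] data = new byte[ns.Length + nameBytes.Length];
        Buffer.BlockCopy(ns, 0, data, 0, ns.Length);
        Buffer.BlockCopy(nameBytes, 0, data, ns.Length, nameBytes.Length);

        byte[] hash;
        using (SHA1 sha1 = SHA1.Create())
        {
            hash = sha1.ComputeHash(data); // 20 bytes
        }

        byte[] result = new byte[16];
        Array.Copy(hash, result, 16);
        result[6] = (byte)((result[6] & 0x0F) | 0x50); // version nibble = 5
        result[8] = (byte)((result[8] & 0x3F) | 0x80); // variant bits = 10xx
        SwapFields(result); // network byte order -> .NET layout
        return new Guid(result);
    }

    // Reverses the three little-endian fields of the .NET Guid byte layout.
    private static void SwapFields(byte[] b)
    {
        Array.Reverse(b, 0, 4);
        Array.Reverse(b, 4, 2);
        Array.Reverse(b, 6, 2);
    }
}
```

Under these assumptions, `NameBasedGuid.CreateV5(NameBasedGuid.DnsNamespace, "python.org")` should yield `886313e1-3b8a-5372-9b90-0c9aee199e5d`, the value the Python documentation lists for `uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')`. A version 3 variant would swap SHA-1 for MD5 and `0x50` for `0x30`.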

Pete
5

I think you have a misunderstanding of what a Guid actually is. There is no Guid representation of a string such as "Mehar".

The reason there is a new Guid(String s) overload is so that you can create a guid from a typical string representation of one, such as "00000000-0000-0000-0000-000000000000".
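To illustrate the difference, here is a small sketch using `Guid.TryParse` (available since .NET Framework 4). The class and method names are hypothetical. The constructor `new Guid("Mehar")` throws a `FormatException` for the same reason that `TryParse` returns false here.

```csharp
using System;

public static class GuidParsingDemo
{
    // Hypothetical helper: true only if the text is already a formatted GUID.
    public static bool IsGuidText(string text)
    {
        Guid parsed;
        return Guid.TryParse(text, out parsed);
    }
}
```

`GuidParsingDemo.IsGuidText("Mehar")` returns false, while `GuidParsingDemo.IsGuidText("00000000-0000-0000-0000-000000000000")` returns true.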

See the wiki article for more information on what a Guid actually is.

http://en.wikipedia.org/wiki/Globally_Unique_Identifier

Robin Day
4

In general, there are a few ways to make a universally unique ID (UUID per RFC 4122, a.k.a. GUID). We could borrow these four from Python and build something similar in C#:

uuid.uuid1([node[, clock_seq]])

Generate a UUID from a host ID, sequence number, and the current time. If node is not given, getnode() is used to obtain the hardware address. If clock_seq is given, it is used as the sequence number; otherwise a random 14-bit sequence number is chosen.

uuid.uuid3(namespace, name)

Generate a UUID based on the MD5 hash of a namespace identifier (which is a UUID) and a name (which is a string).

uuid.uuid4()

Generate a random UUID.

uuid.uuid5(namespace, name)

Generate a UUID based on the SHA-1 hash of a namespace identifier (which is a UUID) and a name (which is a string).

So if you need an ID for a string as an object rather than an ID for a value, you should combine a private UUID of your own with the given string: generate the private UUID once using uuid1, and then use it as the namespace for uuid3 or uuid5.

These variants and versions are described on Wikipedia: Universally_unique_identifier#Variants_and_versions

user2622016
4

You cannot use Guid that way. The Guid constructor expects a valid string representation of a Guid.

What you're looking for is called a hash function (for example, MD5).

Kobi
4

Here is my own approach. Where possible, I intentionally store the string as a hex dump: you can at least see how long the string is and, if needed, decode it with an online hex converter. If the string is too long (16 bytes or more), I compute a SHA-1 hash and generate the Guid from that instead.

// Requires: using System; using System.Security.Cryptography; using System.Text;
/// <summary>
/// Generates a Guid string based on a string. The key assumption is that the name is unique wherever it is used.
/// If the name is shorter than 16 bytes in UTF-8, its bytes are copied directly into the guid; otherwise we compute
/// the SHA-1 hash of the string and use its first 16 bytes.
/// </summary>
/// <param name="name">Name that is unique across wherever this guid will be used.</param>
/// <returns>For example "{706C7567-696E-7300-0000-000000000000}" for "plugins"</returns>
static public String GenerateGuid(String name)
{
    byte[] buf = Encoding.UTF8.GetBytes(name);
    byte[] guid = new byte[16];
    if (buf.Length < 16)
    {
        Array.Copy(buf, guid, buf.Length);
    }
    else
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(buf);
            // The hash is 20 bytes, but we need 16. We lose some "uniqueness", but I doubt it will be fatal.
            Array.Copy(hash, guid, 16);
        }
    }

    // Don't use the Guid(byte[]) constructor: it reads the first three fields as little-endian, which reorders bytes. We want to preserve the original string as a hex dump.
    String guidS = "{" + String.Format("{0:X2}{1:X2}{2:X2}{3:X2}-{4:X2}{5:X2}-{6:X2}{7:X2}-{8:X2}{9:X2}-{10:X2}{11:X2}{12:X2}{13:X2}{14:X2}{15:X2}", 
        guid[0], guid[1], guid[2], guid[3], guid[4], guid[5], guid[6], guid[7], guid[8], guid[9], guid[10], guid[11], guid[12], guid[13], guid[14], guid[15]) + "}";

    return guidS;
}
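The byte reordering that the comment in the code above warns about can be seen with a short sketch. The class and method names are hypothetical. The `Guid(byte[])` constructor treats the first 4 bytes, the next 2, and the 2 after that as little-endian integers, so they appear reversed in the string form.

```csharp
using System;

public static class GuidByteOrderDemo
{
    // Builds a Guid from the bytes 00 01 02 ... 0F and returns its string form.
    public static string Show()
    {
        byte[] bytes = new byte[16];
        for (int i = 0; i < 16; i++)
        {
            bytes[i] = (byte)i;
        }
        return new Guid(bytes).ToString();
    }
}
```

`GuidByteOrderDemo.Show()` returns `03020100-0504-0706-0809-0a0b0c0d0e0f`: the first three groups are byte-swapped and the last eight bytes are kept in order, which is why the answer formats the bytes by hand.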
TarmoPikaro
3

If the OP's intent is to create a UUID (Guid) from a string hash of some sort (MD5, SHA-1, etc.), I found this very similar question with a great answer:

https://stackoverflow.com/a/5657517/430885

It has a link to a github-snippet based on RFC 4122 §4.3, that will create a Guid from a string and a namespace (which you can choose for yourself to guarantee against collisions from outside environments).

Direct link to the snippet: https://github.com/LogosBible/Logos.Utility/blob/master/src/Logos.Utility/GuidUtility.cs

Frederik Struck-Schøning
-2

Guids are random, they are not intrinsically assigned to any string or other value.

If you need such linking, store the guids in a Dictionary and check for an existing guid first before creating a new one.
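A minimal sketch of that lookup approach follows, with a hypothetical `GuidRegistry` class. It is not thread-safe, and the mapping only survives as long as the dictionary does, so the same string gets a different Guid in another process unless the mapping is persisted.

```csharp
using System;
using System.Collections.Generic;

public sealed class GuidRegistry
{
    // Maps each string key to the random Guid first assigned to it.
    private readonly Dictionary<string, Guid> _ids =
        new Dictionary<string, Guid>(StringComparer.Ordinal);

    // Returns the existing Guid for the key, or creates and stores a new one.
    public Guid GetOrCreate(string key)
    {
        Guid id;
        if (!_ids.TryGetValue(key, out id))
        {
            id = Guid.NewGuid();
            _ids[key] = id;
        }
        return id;
    }
}
```

With `var registry = new GuidRegistry();`, two calls to `registry.GetOrCreate("Mehar")` return the same value.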

Skrim
  • 3
    Guids aren't completely random (or even mostly random, IIRC). They follow a strict format so that they can, indeed, be globally unique, not just "probably" unique :) – Noon Silk Feb 03 '10 at 09:34
  • 2
    -1: Only version 4 GUIDs are random. Version 3 GUIDs and Version 5 GUIDs are, in fact, intrinsically assigned to a string. – David Cary Aug 29 '12 at 17:37
  • @DavidCary Where are the GUID versions described? Are you sure you are not thinking about UUID? – Taemyr Aug 17 '16 at 09:01
  • @Taemyr: GUID versions are described in Wikipedia's ["Globally unique identifier"](https://en.wikipedia.org/wiki/globally_unique_identifier) article. Yes, I was thinking about UUIDs, but that article says "GUIDs and RFC 4122 UUIDs should be identical when displayed textually." and implies that each GUID version algorithm is the same as the corresponding UUID version algorithm. – David Cary Aug 17 '16 at 12:32
  • @Taemyr: ["Is there any difference between a GUID and a UUID?"](http://stackoverflow.com/questions/246930/is-there-any-difference-between-a-guid-and-a-uuid/6953207#6953207). – David Cary Aug 17 '16 at 13:30
  • @DavidCary True. But GUIDs are frequently a specific UUID implementation. In particular, the .NET-generated Guids are version 4 GUIDs and hence random. – Taemyr Aug 17 '16 at 13:30