To put the question into some context, the system exposing the web service uses GUIDs internally as identifiers for all entities.

In that case, when designing a public-facing data integration web service (used mainly for importing data into the system from external systems, or exporting data to them), what would you consider the pros and cons of using the internal identifiers in the service interface?

With such a solution, the export methods of the web service would return DTOs identified by GUIDs, and the import methods would accept similar DTOs. I assume the external system would be responsible for generating new GUIDs for new entities.
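To make the import side concrete, here is a minimal, language-agnostic sketch in Python (not WCF code). The function name `import_items`, the dictionary-based store and the `id`/`name` fields are all hypothetical; the only assumption taken from the question is that the external system supplies the GUID itself.

```python
import uuid

NIL_GUID = uuid.UUID(int=0)  # all-zero GUID, treated here as "no identifier"


def import_items(store, items):
    """Upsert client-supplied items into `store`, keyed by their GUID.

    `store` is a plain dict standing in for the system's persistence layer
    (hypothetical). Returns the lists of created, updated and rejected items.
    """
    created, updated, rejected = [], [], []
    for item in items:
        try:
            key = uuid.UUID(item["id"])  # parse the client-generated GUID
        except (KeyError, ValueError, TypeError, AttributeError):
            rejected.append(item)  # missing or malformed identifier
            continue
        if key == NIL_GUID:
            rejected.append(item)  # an uninitialised GUID is almost always a bug
            continue
        # The service cannot tell "new entity" from "update" except by lookup,
        # so a reused GUID silently becomes an update.
        (updated if key in store else created).append(item)
        store[key] = item
    return created, updated, rejected
```

The sketch shows two things the service has to decide once callers mint identifiers: how to treat malformed or empty GUIDs, and whether a GUID that already exists means "update" or "conflict".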

What are the potential issues one might face with this approach?

The technology stack for the system and web service is .NET / WCF / SOAP

Michał Drozdowicz
    Unless you're very lucky with your data, you have to come up with 1) an alternative set of identifiers (might or might not have to be human-readable) which 2) are unique and 3) have to be translated to and from your internal ones. If they don't have to be human-readable, and assuming you have too much data to manually choose new IDs and can't guarantee uniqueness of other data fields, I see no problem with using a big (say, 128-bit? :P) number as your external ID. – anton.burger Jun 03 '11 at 21:28

1 Answer

First, let's look at the more generic "how do I set up a public API" question. My first exercise is determining what information the consumer of the service needs. I also check whether there is any company-specific naming in the object model. I then create a service model (a data contract, if you want the WCF-specific term) that matches the view I want to show the consumer. This includes a unique key, which is more often a SKU string (a human-readable key) than a GUID/int (the actual derived primary key), because the SKU is public and the means of storage in the database is not. So, in general, I would not expose these primary key concepts, if that is what the GUID is.
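As a rough illustration of that separation, here is a minimal sketch in Python rather than WCF. The names `ProductEntity`, `ProductDto`, `ProductStore` and their fields are hypothetical; the point is only that the service model carries the public SKU while the GUID stays inside the system.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductEntity:
    """Internal entity: keyed by a GUID that never leaves the system."""
    id: uuid.UUID
    sku: str
    name: str


@dataclass(frozen=True)
class ProductDto:
    """Service model shown to consumers: keyed by the public SKU only."""
    sku: str
    name: str


class ProductStore:
    """In-memory stand-in for the database (hypothetical)."""

    def __init__(self):
        self._by_id = {}       # internal GUID -> entity
        self._id_by_sku = {}   # public SKU -> internal GUID

    def add(self, entity):
        if entity.sku in self._id_by_sku:
            raise ValueError(f"duplicate SKU: {entity.sku}")
        self._by_id[entity.id] = entity
        self._id_by_sku[entity.sku] = entity.id

    def find_by_sku(self, sku):
        """Translate the public key back to the internal entity."""
        return self._by_id[self._id_by_sku[sku]]

    def export_all(self):
        """Return the consumer-facing view, without internal identifiers."""
        return [ProductDto(sku=e.sku, name=e.name) for e in self._by_id.values()]
```

The cost of this design is the extra lookup table and the requirement that the public key is genuinely unique, which is the trade-off the comment on the question points out.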

Now to the question of "do you see problems with this approach". I will focus on more general concepts so you can make a more informed decision, as there is no 100% right/wrong answer.

As long as this is machine to machine and the use of the GUID is something both systems are aware of, I see nothing particularly scary about this approach. If this ultimately goes to a human-readable system where people have to interact with the GUID, then you have an issue.

One potential issue with this approach is exposing your own primary key information to customer or client systems, which should not need to understand this level of detail. If this is actually "semi-public" with a select list of vendors, the "risk" might be less. This is the primary issue I see.

One could argue the weight of the GUID (128 bits) versus a smaller identifier, but this is a bogus argument, IMO, as network latency more than outweighs the cost of sending a few more bytes as a request parameter.
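For a sense of scale, the arithmetic can be checked with a few lines of Python. The sample GUID is arbitrary, and the comparison against a 32-bit integer is an assumption about what the "smaller identifier" would be.

```python
import uuid

guid = uuid.UUID("12345678-1234-5678-1234-567812345678")  # arbitrary sample value

binary_size = len(guid.bytes)        # raw GUID: 16 bytes = 128 bits
text_size = len(str(guid))           # canonical hyphenated text form: 36 characters
max_int32_text = len(str(2**31 - 1)) # largest 32-bit signed int as text: 10 characters

# SOAP carries values as text inside XML, so the text forms are what matter.
extra_chars_per_id = text_size - max_int32_text

print(binary_size, text_size, extra_chars_per_id)  # prints "16 36 26"
```

So the GUID costs at most a few dozen extra characters per identifier compared with an integer key, on top of whatever the surrounding XML envelope already adds.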

Gregory A Beamer
  • Thanks for the thoughtful answer. Indeed, this service is not open to the public, but will be used strictly within the customer's IT infrastructure. In the vast majority of cases (probably excluding support work) it will be machine-to-machine. I agree that in general exposing such private, low-level data is not good practice, but in a closed environment the issues might not be that severe. – Michał Drozdowicz Jun 06 '11 at 09:18