
When is an API NOT RPC?

After a long, long discussion on Twitter regarding API design, I want to try to come to a clear answer on when an API is not RPC-based.

There seems to be quite a bit of confusion around this: "Is That REST API Really RPC? Roy Fielding Seems to Think So".

That specific link is about REST and RPC. My question asks about RPC vs. not RPC in general, not in the context of HTTP.

The TL;DR definition of when an API IS RPC is "when it hides the network from the developer". Fair enough.

But back to when it is not.

The Twitter discussion focused on the HttpClient as not being RPC.

My argument is that the HttpClient does not really model the network. There is nothing in there modelling latency. There is also a very weak representation of network failures. Most status codes in HTTP are not network related, e.g. NotFound, Payment Required etc.

Depending on language and SDK, the HttpClient may take a string or a URI. In the case where it takes a string representing the URI, that also doesn't really communicate that there is a network involved; you would need to know what the string represents and what a URI is.

If you had never seen HTTP before, how would you know by looking at that specific API? There is an API consuming a string, and returning an object either containing a string or some int response codes.

How would you know?

Most developers of course know what HTTP is, but from a strict definition point of view, how would you define an API as not RPC?

Would a checked exception in Java "NetworkException" be enough to clearly communicate that there is a network involved? If the functions you are calling have descriptive names like "SayHelloOverNetwork" would that be enough to make it not RPC?

And exactly when is a procedure a procedure in remote terms? All network communication will result in some code being run on the receiving system.

If we take a person who has never been in touch with technology or protocols, and teach this person a programming language, what would be the definition of "Not RPC" this person could employ to spot what is RPC and what is not?

I am being dense and possibly silly here, but I am trying to find the essence of "Not RPC", what exactly is required for this?

Does it, for example, need to be non-blocking in order to play nice with unknown latency? Or could a non-RPC API be blocking? IMO, if it is blocking, it hides that there will be latency involved, and thus falls into the "Latency is zero" fallacy, meaning it hides the network.

This all seems to be super obvious to everyone else but me, but no one has yet shown a concise answer on what the requirements for not being RPC are.

Roger Johansson
  • Take a look at the [Windows API](https://en.wikipedia.org/wiki/Windows_API): it is not RPC, though it gives you the possibility to call remote procedures via its own protocol (not HTTP-based) – Iłya Bursov May 19 '17 at 22:34
  • Most of the criteria you mention have nothing to do with RPC. Have a look at CORBA and Sun-RPC. The essence of RPC is, err, Remote Procedure Call. Procedure call semantics to a remote host. Nothing about latency, network modeling, ... It is supposed to hide all that from you. Your question appears to be based primarily on your own opinion of what RPC is, and it's wrong. – user207421 May 20 '17 at 00:09
  • The question was what _is not RPC_, that is, if RPC is supposed to hide network semantics from you, then an API that is not RPC should be the inverse, being explicit about network semantics. no? – Roger Johansson May 20 '17 at 00:15
  • @RogerAlsing No, it would be non-remote, i.e local, or have non-procedure-call semantics. – user207421 May 20 '17 at 00:59
  • @EJP and that is still the question, how does the "non-procedure-call semantics" look? calling `httpClient.Get(...)` how do you know that that is not a remote proc called "Get" on the other end, if you didn't know about HTTP when looking at that specific API? – Roger Johansson May 23 '17 at 09:43

3 Answers


To me it looks like you're putting all your eggs into one basket. Let's start from the very beginning:

API is application programming interface

it is a set of clearly defined methods of communication between various software components

The only defining property of an API is that it is a defined/documented way of communication.

If one application/module/component A can use/call another application/module/component B somehow, it means that B provides an API and A uses this API.

Usually there are two aspects of an API which must be defined:

  1. What exactly is passed into/returned from the component (it's your application logic)
  2. How exactly data is transferred/serialized (this is the technical implementation)

I'm not touching the "what" part, for obvious reasons.

Let's focus on different ways of "how":

  • push a 4-byte integer onto the stack, change the IP register and read a 4-byte integer output from the EAX register
  • connect to a socket, write data serialized as a byte array and read some response back as a byte array
  • call 911 from any phone, say your address and expect several cars on your driveway

In all these cases you're using some API = you're communicating with other components in some predefined way.
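To make the split between "what" and "how" concrete, here is a minimal sketch, not taken from the answer itself. The function `add_one` and the 4-byte wire format are hypothetical names chosen for illustration. The same logic is reached once by a direct call and once through a socket. For convenience the socket pair stays inside one process, so it only stands in for a truly remote peer.

```python
import socket
import struct
import threading


def add_one(value: int) -> int:
    """The 'what': application logic, independent of how it is reached."""
    return value + 1


def recv_exact(conn: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, since recv may return fewer."""
    data = b""
    while len(data) < size:
        chunk = conn.recv(size - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        data += chunk
    return data


def serve_once(conn: socket.socket) -> None:
    """One possible 'how': read a 4-byte integer, reply with a 4-byte integer."""
    with conn:
        (value,) = struct.unpack("!i", recv_exact(conn, 4))
        conn.sendall(struct.pack("!i", add_one(value)))


def add_one_over_socket(value: int) -> int:
    """Reach the same logic through a socket instead of a direct call."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()
    with client:
        client.sendall(struct.pack("!i", value))
        (result,) = struct.unpack("!i", recv_exact(client, 4))
    worker.join()
    return result


print(add_one(41))              # direct, in-process call
print(add_one_over_socket(41))  # same logic, different "how"
```

Both calls produce the same value; only the documented way of reaching the component differs.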

RPC is remote procedure call

computer program causes a procedure (subroutine) to execute in another address space

The only defining property of RPC is that data is passed across different/remote address spaces. Some architectures, for example x86, allow different address spaces on a single host. Since different physical hosts usually do not share an address space, any call across a network is RPC, but not vice versa.

Note: It is rare, but possible, to share a memory space across different physical hosts connected by a network. Strictly speaking, such communication is not RPC; let's omit such cases.

Any RPC call automatically means that you're calling some API, by definition. So we can say that RPC is part of the API's "how": it is the transport level. Since RPC itself does not define the actual mechanism, there can be very different implementations, for example shared memory, DMA, TCP/IP, etc.

I suppose you can now answer your question of when an API is not RPC-based: when the API says so. It is up to the API developer to define whether it should/can be called via RPC or not, and an API can define multiple ways of calling it.

As an antonym to RPC you can use "in-process".

So the phrase that an API IS RPC "when it hides the network from the developer" is absolute nonsense. An API must define the "how" section.

HTTP is hyper text transfer protocol

request–response protocol in the client–server computing model

The only defining property of HTTP is that it describes the protocol/format of the request (input arguments) and the response.

You can use HTTP in a non-RPC API. For example, I prefer to think of HTTP as a file format. So we can say that HTTP is another part of the API's "how": it defines the serialization part of the API, but does not dictate the transport level.
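The "HTTP as a file format" view can be illustrated with a small sketch that parses an HTTP response from plain bytes, with no network involved. It relies on the standard library's `http.client.HTTPResponse`, which only needs an object with a `makefile` method. The `BytesAsSocket` helper and the sample payload are assumptions made up for this example.

```python
from http.client import HTTPResponse
from io import BytesIO
from typing import Tuple


class BytesAsSocket:
    """Hypothetical stand-in exposing the one method HTTPResponse needs."""

    def __init__(self, payload: bytes) -> None:
        self._payload = payload

    def makefile(self, *args, **kwargs) -> BytesIO:
        return BytesIO(self._payload)


def parse_http_response(raw: bytes) -> Tuple[int, str, bytes]:
    """Decode status, reason and body from HTTP-formatted bytes."""
    response = HTTPResponse(BytesAsSocket(raw))
    response.begin()  # parses the status line and the headers
    return response.status, response.reason, response.read()


# These bytes could come from a file, a queue, or shared memory.
RAW = (
    b"HTTP/1.1 404 Not Found\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 9\r\n"
    b"\r\n"
    b"not here\n"
)

status, reason, body = parse_http_response(RAW)
print(status, reason, body)
```

The parser never learns where the bytes came from, which is the separation between serialization and transport described above.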

Note: Some RPC protocols actually define both transport and serialization.

So, HttpClient is a tool which allows you to invoke an API using HTTP-encoded requests/responses; usually such a library assumes RPC as the transport level. None of these terms mandates a network or any particular transport protocol. This is why an HTTP client should not declare any kind of network exceptions/errors, but it could throw HTTP errors as exceptions.

Note: Network exceptions could be thrown from a TCP/IP RPC implementation, for example, and the HTTP client library could pass them through to you. Unfortunately, some libraries couple HTTP with TCP/IP too tightly, and the border between the different responsibilities is crossed.

REST is representational state transfer

architectural style for distributed hypermedia systems

It is a very broad term; it defines a lot of things from different aspects and at very different levels. Most important:

  • HOW your API should be designed (usage of URI)
  • HOW your API should be implemented (stateless, HTTP verbs)
  • HOW your API should be called (client-server)

Client-server usually assumes cross-process communication, i.e. different address spaces. This is why we can say that REST mandates RPC as the invocation mechanism for the API, plus HTTP as the serialization format.


Now, I suppose you will be able to understand these answers:

How would you know?

When you give a client your REST API, it automatically defines some part of your API in terms of other protocols. I.e., to use a REST API you must read/know the HTTP protocol first. The API must define what kind of RPC must be used.

HttpClient does not really model the network.

It should not do this. It works with HTTP semantics.

you would need to know what the string represents and what a URI is

There are two URIs actually:

  • A URI is part of the API's "what" section; it defines a business object's location. It has nothing to do with the network or the DNS system. You do not need to understand it.
  • A URI could be part of the TCP/IP RPC requirements; in this case it represents a domain name/path. But some implementations can work with IP addresses, not URIs.

If the functions you are calling have descriptive names like "SayHelloOverNetwork" would that be enough to make it not RPC?

As I wrote before, we can always assume that communication over a network is RPC.

Does it for example need to be non blocking in order to play nice with unknown latency?

It's up to the API developer:

  1. An API can contain functions designed for asynchronous execution, but the client can call them in a blocking way
  2. An API can contain blocking functions, but the client can call them asynchronously
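
The two cases in the list above can be sketched with Python's `asyncio`. The functions `fetch_async` and `fetch_blocking` are hypothetical API functions, and the sleeps only stand in for waiting on a remote peer.

```python
import asyncio
import time


async def fetch_async(delay: float) -> str:
    """A hypothetical API function designed for asynchronous use."""
    await asyncio.sleep(delay)  # stand-in for waiting on a remote peer
    return "async result"


def fetch_blocking(delay: float) -> str:
    """A hypothetical API function that blocks until it has a result."""
    time.sleep(delay)
    return "blocking result"


def call_async_api_in_blocking_way() -> str:
    # Case 1: the caller simply waits until the coroutine has finished.
    return asyncio.run(fetch_async(0.01))


async def call_blocking_api_asynchronously() -> str:
    # Case 2: the blocking call runs in a worker thread, so the event
    # loop stays free to do other work in the meantime.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, fetch_blocking, 0.01)


print(call_async_api_in_blocking_way())
print(asyncio.run(call_blocking_api_asynchronously()))
```

In both cases the choice between blocking and non-blocking is made by the caller, not by whether the API is RPC.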
Iłya Bursov

An RPC is a network API with enough layers of abstraction on top. What's enough layers? That's a subjective thing.

Whether an API is an RPC or not doesn't, in my opinion, depend on method names or the names of exceptions/errors you need to handle. We're adults, we know that the call is done over the network. Naming it "SayHello" instead of "SayHelloOverNetwork" doesn't make a difference.

What does make a difference is all of those layers of abstraction - the less resource management and error handling you need to do, the more RPC-ish the code.

And about the specific HttpClient example - I'd say that today, developers doing high-level work consider HTTP to be a transport medium; an alternative to "plain old sockets".

So while a person who does not know what HTTP is might look at "a function that takes a string and returns an error code" as RPC, a modern developer would probably see it as "nitty gritty networking code". He would then say yuck, and put a few more layers of abstraction on top to make method calls from his business logic more RPC-ish. That way, the business logic would have to handle only the most extreme failures.
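
As a rough illustration of "a few more layers of abstraction", the sketch below wraps a low-level transport in a function that reads like a local call. Everything here is hypothetical: the `Transport` signature mirrors the "takes a string and returns an error code" shape from the question, and `say_hello`, `GreetingUnavailable` and the retry count are invented for the example.

```python
from typing import Callable, Tuple

# Hypothetical low-level transport: takes a URI string, returns (status, body).
Transport = Callable[[str], Tuple[int, str]]


class GreetingUnavailable(Exception):
    """The only failure the business logic still has to handle."""


def say_hello(name: str, transport: Transport, retries: int = 3) -> str:
    """RPC-ish wrapper: hides URIs, status codes and retries from the caller."""
    last_status = None
    for _ in range(retries):
        status, body = transport("/hello?name=" + name)
        if status == 200:
            return body
        last_status = status
    raise GreetingUnavailable(f"gave up after {retries} tries (last status {last_status})")


def make_flaky_transport(failures: int) -> Transport:
    """Test double that fails `failures` times before succeeding."""
    state = {"calls": 0}

    def transport(uri: str) -> Tuple[int, str]:
        state["calls"] += 1
        if state["calls"] <= failures:
            return 503, ""
        return 200, "Hello, " + uri.split("=", 1)[1]

    return transport


print(say_hello("Roger", make_flaky_transport(failures=2)))
```

The caller of `say_hello` does no resource management and sees only the most extreme failure, which is what makes it feel more RPC-ish than the transport underneath.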

Malt

A) If the documentation told you how to do what you want by calling the API, then you are calling an RPC-like API, not a REST API.

B) If the API itself told you how to call it to accomplish what you want, then that API is not very RPC-like, and might be a REST API.

Programming in an object oriented or a procedural style is like case (A) -- you know what to call and how to call it to accomplish what you want before you write the code to call it. When [I assume Roy F.] says that an RPC-like API hides the network from the developer, he means that the developer can continue to program in this way whether or not his calls are remote -- he doesn't have to care about the network.

When you call a REST API, however, you have to program differently, because you have to let the API tell you what you can do and how you can do it. That's what it means to be "hypertext-driven".

Being hypertext-driven means that your stuff will continue to work when the guy on the other side of the network, who doesn't know or care about your program at all, completely changes what you can do and how you have to do it. Note that the lack of any contract between you and the system you are calling is the fundamental feature of the network that RPC hides.
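
A minimal sketch of that idea, using an in-memory dictionary instead of a real server: the client knows only the entry point and the names of the link relations it cares about, and reads every URI from the responses. The resource layouts and the relation names `orders` and `create` are assumptions invented for this example.

```python
from typing import Dict, List

Resources = Dict[str, Dict[str, Dict[str, str]]]

# Hypothetical "server" state: each resource advertises what can be done next.
RESOURCES_V1: Resources = {
    "/": {"links": {"orders": "/orders"}},
    "/orders": {"links": {"create": "/orders/new"}},
    "/orders/new": {"links": {}},
}

# The other side later reorganizes its URIs without telling the client.
RESOURCES_V2: Resources = {
    "/": {"links": {"orders": "/v2/order-list"}},
    "/v2/order-list": {"links": {"create": "/v2/order-list/create"}},
    "/v2/order-list/create": {"links": {}},
}


def follow(resources: Resources, relations: List[str], start: str = "/") -> str:
    """Navigate by link relation names, never by hard-coded URIs."""
    uri = start
    for relation in relations:
        uri = resources[uri]["links"][relation]
    return uri


print(follow(RESOURCES_V1, ["orders", "create"]))
print(follow(RESOURCES_V2, ["orders", "create"]))
```

The same client code keeps working against both layouts, because it lets each response tell it where to go next.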

Matt Timmermans