Popular opinion seems to be that HttpClient should be used as a singleton.

If nothing else, the aim is to avoid the following: System.Net.Sockets.SocketException: Only one usage of each socket address (protocol/network address/port) is normally permitted. (If I'm not mistaken, this is a consequence of TCP connections being left open for 240 seconds.)
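
As a minimal sketch of the two usage patterns being contrasted here (the class name and the example.com URL are hypothetical, chosen only for illustration):

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public static class SocketExhaustionSketch
{
    // Hypothetical endpoint, used for illustration only.
    private const string Url = "https://example.com/";

    // Pattern that risks the SocketException under load: every call creates
    // a client with its own handler and connection pool. Disposing the client
    // closes its connections, and the closed sockets are not immediately
    // available for reuse.
    public static async Task<string> GetWithNewClientAsync()
    {
        using (var client = new HttpClient())
        {
            return await client.GetStringAsync(Url);
        }
    }

    // Commonly recommended alternative: one shared instance, so that
    // connections are pooled and reused across calls.
    private static readonly HttpClient SharedClient = new HttpClient();

    public static Task<string> GetWithSharedClientAsync()
    {
        return SharedClient.GetStringAsync(Url);
    }
}
```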

However...

It is considered good practice to use stateless services, particularly (but not exclusively) when doing TDD or DDD.

Our friend HttpClient, however, is not stateless. Most prominently, ClientCertificates and ServerCertificateCustomValidationCallback cannot be specified per request. They are set on the handler that is passed to the client's constructor, public HttpClient(HttpMessageHandler).
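
To illustrate where that state lives, here is a minimal sketch of a client whose certificate settings are fixed at construction time. The class name, the parameter names, and the thumbprint-pinning rule are assumptions made for this example, not a recommended validation policy.

```csharp
using System;
using System.Net.Http;
using System.Security.Cryptography.X509Certificates;

public static class CertificateClientSketch
{
    // Both parameters are hypothetical inputs supplied by the caller.
    public static HttpClient Create(X509Certificate2 clientCertificate, string expectedThumbprint)
    {
        var handler = new HttpClientHandler();

        // The client certificate is a property of the handler, not of a request.
        handler.ClientCertificateOptions = ClientCertificateOption.Manual;
        handler.ClientCertificates.Add(clientCertificate);

        // So is the server certificate validation. This illustrative rule
        // accepts only a certificate with the expected thumbprint.
        handler.ServerCertificateCustomValidationCallback =
            (request, certificate, chain, errors) =>
                certificate != null &&
                string.Equals(certificate.Thumbprint, expectedThumbprint, StringComparison.OrdinalIgnoreCase);

        // Every request sent through this client now shares the settings above.
        return new HttpClient(handler);
    }
}
```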

So having a single HttpClient is impossible if we ever need either of the above. Alternatively, having an instance per distinct { protocol, host, port } may be possible, but forces us to bend over backwards in managing these instances, not to mention resting on the assumption that each call to such an endpoint uses the exact same certificates/validation.

Indeed, HttpClient is forcing us to pick our poison:

  • Use a limited number of HttpClients and introduce state in our services, or
  • Instantiate HttpClients on the fly and risk running into the dreaded SocketException.

How do we have our cake and eat it too? How can we keep our services stateless without HttpClient braying like a donkey?

Edit: Let it also be noted that long-lived HttpClient instances do not respect the DNS Time To Live (TTL) setting, i.e. their pooled connections never pick up DNS updates.
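
One possible mitigation for the DNS issue, sketched under the assumption that the code runs on .NET Core 2.1 or later (where SocketsHttpHandler exists), is to cap the lifetime of pooled connections. The class name and the two-minute value are illustrative choices.

```csharp
using System;
using System.Net.Http;

public static class DnsAwareClientSketch
{
    // Assumes .NET Core 2.1 or later, where SocketsHttpHandler is available.
    public static HttpClient Create()
    {
        var handler = new SocketsHttpHandler
        {
            // Pooled connections are retired once they reach this age, so a
            // later request opens a fresh connection and resolves the host
            // name again. The two-minute value is only an example.
            PooledConnectionLifetime = TimeSpan.FromMinutes(2)
        };

        return new HttpClient(handler);
    }
}
```

This addresses only the DNS concern. Certificates and validation callbacks are still fixed per handler.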

Timo
  • I have used two HttpClients when scraping webpages. When the main webpage has links to other webpages, it is much more efficient to keep one client on the main page and use a second client to scrape the links. – jdweng Aug 16 '18 at 16:29
  • Message handlers define a pipeline, so you could potentially define logic in a delegating handler to construct the handler you need - in the handler code itself or by short-circuiting the pipeline before unwanted handlers have a chance to attach. Probably also more trouble than it's worth, but something to consider. – Crowcoder Aug 16 '18 at 17:10
  • @Crowcoder I'm not sure what you mean. Did you realize that the handler must be passed to the `HttpClient` constructor? I.e. a new handler requires a new `HttpClient` instance. – Timo Aug 17 '18 at 08:03
  • That's not quite accurate. You can pass several handlers to it and any of them can be delegating handlers where you define the logic which means you could conditionally build the Pipeline at runtime. – Crowcoder Aug 17 '18 at 09:41
  • @Crowcoder Passing several handlers would be very interesting. Can you demonstrate how? – Timo Aug 17 '18 at 14:36
  • @Timo [here's some info on client-side handlers](https://learn.microsoft.com/en-us/aspnet/web-api/overview/advanced/httpclient-message-handlers). The [server-side](https://learn.microsoft.com/en-us/aspnet/web-api/overview/advanced/http-message-handlers) is also applicable and has some more detail. – Crowcoder Aug 17 '18 at 14:49
  • Can we think about `HttpClient` as if it were an `SqlConnection`? These are the same thing from the perspective of your problem. So the solution should be the same - some kind of pooling. – poul_ko Aug 21 '18 at 15:33
  • @poul_ko ADO.NET's pooling of SQL connections is excellently done: **you simply instantiate the `DbConnection` within a `using` statement, and the pooling is done under the hood**. It is a joy to work with. The whole problem is that `HttpClient` can _not_ be used in this way, but instead wants you to manage a long-lived instance yourself - along with its own set of problems because of its state. It would be a _dream_ if `HttpClient` worked like `DbConnection`. – Timo Aug 22 '18 at 11:22
  • @Timo, DbConnection and its pooling show the way we should implement our own mechanism for HttpClient. Since there is no ready-made HttpClient pooling, maybe craft your own and share? – poul_ko Aug 23 '18 at 12:08
  • @poul_ko Yes, the dream :) For now we'll have to make do with the one I have shared in my answer below, https://stackoverflow.com/a/51968118/543814. – Timo Aug 23 '18 at 13:35

4 Answers

One option is to use the old HttpWebRequest. It has fitting "one per request" semantics and cannot even be used as a singleton.

Sadly, it does not support HTTP/2, and may fall further behind in the future.

A small compensation for the lack of HTTP/2 support is the fact that HTTP/2's advantages have a much more significant impact on typical browser use cases than on small API calls (small, few headers, no external resources).
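
A minimal sketch of the "one per request" usage, assuming a simple GET that returns text. The class and method names are hypothetical. Note that newer .NET versions mark WebRequest.Create as obsolete, so this is mainly relevant to older targets.

```csharp
using System.IO;
using System.Net;
using System.Threading.Tasks;

public static class WebRequestSketch
{
    public static async Task<string> GetAsync(string url)
    {
        // A new request object is created for every call.
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";

        // Per-request settings would go here, for example
        // request.ClientCertificates or request.ServerCertificateValidationCallback.

        using (var response = (HttpWebResponse)await request.GetResponseAsync())
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream))
        {
            return await reader.ReadToEndAsync();
        }
    }
}
```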

Timo

I don't quite think that the HttpClient, and the fact that it is not stateless, really has an impact on the domain per se. I would argue against having those types of interactions in the domain :)

That being said it may be an option to abstract those details away in any event. If you need a distinct singleton HttpClient I would go with an IHttpClientProvider.Get({the kind I want}) implementation that returns the relevant client after instantiating it, if necessary. Once instantiated it sticks around as a singleton and would be disposed of when the application terminates.

In such a scenario you should be able to handle your testing using some mocks. You may even go so far as to abstract the HttpClient with a relevant interface and return that interface from the IHttpClientProvider.Get() method.
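
A rough sketch of what such a provider could look like. The HttpClientKind enum, its members, and the HttpClientProvider class are hypothetical names invented for this example. Only IHttpClientProvider.Get comes from the text above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net.Http;

// Hypothetical set of "kinds". A real application would define its own.
public enum HttpClientKind { Default, PartnerApi }

public interface IHttpClientProvider
{
    HttpClient Get(HttpClientKind kind);
}

public sealed class HttpClientProvider : IHttpClientProvider, IDisposable
{
    // Lazy<T> ensures that each kind is instantiated at most once,
    // even when several threads ask for it at the same time.
    private readonly ConcurrentDictionary<HttpClientKind, Lazy<HttpClient>> clients =
        new ConcurrentDictionary<HttpClientKind, Lazy<HttpClient>>();

    public HttpClient Get(HttpClientKind kind)
    {
        return this.clients.GetOrAdd(kind, k => new Lazy<HttpClient>(() => Create(k))).Value;
    }

    private static HttpClient Create(HttpClientKind kind)
    {
        var handler = new HttpClientHandler();

        switch (kind)
        {
            case HttpClientKind.PartnerApi:
                // Kind-specific setup (client certificates, validation
                // callback, and so on) would be configured on the handler here.
                break;
        }

        return new HttpClient(handler);
    }

    // Intended to be called once, when the application terminates.
    public void Dispose()
    {
        foreach (var entry in this.clients.Values)
        {
            if (entry.IsValueCreated)
            {
                entry.Value.Dispose();
            }
        }
    }
}
```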

update: I did a video on this if you'd like to take a look.

Eben Roux
  • I like this idea. The question then becomes: How do we identify the distinct kinds? So far, the unique combination that we have is { Protocol, Host, Port, ClientCertificates, _ServerCertificateValidationCallback_ }. The first four have fairly trivial equality checks. How about the last one? If two callers use the same method, it's the same reference. If two callers use identical lambdas, they may use different instances, but that's acceptable. **What if a caller uses a lambda that captures variables?** My guess is that such ServerCertificateValidationCallbacks will be considered unequal... – Timo Aug 17 '18 at 08:14
  • I forgot Timeout in my previous comment. That is a bummer. Different timeouts leading to different instances is not ideal. Still, an instance per incoming reference could be acceptable, if we can get the lambda-with-captured-variables to work right somehow. – Timo Aug 17 '18 at 08:16
  • Most problems can be solved with another layer of indirection :) --- perhaps the *ServerCertificateValidationCallback* can be implemented to call into some `IServerCertificateValidationCallbackProvider.Get({the kind I want})`. I know this gets tricky but I don't quite see another way other than having something that truly is common across your use-cases. The `timeout` seems like something that could be common; otherwise there appears to be *some* guidance around that in the `HttpClient.Timeout` documentation. To pick a "*kind*" I may go with a named type of sorts (enum?) rather than compute it. – Eben Roux Aug 17 '18 at 09:44
  • I suppose we can solve the callback issue, if we use this layer of indirection to rob the client of the ability to use a lambda (instead forcing them to retrieve any required resources, like a pinned certificate or fingerprint, from the callback provider). It is not ideal, but it could work. – Timo Aug 17 '18 at 14:33

I just found this:

Let IHttpClientFactory manage HttpClient instances.

According to this blog post, it was created for precisely our dilemma. I have yet to discover how it handles client certificates and certificate validation callbacks.
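
For what it is worth, a sketch of how a named client can be given a custom primary handler, assuming the Microsoft.Extensions.Http and Microsoft.Extensions.DependencyInjection packages. The client name "with-certificate" and the surrounding class are hypothetical.

```csharp
using System;
using System.Net.Http;
using System.Security.Cryptography.X509Certificates;
using Microsoft.Extensions.DependencyInjection;

public static class HttpClientFactorySketch
{
    public static IHttpClientFactory Build(X509Certificate2 clientCertificate)
    {
        var services = new ServiceCollection();

        services.AddHttpClient("with-certificate") // hypothetical client name
            .ConfigurePrimaryHttpMessageHandler(() =>
            {
                // The factory calls this whenever it needs a fresh handler.
                var handler = new HttpClientHandler();
                handler.ClientCertificates.Add(clientCertificate);
                return handler;
            })
            // How long a handler is reused before being replaced.
            .SetHandlerLifetime(TimeSpan.FromMinutes(2));

        return services.BuildServiceProvider().GetRequiredService<IHttpClientFactory>();
    }
}
```

A caller would then request `factory.CreateClient("with-certificate")` for each use and leave handler reuse to the factory.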

Timo

Based on Eben Roux's answer and the comments there, as well as Microsoft's IHttpClientFactory, we can create a class that combines good caching, very easy default usage, and fairly easy customization.

As it turns out, it is the HttpMessageHandler (rather than the HttpClient) that ought to be cached correctly: we should reuse it as much as possible, but for no more than a few minutes (Microsoft's IHttpClientFactory implementation seems to use 2 minutes), because of potential DNS updates.

First, we allow the client to specify a "purpose", combining a custom HttpMessageHandler factory with a unique name. Wherever the client wants to use a custom HttpMessageHandler (such as for client certificates), they express it as a purpose. Purposes with the same name are considered equal and interchangeable, allowing us to do caching.

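using System;
using System.Net.Http;

/// <summary>
/// Pairs a unique name with a factory for the HttpMessageHandler to use. Purposes with the same name (ignoring case) are considered equal.
/// </summary>
public class HttpClientPurpose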
{
    public override string ToString() => $"{{{this.GetType().Name} {this.UniqueName}}}";
    public override bool Equals(object other) => other is HttpClientPurpose typedOther && String.Equals(this.UniqueName, typedOther.UniqueName, StringComparison.OrdinalIgnoreCase);
    public override int GetHashCode() => StringComparer.OrdinalIgnoreCase.GetHashCode(this.UniqueName);

    internal static HttpClientPurpose GenericPurpose { get; } = new HttpClientPurpose("InternalGenericPurpose", () => new HttpClientHandler());

    public string UniqueName { get; }
    public Func<HttpMessageHandler> MessageHandlerFactory { get; }

    public HttpClientPurpose(string uniqueName, Func<HttpMessageHandler> messageHandlerFactory)
    {
        this.UniqueName = uniqueName ?? throw new ArgumentNullException(nameof(uniqueName));
        this.MessageHandlerFactory = messageHandlerFactory ?? throw new ArgumentNullException(nameof(messageHandlerFactory));
    }
}

The client then uses an implementation of our custom INetHttpClientFactory (avoiding a naming conflict with the one from Microsoft) to get an instance according to their wishes.

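using System;
using System.Net.Http;
using Microsoft.Extensions.Caching.Memory; // NuGet package: Microsoft.Extensions.Caching.Memory

/// <summary>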
/// Provides instances of the System.Net.Http.HttpClient.
/// </summary>
public interface INetHttpClientFactory
{
    /// <summary>
    /// <para>
    /// Returns a generic HttpClient, with no exotic options like client certificates or custom server certificate validation.
    /// </para>
    /// <para>
    /// Returns a client that is valid for at least two minutes. Behavior is undefined if it is used beyond that time.
    /// </para>
    /// </summary>
    HttpClient CreateClient();
}

/// <summary>
/// <para>
/// Caches HttpClients or their handlers by purpose, as resource-efficiently as possible, while still allowing fairly easy customization, such as client certificates or server certificate validation.
/// </para>
/// </summary>
public interface IPurposeCachedHttpClientFactory : INetHttpClientFactory
{
    HttpClient CreateClient(HttpClientPurpose purpose);

    HttpMessageHandler CreateHandler(HttpClientPurpose purpose);
}

public class PurposeCachedHttpClientFactory : IPurposeCachedHttpClientFactory
{
    private static IMemoryCache CachedClients { get; } = new MemoryCache(new MemoryCacheOptions());
    private static IMemoryCache ExpiredClients { get; } = new MemoryCache(new MemoryCacheOptions());

    private static readonly TimeSpan ClientLifetime = TimeSpan.FromSeconds(240); // Match the time TCP connections are kept open, for symmetry if nothing else (must not be less than two minutes for reasonable use)

    /// <summary>
    /// <para>
    /// Returns a generic HttpClient, with no exotic options like client certificates or custom server certificate validation.
    /// </para>
    /// <para>
    /// Returns a cached client that is valid for at least two minutes. Behavior is undefined if it is used beyond that time.
    /// </para>
    /// </summary>
    public HttpClient CreateClient()
    {
        return this.CreateClient(HttpClientPurpose.GenericPurpose);
    }

    /// <summary>
    /// <para>
    /// Returns a customized HttpClient, whose message handler is determined by its purpose.
    /// </para>
    /// <para>
    /// Returns a cached client that is valid for at least two minutes. Behavior is undefined if it is used beyond that time.
    /// </para>
    /// </summary>
    public HttpClient CreateClient(HttpClientPurpose purpose)
    {
        var messageHandler = this.CreateHandler(purpose);
        return new HttpClient(messageHandler, disposeHandler: false); // Essential to NOT dispose the handler when disposing the client
    }

    /// <summary>
    /// <para>
    /// Returns a cached HttpMessageHandler determined by the purpose.
    /// </para>
    /// <para>
    /// It is recommended to use the CreateClient() method, unless direct access to the handler is needed.
    /// </para>
    /// <para>
    /// Returns a cached handler that is valid for at least two minutes. Behavior is undefined if it is used beyond that time.
    /// </para>
    /// </summary>
    public HttpMessageHandler CreateHandler(HttpClientPurpose purpose)
    {
        var messageHandler = this.CreateMessageHandler(purpose.UniqueName, purpose.MessageHandlerFactory);
        return messageHandler;
    }

    private HttpMessageHandler CreateMessageHandler(string uniqueName, Func<HttpMessageHandler> messageHandlerFactory)
    {
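        // Try to use a cached instance (the key is normalized so that it matches the case-insensitive equality of HttpClientPurpose)
        return CachedClients.GetOrCreate(key: uniqueName.ToUpperInvariant(), factory: cacheEntry =>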
        {
            cacheEntry.AbsoluteExpirationRelativeToNow = ClientLifetime;
            cacheEntry.RegisterPostEvictionCallback(DidEvictActiveClient);
            return messageHandlerFactory();
        });
    }

    /// <summary>
    /// Schedules expired clients to be disposed (via the cache of expired items) after they are evicted from the cache of active clients.
    /// </summary>
    private static void DidEvictActiveClient(object key, object value, EvictionReason reason, object state)
    {
        // Schedule it to be disposed
        ExpiredClients.GetOrCreate(key: value, factory: cacheEntry =>
        {
            cacheEntry.Priority = CacheItemPriority.NeverRemove;

            // Eventually dispose it
            cacheEntry.AbsoluteExpirationRelativeToNow = ClientLifetime;
            cacheEntry.RegisterPostEvictionCallback(DidEvictExpiredClient);

            System.Diagnostics.Debug.Assert(cacheEntry.Key != null);
            return cacheEntry.Key;
        });
    }

    /// <summary>
    /// Disposes expired clients after they are evicted from the cache of expired items (i.e. after a delay).
    /// </summary>
    private static void DidEvictExpiredClient(object key, object value, EvictionReason reason, object state)
    {
        // TODO: Put this in a try/catch after confirming that it works without exceptions for some time
        ((HttpMessageHandler)value).Dispose();
    }
}
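
A usage sketch, assuming the HttpClientPurpose and PurposeCachedHttpClientFactory types above are compiled into the same project. The PartnerApiGateway class, the purpose name, the URL, and the timeout are all hypothetical.

```csharp
using System;
using System.Net.Http;
using System.Security.Cryptography.X509Certificates;
using System.Threading.Tasks;

// Hypothetical consumer of the factory defined above.
public class PartnerApiGateway
{
    private readonly IPurposeCachedHttpClientFactory factory;
    private readonly HttpClientPurpose purpose;

    public PartnerApiGateway(IPurposeCachedHttpClientFactory factory, X509Certificate2 clientCertificate)
    {
        this.factory = factory ?? throw new ArgumentNullException(nameof(factory));

        // The name identifies the cached handler. The factory delegate only
        // runs when no live handler is cached under that name.
        this.purpose = new HttpClientPurpose("PartnerApiWithClientCertificate", () =>
        {
            var handler = new HttpClientHandler();
            handler.ClientCertificates.Add(clientCertificate);
            return handler;
        });
    }

    public async Task<string> GetStatusAsync()
    {
        // The client is cheap and short-lived. Disposing it leaves the cached handler alive.
        using (var client = this.factory.CreateClient(this.purpose))
        {
            client.Timeout = TimeSpan.FromSeconds(10); // per-use customization
            return await client.GetStringAsync("https://partner.example.com/status");
        }
    }
}
```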

Points of interest:

  • If the client needs no customization, they need not concern themselves with the HttpClientPurpose at all.
  • The HttpClient is always new (and may be disposed by the client). It can be customized for their single usage, such as with a custom timeout.
  • We keep a handler in the active cache for 4 minutes.
  • After a handler expires, we keep it in the 'expired' cache for 4 minutes. After that, we dispose it. If a client was using it for longer, its activities may be interrupted. Use cases that take that long will have to use a different approach. This could be avoided by using a more complex cleanup approach, such as Microsoft has done with their implementation of IHttpClientFactory, but that was beyond our scope here.
  • As far as I'm aware, Microsoft's implementation requires the use of configuration to get handlers with custom properties. We have avoided that altogether, using the simple HttpClientPurpose class. It can be used from wherever suits the client code.
  • The documentation states a limit of 2 minutes. The implementation currently uses 4 minutes, but is free to change as long as it satisfies the documentation.
Timo