24

We've noticed that from time to time we get an HTTP request without a valid User-Agent string. Is there any valid real-world case for accepting this type of HTTP request?

Why wouldn't we automatically block all IPs from which this type of request is received?

UPDATE: My intention with the phrase "real-world" was to indicate that I am not asking what the HTTP protocol permits. It is permitted to submit HTTP requests without some headers. I am asking what "real-world" case you would have for allowing this type of HTTP request into your server.

Jay

4 Answers

30

As stated in RFC 7231 (nearly the same paragraph can be found in RFC 2616):

5.5.3 User-Agent

The "User-Agent" header field contains information about the user agent originating the request, which is often used by servers to help identify the scope of reported interoperability problems, to work around or tailor responses to avoid particular user agent limitations, and for analytics regarding browser or operating system use. A user agent SHOULD send a User-Agent field in each request unless specifically configured not to do so.

The keyword here is SHOULD. And yes, there's an RFC that defines what that word is supposed to mean, RFC 2119:

  3. SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

So, although agents that do not send a User-Agent header are not following what can be considered best practice, they are not violating any rule in the RFC. In my opinion, there is therefore no valid technical reason to block them.
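To illustrate what treating the header as a SHOULD rather than a MUST could look like on the server side, here is a minimal sketch in Python. The names `get_user_agent` and `handle_request` are hypothetical and not tied to any framework; the only assumption is that the request headers are available as a mapping of names to values.

```python
from typing import Mapping, Optional


def get_user_agent(headers: Mapping[str, str]) -> Optional[str]:
    """Return the User-Agent value, or None if it is absent or blank.

    Header names are case-insensitive, so they are compared lowercased.
    """
    for name, value in headers.items():
        if name.lower() == "user-agent":
            value = value.strip()
            return value or None
    return None


def handle_request(headers: Mapping[str, str]) -> int:
    """Hypothetical handler: return an HTTP status code for the request."""
    user_agent = get_user_agent(headers)
    if user_agent is None:
        # Sending User-Agent is a SHOULD, not a MUST, so its absence is
        # noted for later analysis but is not treated as an error.
        print("request without User-Agent accepted")
    return 200
```

The missing header is recorded instead of rejected, so you can still look for patterns later without turning away compliant clients.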

fvu
  • Unfortunately, there are some webservers that will reject a request if the `User-Agent` is unknown/unrecognized, which can be annoying. Then you have to provide a custom `User-Agent` that mimics a well-known value that the webserver will accept. – Remy Lebeau Jun 17 '14 at 23:24
  • So your "real world" case where you might want to accept/trust HTTP requests without a User-Agent header is "you might want to allow an obscure bot to use your web site". – Jay Jun 18 '14 at 00:38
  • @JulianReschke Thank you for pointing to RFC7231, nice piece of work - I hope you don't mind that I did reintroduce a link to RFC2616 as today it's the one that is the norm. – fvu Jun 18 '14 at 09:14
  • fvu: RFC 2616 is obsolete. There's really no point in referencing it anymore, except for historical purposes. – Julian Reschke Jun 18 '14 at 12:03
  • @Jacob Let me generalize that like this: as de facto blocking these requests would go against [Postel's law](http://en.wikipedia.org/wiki/Robustness_principle) you should accept them unless you detect a strong correlation between types of activity or visitors you don't want on your server. – fvu Jun 19 '14 at 08:06
19

I guess people mostly send HTTP requests without a User-Agent when they are using an API to perform the request.
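As one illustration of a request made through an API rather than a browser, here is a minimal sketch using only Python's standard library. `http.client` adds `Host` and `Accept-Encoding` on its own but no `User-Agent`, so a throwaway local server (the hypothetical `RecordingHandler`) sees the header as absent. The loopback address and the helper name `fetch_without_user_agent` are assumptions made for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_user_agents = []  # what the server observed, one entry per request


class RecordingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.headers.get returns None when the header is absent.
        seen_user_agents.append(self.headers.get("User-Agent"))
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_without_user_agent() -> int:
    """Start a local server, send one GET via http.client, return the status."""
    server = HTTPServer(("127.0.0.1", 0), RecordingHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")  # no User-Agent header is added here
        response = conn.getresponse()
        response.read()
        conn.close()
        return response.status
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_without_user_agent()` leaves a single `None` entry in `seen_user_agents`, which is how such an API client would appear to a server that inspects the header.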

Alwin
  • Some bots (e.g. for less-than-mainstream search engines) may not set the user agent string, for example. Whether you want to accept them or not is entirely up to you. Normal user requests will almost always have a valid User-Agent (though I have seen people e.g. hack their iPod user agent to something like `Joe's iPod`) and they could also blank it out. – Eric J. Jun 17 '14 at 23:19
  • The `User-Agent` header is optional in RFC 2616. It *SHOULD* be used by clients, but it is not *REQUIRED* to be used. All of the major third-party browsers/clients use it, but custom apps/APIs/bots/etc might not. There are webservers that alter responses, or even reject requests (which is annoying), based on the `User-Agent` provided (or lack of one). – Remy Lebeau Jun 17 '14 at 23:21
  • So a "real world" case might be when you are providing an API, and you may not want to require the API users to include a User Agent string. – Jay Jun 18 '14 at 00:37
  • Yes, developers rarely make the User-Agent a mandatory field in the HTTP request while developing an API (unless they have a specific use case). – Alwin Jun 18 '14 at 16:40
  • Btw, doesn't adding a user agent to every request increase the request size unnecessarily? – sktguha Aug 28 '20 at 16:41
  • @sktguha It likely increases the request size. However, in the context of the entire https request, the size of the user-agent header is negligible compared to, say, the overhead involved in a [TLS session](http://netsekure.org/2010/03/tls-overhead/). Additionally, [hpack](https://httpwg.org/specs/rfc7541.html) header encoding in HTTP/2 significantly reduces header sizes; for example, this [Cloudflare blog post](https://blog.cloudflare.com/hpack-the-silent-killer-feature-of-http-2/) demonstrates a reduction from 430 bytes to 4 bytes for the cookie and user-agent headers. 1/n – nishanthshanmugham Feb 07 '23 at 18:03
  • [continued] Considering the above, and the [RFC 7231](https://www.rfc-editor.org/rfc/rfc7231#section-5.5.3) "should" recommendation to send a user-agent header, it is generally a good idea to always include the user-agent header in requests. 2/2 – nishanthshanmugham Feb 07 '23 at 18:56
0

From my personal experience

A request looks like this in the Apache log if no user agent is set:

xx.xxx.xxx.xxx - - [10/Sep/2021:07:31:16 +0200] "GET / HTTP/1.0" 200 25485 "-" "-"

This one in particular was malicious.

I do not recommend blocking automatically, but I do recommend staying attentive to this type of request.
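One way to stay attentive without blocking is to scan the access log for such entries. The following is a rough sketch that assumes the log uses Apache's combined format, as in the line above. The regular expression and the function name `has_no_user_agent` are illustrative, and the pattern deliberately ignores escaped quotes inside fields.

```python
import re

# Combined Log Format:
# host ident user [time] "request" status size "referer" "user-agent"
COMBINED_LOG = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"$'
)


def has_no_user_agent(line: str) -> bool:
    """True if the line parses and its user-agent field is "-" or empty."""
    match = COMBINED_LOG.match(line.strip())
    if match is None:
        return False  # not a combined-format line; ignore it
    return match.group("user_agent") in ("-", "")
```

Feeding each log line through `has_no_user_agent` gives a list of requests to review by hand. Note that a client that literally sends `-` as its user agent is indistinguishable from one that sends nothing.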

Typewar
user2267379
-2

From my experience, there is no legitimate use case, at least no common one.

From my server logs, all requests without a user agent are malicious. All come from bots.

In theory, someone who "really cares about privacy" could do this. It depends on what service you're providing. I run a couple of websites. People not using browsers are not my clients.

While it's true that it's trivial for a developer of malicious bots to add a fake user agent (Chrome's, for example), that doesn't mean a request without one is legitimate in intent.

Jiulin Teng
  • I have built a web app for booking people onto courses. I am logging the user agent so I can decide which browsers to continue to support. Problem is that about 5% of users are not sending back a ua string. But they are legitimate users because they make bookings and pay for them. Any ideas what browser might return empty ua string? I am using `var ua = Request.Headers["User-Agent"];` – Norbert Norbertson Mar 28 '22 at 10:44