55

I monitor 404s on my sites closely, which helps me detect broken links and hacking attempts, but I've recently been getting log spam from browsers with these strings in the User-Agent header. They seem to be trying to scan parent directories of valid resources, but directories have special meaning on my sites because of SEO rewriting.

Before I decide what to do about it, I'd like to know what these UAs are trying to do and why. If it's just "noise", I'd be happy to drop the connection entirely; if they do something useful, I could provide an appropriate response instead.

I believe some of the requests are from my clients so I can't do anything too disruptive, as much as I'd like to.

Kara
SpliFF
  • 1
    This is an interesting question but it seems to me it might be better suited for ServerFault. – JYelton Nov 10 '11 at 01:25
  • 8
    Well the website code was written entirely by me. Any solution is going to be in the form of code. I wouldn't see this as a server administration issue. – SpliFF Nov 10 '11 at 01:27

5 Answers

29

Microsoft has a KB article (link currently broken, Internet Archive snapshot) that covers Protocol Discovery in fine detail. Essentially, Office is trying to determine whether your server supports WebDAV (or something like it) so that changes the user makes to the Office document can be pushed back directly to the server.
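If the site routes every request through application rewrite rules, one option is to keep OPTIONS requests out of the application so that Apache answers them itself. The following is only a sketch under assumptions: `index.php` is a hypothetical front controller standing in for whatever the real rewrite target is, and the file-exists check is an assumed part of the existing rules.

```apache
RewriteEngine On

# Assumption: "index.php" stands in for the application's front controller.
# Leave OPTIONS requests alone so Apache handles them instead of the application.
RewriteCond %{REQUEST_METHOD} !^OPTIONS$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^ index.php [L]
```

This only affects OPTIONS; discovery traffic that uses other methods still reaches the application.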

bmm6o
  • 1
    Thanks for the link. It seems the issue is that my rewrite rules pass OPTIONS requests to my application when they would be better handled by Apache. That solves the Office Protocol Discovery issue; however, the OfficeLiveConnector is doing GET requests, so maybe it's not the same thing. – SpliFF Nov 10 '11 at 01:40
  • 1
    I read the link in full but I'm still unclear whether requests from the UA "Microsoft Office Protocol Discovery" indicate the user is browsing with IE or an Office application like Word. Is IE an "Office Application"? – SpliFF Nov 10 '11 at 01:44
  • 5
    In my experience, they have downloaded the document with a browser (not necessarily IE) and the browser launches the registered handler for that document type. It's then Office itself that's making the additional queries. You should see this reflected in your logs. – bmm6o Nov 10 '11 at 16:35
  • the link now seems down – FGRibreau Nov 11 '19 at 15:50
12

On servers I have to maintain, this seems to be caused by HTML e-mail that uses external images hosted on our servers.

It looks like Microsoft Office Outlook clients, which use Microsoft Word for editing e-mail (and, since the 2007 edition, for viewing it), trigger those "Microsoft Office Protocol Discovery" requests.

In my case, with web sites that have no kind of online contribution, I see that as annoying noise. If your site is some kind of sharing site with document-editing capabilities, you may not consider those requests noise, depending on how your site is implemented.
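If the requests really are just noise for a given site, one way to keep them out of the access log is to tag them by User-Agent and exclude tagged requests from logging. This is a minimal sketch, not a tested setup: the environment variable name `office_discovery` is made up for illustration, and the log path and the `combined` format nickname are assumptions about the server's existing configuration.

```apache
# Tag discovery requests by User-Agent (prefixes are the ones quoted in this thread).
SetEnvIfNoCase User-Agent "^Microsoft Office (Protocol|Existence) Discovery" office_discovery
SetEnvIfNoCase User-Agent "^Microsoft-WebDAV-MiniRedir" office_discovery

# Assumptions: the log path and the "combined" format nickname match your setup.
CustomLog "logs/access_log" combined env=!office_discovery
```

This hides the requests from that log only; the server still answers them as before.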

Frédéric
6

This worked for me:

# Intercept Microsoft Office Protocol Discovery
RewriteCond %{REQUEST_METHOD} ^(OPTIONS|PROPFIND)$ [NC]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ Office\ Protocol\ Discovery [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft\ Office\ Existence\ Discovery [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft\-WebDAV\-MiniRedir.*$
RewriteRule .* - [R=501,L]
jakobdo
6

I assume your site serves the occasional Office document; that is normally the root cause of this issue. You can probably avoid the calls by telling Office not to bother finding out whether saves are possible.

This can be done by amending the Content-Disposition header sent with the served Office document. I had this problem when it was set to:

Content-Disposition: inline; filename="<my file name>"

Changing it to attachment avoided the calls:

Content-Disposition: attachment; filename="<my file name>"
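On Apache, a header like this could be set for static Office files with mod_headers. The sketch below assumes the documents are served as plain files, that mod_headers is enabled, and that the extension list (chosen here for illustration) covers the file types in question.

```apache
<IfModule mod_headers.c>
    # Assumption: Office documents are static files with these extensions.
    <FilesMatch "\.(docx?|xlsx?|pptx?)$">
        # Without a filename parameter, the download name is left to the client.
        Header set Content-Disposition "attachment"
    </FilesMatch>
</IfModule>
```

If the documents are generated by application code, the header would have to be set there instead.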
Tom Parsons
5

I host a regular, non-WebDAV, non-SharePoint website. Because of Microsoft Office Protocol Discovery, I see many errors from failed requests in my log files. To eliminate these, I recommend disabling WebDAV-style requests by disabling the HTTP methods they use, beginning with OPTIONS. This also improves security by closing off attacks that rely on those methods.

I am using Apache 2.4 and recommend the following in httpd.conf:

<Location />
    # block HTTP methods: OPTIONS PUT DELETE TRACE CONNECT PATCH
    AllowMethods GET POST HEAD
</Location>

See Apache 2.4 Reference
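If part of the site still needs some of these methods (for example a REST API, or CORS preflight requests that use OPTIONS), the restriction could be relaxed for that part only. This is a sketch under assumptions: `/api` is a hypothetical path, and the method list for it is illustrative rather than a recommendation.

```apache
# Site-wide default: only plain browsing methods.
<Location />
    AllowMethods GET POST HEAD
</Location>

# Assumption: "/api" is a hypothetical prefix that needs more methods.
# This later, more specific section overrides the site-wide setting.
<Location /api>
    AllowMethods GET POST HEAD OPTIONS PUT DELETE
</Location>
```

The order matters here, since later Location sections take precedence over earlier ones.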

Steve Jones
    Beware, those verbs have uses other than WebDAV. A REST API requires most of them. [CORS](http://www.w3.org/TR/cors/) requires the `OPTIONS` verb. (A site supplying content to pages of other sites through Ajax needs to support CORS.) – Frédéric Sep 25 '15 at 19:58