
With Apache httpd 2.2, it was possible to set up a reverse proxy and use mod_deflate to compress proxied content, honoring Accept-Encoding: gzip headers.

This configuration was sufficient for getting it to work:

    LoadModule deflate_module modules/mod_deflate.so
    LoadModule filter_module modules/mod_filter.so
    SetOutputFilter DEFLATE

    LoadModule proxy_module modules/mod_proxy.so
    LoadModule proxy_http_module modules/mod_proxy_http.so
    ProxyRequests Off
    ProxyPass        /tomcat http://localhost:8880/
    ProxyPassReverse /tomcat http://localhost:8880/
    ProxyPass        /other  http://localhost:8001/
    ProxyPassReverse /other  http://localhost:8001/

After upgrading to 2.4 (2.4.29 on Windows), the same configuration is accepted, and it does compress static content served from DocumentRoot. However, the same content is returned uncompressed when it is retrieved via ProxyPass.

I know that I can configure Tomcat to do the compression, but there is also this other server that just ignores Accept-Encoding headers.

How can I set up a reverse proxy, and have proxied content compressed?

Edit:

Here are the headers returned, demonstrating that proxied content is not compressed by the 2.4 server:

----- Retrieving uncompressed from DocumentRoot ---------------------------------

C:\Temp>curl -I http://localhost/test.txt 
HTTP/1.1 200 OK
Date: Tue, 09 Jan 2018 17:11:59 GMT
Server: Apache/2.4.29 (Win64) OpenSSL/1.1.0g
Last-Modified: Fri, 05 Jan 2018 12:58:40 GMT
ETag: "75441-5620701eb471c"
Accept-Ranges: bytes
Content-Length: 480321
Vary: Accept-Encoding
Content-Type: text/plain

----- The same from Tomcat ------------------------------------------------------

C:\Temp>curl -I http://localhost:8880/rr/test.txt 
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Accept-Ranges: bytes
ETag: W/"480321-1515157120042"
Last-Modified: Fri, 05 Jan 2018 12:58:40 GMT
Content-Type: text/plain
Content-Length: 480321
Date: Tue, 09 Jan 2018 17:11:59 GMT

----- 2.4.29: Retrieving compressed from DocumentRoot ---------------------------

C:\Temp>curl -I -H "Accept-Encoding: gzip" http://localhost/test.txt 
HTTP/1.1 200 OK
Date: Tue, 09 Jan 2018 17:11:59 GMT
Server: Apache/2.4.29 (Win64) OpenSSL/1.1.0g
Last-Modified: Fri, 05 Jan 2018 12:58:40 GMT
ETag: "75441-5620701eb471c-gzip"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 48265
Content-Type: text/plain

----- 2.4.29: Not getting any compression for proxied Tomcat content ------------

C:\Temp>curl -I -H "Accept-Encoding: gzip" http://localhost/tomcat/rr/test.txt 
HTTP/1.1 200 OK
Date: Tue, 09 Jan 2018 17:11:59 GMT
Server: Apache-Coyote/1.1
Accept-Ranges: bytes
ETag: W/"480321-1515157120042"
Last-Modified: Fri, 05 Jan 2018 12:58:40 GMT
Content-Type: text/plain
Content-Length: 480321

----- 2.2.14: Retrieving compressed from DocumentRoot ---------------------------

C:\Temp>curl -I -H "Accept-Encoding: gzip" http://localhost:81/test.txt 
HTTP/1.1 200 OK
Date: Tue, 09 Jan 2018 17:11:59 GMT
Server: Apache/2.2.14 (Win32)
Last-Modified: Fri, 05 Jan 2018 12:58:40 GMT
ETag: "90000000e7463-75441-5620701eb471c"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 48265
Content-Type: text/plain

----- 2.2.14: Proxied Tomcat content comes compressed ---------------------------

C:\Temp>curl -I -H "Accept-Encoding: gzip" http://localhost:81/tomcat/rr/test.txt 
HTTP/1.1 200 OK
Date: Tue, 09 Jan 2018 17:11:59 GMT
Server: Apache-Coyote/1.1
Accept-Ranges: bytes
ETag: W/"480321-1515157120042"
Last-Modified: Fri, 05 Jan 2018 12:58:40 GMT
Content-Type: text/plain
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 20

All of this was tested on a plain 2.4.29 installation downloaded from ApacheHaus. The above configuration has been added to httpd.conf, nothing else has been changed. The same applies to the 2.2.14 installation (downloaded in 2009 from Apache), but that one was additionally changed to port 81.

Gunther
  • What's the average size of the proxied response? – Oleg Kuralenko Jan 08 '18 at 20:52
  • Can you post larger samples? At least with DocumentRoot included and entire configuration inside VirtualHost (if you use them) and the headers returned by curl -D - 'http://target_url/tomcat' and curl -D - 'http://localhost:8880' – Oleg Kuralenko Jan 09 '18 at 04:57
  • @ffeast The size ranges from several KB up to several MB. Because of limited bandwidth on my end, I am particularly interested in compressing the large ones. – Gunther Jan 09 '18 at 16:19
  • @ffeast For testing just this issue, I have used a fresh installation (with 2.4.29 downloaded from [here](https://www.apachehaus.com/cgi-bin/download.plx?dli=QVWp1TllWWz8EVj9SZFplcJVlUGRVYSVFVGtWN)). The question contains everything that I have modified in the configuration, so there is `DocumentRoot "${SRVROOT}/htdocs"` and no VirtualHost (at this time). Will update the question with response headers per your request. – Gunther Jan 09 '18 at 16:26
  • I've checked your setup on OSX and Ubuntu, and both work fine. It seems to be either a Windows-specific bug or something missing from the description. Can you bring up your configuration somewhere in the cloud on Linux to check whether it works? If the problem reproduces, I might take a look – Oleg Kuralenko Jan 10 '18 at 19:17
  • any updates? Did you try running it on *nix? – Oleg Kuralenko Jan 11 '18 at 18:17
  • @ffeast Well, I ran it on Ubuntu with Apache 2.4.18, but found the behavior to be the same as on Windows - no compression for proxied content, using the above configuration (`LoadModule` replaced by calls to `a2enmod`) – Gunther Jan 12 '18 at 22:07

2 Answers


I’ve managed to reproduce the curl + Apache/Tomcat behavior that you described.

This is how I reproduced it (OS X El Capitan):

Tomcat:

docker run -it --rm -p 8880:8080 tomcat:6.0

Apache:

httpd -v

Server version: Apache/2.4.18 (Unix)
Server built:   Feb 20 2016 20:03:19

httpd -l
Compiled in modules:
  core.c
  mod_so.c
  http_core.c
  prefork.c

Apache config (in its entirety):

Listen 80                                                                       

LoadModule authz_user_module libexec/apache2/mod_authz_user.so                  
LoadModule authz_core_module libexec/apache2/mod_authz_core.so                  
LoadModule access_compat_module libexec/apache2/mod_access_compat.so            
LoadModule filter_module libexec/apache2/mod_filter.so                          
LoadModule deflate_module libexec/apache2/mod_deflate.so                        
LoadModule mime_module libexec/apache2/mod_mime.so                              
LoadModule log_config_module libexec/apache2/mod_log_config.so                  
LoadModule headers_module libexec/apache2/mod_headers.so                        
LoadModule version_module libexec/apache2/mod_version.so                        
LoadModule proxy_module libexec/apache2/mod_proxy.so                            
LoadModule proxy_http_module libexec/apache2/mod_proxy_http.so                  
LoadModule unixd_module libexec/apache2/mod_unixd.so                            

<IfModule unixd_module>                                                         
User _www                                                                       
Group _www                                                                      

</IfModule>                                                                     

<IfModule mime_module>                                                          
    TypesConfig /private/etc/apache2/mime.types                                 
</IfModule>                                                                     

LogLevel debug                                                                  

<IfModule log_config_module>                                                    
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
    LogFormat "%h %l %u %t \"%r\" %>s %b" common                                

    <IfModule logio_module>                                                     
      # You need to enable mod_logio.c to use %I and %O                         
      LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
    </IfModule>                                                                 
    CustomLog "/private/var/log/apache2/access_log" common                      
</IfModule>                                                                     

ErrorLog "/private/var/log/apache2/error_log"                                   
TraceEnable off                                                                 

SetOutputFilter  DEFLATE                                                        

ProxyRequests    Off                                                            
ProxyPass        /tomcat http://localhost:8880/                                 
ProxyPassReverse /tomcat http://localhost:8880/                                 
ProxyPass        /other  http://localhost:8001/                                 
ProxyPassReverse /other  http://localhost:8001/                                 
DocumentRoot    /Library/WebServer/Documents 

Checking:

curl -I -H 'Accept-Encoding: gzip'  'http://localhost/tomcat' 
HTTP/1.1 200 OK
Date: Sat, 13 Jan 2018 13:35:14 GMT
Server: Apache-Coyote/1.1
Accept-Ranges: bytes
ETag: W/"7454-1491118183000"
Last-Modified: Sun, 02 Apr 2017 07:29:43 GMT
Content-Type: text/html
Content-Length: 7454

curl -I -H 'Accept-Encoding: gzip'  'http://localhost/index.html.en' 
HTTP/1.1 200 OK
Date: Sat, 13 Jan 2018 13:35:25 GMT
Server: Apache/2.4.18 (Unix)
Last-Modified: Tue, 09 Jan 2018 04:51:20 GMT
ETag: "45-56250aa712200-gzip"
Accept-Ranges: bytes
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 65
Content-Type: text/html

As you can see, the output very closely matches your example.

And here’s the fun part:

If I use a regular GET request instead of HEAD (via a browser, or curl without -I), Tomcat’s response DOES GET GZIPPED:

curl -D - -H 'Accept-Encoding: gzip'  'http://localhost/tomcat' 2>/dev/null | strings
HTTP/1.1 200 OK
Date: Sat, 13 Jan 2018 13:37:19 GMT
Server: Apache-Coyote/1.1
Accept-Ranges: bytes
ETag: W/"7454-1491118183000-gzip"
Last-Modified: Sun, 02 Apr 2017 07:29:43 GMT
Content-Type: text/html
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 2526
(some junk)

I don’t know why this happens; it looks like Apache with mod_proxy/mod_deflate treats HEAD requests differently. Since you say it was fine in Apache 2.2, my guess is that it is related to this change in mod_deflate:

mod_deflate will now skip compression if it knows that the size overhead added by the compression is larger than the data to be compressed.
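To make that changelog note concrete, here is a small sketch of the size overhead gzip adds to tiny payloads. It uses Python's standard `gzip` module rather than mod_deflate itself, so it only illustrates the framing cost; that mod_deflate applies the same reasoning to the empty body of a HEAD response is an assumption, not something confirmed here. The helper name `gzip_overhead` is made up for this example.

```python
import gzip

def gzip_overhead(payload: bytes) -> int:
    """Bytes added by gzip (positive) or saved by it (negative)."""
    return len(gzip.compress(payload)) - len(payload)

# An empty body still costs a gzip header, an empty deflate block and a
# trailer, so compressing "nothing" makes the response larger.
for body in (b"", b"ok", b"a" * 10_000):
    print(len(body), "bytes raw ->", len(gzip.compress(body)), "bytes gzipped")
```

An empty payload gzips to 20 bytes, which happens to match the `Content-Length: 20` in the 2.2.14 HEAD response for the proxied file in the question. That would be consistent with 2.2 compressing the empty HEAD body and 2.4 skipping it, but treat this as a plausible reading rather than a verified explanation.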

So I’d check whether the problem persists for GET requests in your case. If it does, please provide more details on your setup so that your environment can be replicated exactly; a valid Dockerfile for Apache and Tomcat would help isolate a likely environment discrepancy.
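If curl is not at hand, the GET check can also be scripted. The following is a minimal sketch using Python's standard `http.client`, which does not decompress responses on its own, so the raw body length is what actually went over the wire. The function names `fetch` and `describe` are hypothetical, and the host and path in the usage comment are taken from the examples in this thread.

```python
import gzip
import http.client

def fetch(method, host, path, port=80):
    """Send one request advertising gzip support; return (headers, raw body)."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request(method, path, headers={"Accept-Encoding": "gzip"})
        resp = conn.getresponse()
        body = resp.read()  # empty for HEAD responses
        headers = {name.lower(): value for name, value in resp.getheaders()}
        return headers, body
    finally:
        conn.close()

def describe(headers, body):
    """Summarise whether the body arrived gzip-encoded and how large it is."""
    encoding = headers.get("content-encoding", "identity")
    if encoding == "gzip" and body:
        decoded = len(gzip.decompress(body))
        return f"gzip: {len(body)} bytes on the wire, {decoded} decoded"
    return f"{encoding}: {len(body)} bytes"

# Usage, assuming the proxy from the question is listening on localhost:80:
#   print(describe(*fetch("GET", "localhost", "/tomcat/rr/test.txt")))
```

If the proxied GET response is compressed, the summary starts with `gzip:`; an `identity:` line for a large text file would mean the problem is not limited to HEAD.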

Oleg Kuralenko
  • Great finding. I was indeed relying on the headers returned by a HEAD request, and failed to have a closer look at GET results. – Gunther Jan 15 '18 at 08:53

You are sending an HTTP HEAD request with the -I flag in curl. As the answer from ffeast suggests, this might be the cause of the issue.

If that is indeed the case, then it is either a bug or a conscious disregard of the HTTP RFC: https://www.rfc-editor.org/rfc/rfc2616#section-9.4

9.4 HEAD

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request.

If so, you should report this as a possible bug via this process: https://httpd.apache.org/bug_report.html
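Before filing a report, it may help to show exactly which headers differ between HEAD and GET for the same URL. This is a rough sketch using Python's standard `http.client`; the names `response_headers`, `header_diff` and `IGNORED` are invented for this example, and ignoring only the `Date` header is an assumption about what legitimately varies between two requests.

```python
import http.client

# Headers expected to differ between any two responses (an assumption).
IGNORED = {"date"}

def response_headers(method, host, path, port=80):
    """Return the lower-cased response headers for one request."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request(method, path, headers={"Accept-Encoding": "gzip"})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return {name.lower(): value for name, value in resp.getheaders()}
    finally:
        conn.close()

def header_diff(head, get):
    """Map each differing header name to its (HEAD value, GET value) pair."""
    names = (set(head) | set(get)) - IGNORED
    return {
        name: (head.get(name), get.get(name))
        for name in sorted(names)
        if head.get(name) != get.get(name)
    }

# Usage, assuming the proxy from the question is listening on localhost:80:
#   head = response_headers("HEAD", "localhost", "/tomcat/rr/test.txt")
#   get = response_headers("GET", "localhost", "/tomcat/rr/test.txt")
#   for name, (h, g) in header_diff(head, get).items():
#       print(f"{name}: HEAD={h!r} GET={g!r}")
```

Applied to the headers posted in the other answer, this would list `content-encoding`, `content-length`, `etag` and `vary` as the fields where HEAD and GET disagree.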

Mindaugas Bernatavičius