Background.

I'm tasked with debugging some PHP and JavaScript code designed to pull static, gzip'ed JSON files from the host server, and manipulate the resulting JSON object's parameters.

Apologies in advance for my misuse of terminology. I have some experience with software development, but very little with web/server development (and almost none with PHP/JavaScript).


Code.

To "pull" the .json.gz file from the host server, jQuery's Ajax function is used:

function myJsonReader() {
    var myJsonFileUrl = getUrl();
    
    function doFunStuffWithJsonFile(jsonObject) {
        // Fun things happen
    }

    $.ajax({
        url: myJsonFileUrl,
        type: "GET",
        dataType: "json", 
        success: doFunStuffWithJsonFile,
        error: function(jqxhr, status, exception) {
            console.log("Exception " + exception);
            // alert() takes a single argument, so concatenate the message
            alert("Failure: " + exception);
        }
    }); 

}

Server-side, we have an index.php file, the first few lines of which are:

<?php
  header('Accept-Encoding: gzip');
  header('Content-Encoding: gzip');
?>

<html>
<head>
... some HTML code

The Ajax GET request fails. I know the file exists, and I'm passing the absolute path. Moreover, I'm able to read the compressed file if I use dataType: "text", but that, of course, returns "garbage":

����%^��0��jN�L/��8�?@��x���351�g��+��~���ׯ���/߿�7���^��di9�y;jY��6�$K�=��4��QTMB�^or��M�P��̡�*}��G�t��K!#�8�Z@[d�9�#����R��N�����y��Պ�������;9�T����B+��VM�#��.:�<ĩ�F�PZ��Ȕl[K̔N[� GȡĚ�.5;P6�H��jeͮ��<�e""�h�-!�>��S�E��Q�m�H�.ڌSAyc�S�MsFHF�K]�H)/ry`�6.��&��-ME���s�����GA��@�rJ�.����)��kR�Vi�6��h�K-`��������

If I check the XHR response using:

error: function(xhr, status, error) {
    console.log(xhr.responseText);
}

I also get garbage.

The file is a valid JSON file, readable by Vim with its built-in gunzip capabilities. Moreover, this code was previously able to read these gzipped JSON files without issue using an Ajax call and without external libraries, until some unknown change was presumably made. The codebase is messy and mostly undocumented, so it hasn't been possible to track down that change.

It is not possible to simply unzip all files we wish to read with this function, nor is it possible to make use of external libraries (e.g. zlib), unfortunately.


Several sources suggest this should be relatively straightforward (e.g. this StackOverflow post, and this one). Sadly, it isn't clear to me what I'm missing.


Edit.

After reloading the page, I checked the Network tab in Chrome's Inspector. I see the Fetch/XHR request for my JSON file, the headers for which include the following:

Request Method: GET
Content-Type: application/x-gzip
Accept: application/json, text/javascript, */*; q=0.01
Accept-Encoding: gzip, deflate
Status Code: 200 OK

To my untrained eye, this seems to be properly configured for accepting compressed files.

10GeV
  • So you are trying to read the file in JavaScript? – epascarello Jun 29 '22 at 02:07
  • Yes, I then perform various operations on the JSON file's attributes in JavaScript (e.g. operations on `jsonObject.attributeName`). Still reading through the rest of the code to determine the scope of what is done with these, but I'll need to access the attribute values in JS. – 10GeV Jun 29 '22 at 02:11
  • Having `header('Accept-Encoding: gzip');` in your `index.php` seems strange, since I thought the [Accept-Encoding header was for clients to specify what encoding they can understand](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Encoding). Anyway, did you try adding `headers : {'Accept-Encoding' : 'gzip'},` to your AJAX request, as suggested in the answers you referenced? What was the result? – kmoser Jul 01 '22 at 21:16
  • @kmoser I agree that having `header('Accept-Encoding: gzip')` in `index.php` seems odd, although this is well outside of my wheelhouse. I did try adding `headers : {'Accept-Encoding' : 'gzip' },` to the AJAX request. I'm met with the error `Refused to set unsafe header "Accept-Encoding"`. [This answer](https://stackoverflow.com/a/32525490/14073182) seems to suggest that one should *not* specify this in the AJAX request, and instead specify `Content-Encoding: gzip` in `index.php` (which is already done). – 10GeV Jul 01 '22 at 22:00
  • _"this code was previously able to read these gzipped JSON files without issue using an Ajax call "_ It sounds like something may have changed on the web server itself. What server software are you running (Apache, Nginx, etc.) and was it recently upgraded? – kmoser Jul 02 '22 at 03:11
  • @kmoser It looks like we're on Apache/2.4.6, which was last built a few months ago (I believe this issue cropped up later, but could be wrong). I have no familiarity with server software, unfortunately. What steps could I take to see if this is a server issue? – 10GeV Jul 02 '22 at 04:25
  • @10GeV You could try downgrading Apache, preferably in a test environment. Compare the HTTP headers in the current Apache to those of the downgraded Apache to see what changed. – kmoser Jul 02 '22 at 19:08
  • I think you need to configure your server response by setting the MIME type, header('Content-type: application/json'); – araldhafeeri Jul 05 '22 at 00:11

2 Answers

If your server returns this header for the AJAX call:

Content-Type: application/x-gzip

without this one:

Content-Encoding: gzip

then the browser will not decompress the response. For the browser to decompress it transparently, the server needs to send something like this instead:

Content-Type: application/json
Content-Encoding: gzip

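Adding Accept-Encoding or Content-Encoding headers to your HTML page (as you did in index.php) has no effect on the AJAX call: those headers apply only to the response for index.php itself, and Accept-Encoding is a request header that the browser sends, not something the server sets. You should remove them.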

The headers sent by Apache depend on the way it is configured. Your server configuration may have been changed recently.
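
To check which headers your server actually sends for the JSON request, you can inspect them from the AJAX call itself. The following is only a sketch: diagnoseGzipResponse is a hypothetical helper (not part of jQuery), and it assumes the request is same-origin so that getResponseHeader can read Content-Encoding.

```javascript
// Hypothetical helper: takes the Content-Type and Content-Encoding response
// header values (null when a header is absent) and reports whether they
// match what is needed for transparent decompression of gzipped JSON.
function diagnoseGzipResponse(contentType, contentEncoding) {
    var encoding = (contentEncoding || "").trim().toLowerCase();
    // Drop parameters such as "; charset=utf-8" before comparing
    var type = (contentType || "").split(";")[0].trim().toLowerCase();

    if (encoding !== "gzip") {
        return "Not decompressed: Content-Encoding: gzip is missing";
    }
    if (type !== "application/json") {
        return "Content-Encoding is gzip, but Content-Type is '" + type +
            "' instead of application/json";
    }
    return "OK: Content-Encoding is gzip and Content-Type is application/json";
}

// Usage inside the existing $.ajax options (browser only):
//   complete: function(jqxhr) {
//       console.log(diagnoseGzipResponse(
//           jqxhr.getResponseHeader("Content-Type"),
//           jqxhr.getResponseHeader("Content-Encoding")));
//   }
```

With the headers shown in your edit (Content-Type: application/x-gzip and no Content-Encoding), this would report the first case.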

You have 2 solutions:

  • Modify your Apache configuration so that it sends the expected headers for .json.gz files
  • Use a PHP script to retrieve your compressed JSON files

Here is an example of such a script (getJSON.php):

<?php
$name = $_GET['name'];

header('Content-Type: application/json');
header('Content-Encoding: gzip');
readfile("path/to/$name.json.gz");
?>

The URL to retrieve test.json.gz would be:

url: 'getJSON.php?name=test'

Remark: as stated in a comment, a recommended practice is to validate the value of $name (e.g. allow only letters and numbers) so that it cannot contain ../ sequences, which would allow an attacker to access .json.gz files outside of the intended directory.
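
On the client side, the request from the question would then point at the script rather than at the .json.gz file. A minimal sketch, assuming the script is reachable at the relative URL getJSON.php and that file names consist only of letters, digits, underscores, and hyphens (both are assumptions). buildJsonUrl is a hypothetical helper, and its check is a convenience that does not replace validation in the PHP script.

```javascript
// Hypothetical helper: build the getJSON.php URL from a bare file name
// (without the .json.gz extension). Throws if the name contains anything
// other than letters, digits, underscores, or hyphens.
function buildJsonUrl(name) {
    if (!/^[A-Za-z0-9_-]+$/.test(name)) {
        throw new Error("Invalid JSON file name: " + name);
    }
    return "getJSON.php?name=" + encodeURIComponent(name);
}

// Usage with the $.ajax call from the question (browser only):
//   $.ajax({
//       url: buildJsonUrl("test"),
//       type: "GET",
//       dataType: "json",
//       success: doFunStuffWithJsonFile
//   });
```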

UPDATE

You said in a comment that the following .htaccess file was present in the directory that contains the JSON files:

<filesMatch ".(gz)$">
    <ifModule mod_headers.c>
        Header set Content-Encoding "gzip"
    </ifModule>
</filesMatch>

The directive should send the Content-Encoding: gzip header for .gz files. However, .htaccess files may not be processed by Apache, depending on the value of the AllowOverride directive. If AllowOverride is set to None (and AllowOverrideList is also None, which is the case by default), then .htaccess files are ignored.

Note the default value for AllowOverride:

Default: AllowOverride None (2.3.9 and later), AllowOverride All (2.3.8 and earlier)

It means that, as of Apache 2.3.9, AllowOverride is set to None by default.

You said in a comment that you upgraded to version 2.4.6 recently. If AllowOverride is not explicitly set in your Apache configuration, that explains why your .htaccess file no longer has any effect.

A solution is to add this to your Apache configuration file:

<Directory /path/to/your/json/files/>
    AllowOverride All
</Directory>
Olivier
  • Your example of `getJSON.php` is obviously just a proof-of-concept but when implemented for real it should be updated to [sanitize user input](https://codereview.stackexchange.com/questions/250132/sanitizing-user-form-input-in-php). – kmoser Jul 02 '22 at 17:01
  • Thanks! I'll give the `PHP` script a try. I'm not terribly familiar with server software... how would one configure Apache to provide the appropriate headers (or, alternatively, what can I search for to find the appropriate documentation... I can't seem to find anything by Googling/searching the docs)? – 10GeV Jul 02 '22 at 18:49
  • The `.htaccess` in the relevant directory has ``` Header set Content-Encoding "gzip" ``` – 10GeV Jul 02 '22 at 19:03
  • @kmoser It's always better to sanitize, but in this case, even without sanitizing, all you can do is steal `.json.gz` files on the server, which is not very useful. – Olivier Jul 03 '22 at 06:50
  • @10GeV I think I figured it out. See my update. – Olivier Jul 03 '22 at 08:23
-2

You can try the following code in your ajax return code:

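  // Assumes `res.data` is a Blob holding the raw response body
  let blob = res.data;
  let reader = new FileReader();
  // Register the handler before starting the read
  reader.onload = (e) => {
    // Offer the contents as a download through a temporary link;
    // note that this saves the file and does not decompress or parse it
    let a = document.createElement("a");
    a.download = 'Your File Name';
    a.href = e.target.result;
    document.body.appendChild(a);
    a.click();
    document.body.removeChild(a);
  };
  reader.readAsDataURL(blob);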
  • Your answer could be improved by adding more information on what the code does and how it helps the OP. – Tyler2P Jun 29 '22 at 15:07
  • Thanks for the suggestion. I'm afraid it isn't clear to me how this works, nor what exactly is meant by the Ajax "return code" (I assume you mean function called in case of an error, as the "success" function is never called?). Could you clarify? – 10GeV Jun 29 '22 at 16:03