
Here's my setup:

Server A sits outside the DMZ, on my company's internal network, and hosts binary files. These files can be accessed over HTTP (provided you're on the company network) by calling a service running on this server and passing a file identifier in the form of a token. So when I'm on the company network, if I put a URL like http://serverAname:port/endpoint?token=123456&download in my browser's address bar, a file download starts.

Server B is in the DMZ and exposes REST endpoints to end users on the internet for various purposes.

Server B can make HTTP requests to server A, so basically if I log on to server B, start a browser, and enter the URL above in the address bar, the file download starts.

I'm now asked to let end users on the internet download the files from server A, but without exposing this server to the internet. So basically I need to build a limited reverse proxy.

I more or less managed to do it, but I'm facing two issues:

ISSUE 1:

When an end user clicks the button to download the file, the request goes to server B, which does what it has to do and initiates the file download from server A. But the response to the end user only begins once server B has downloaded the whole file. As our internal network is not that fast, when the file is big (we're currently testing with a 700 MB file) the user has to wait several minutes between the click and the actual start of the download. Needless to say, no one will wait that long (I wouldn't), and all users will assume something is wrong.

ISSUE 2:

This one is the most important. When the download finally starts, it goes very fast because there's actually no content in the response, only the headers, and the downloaded file turns out to be about 1 KB.

I'm not sure what I'm doing wrong. I've explored many solutions online (inspired by this answer, this other one, or this code), but each time I got basically the same results as with my own code. Here's the (simplified) code I had initially:

My Endpoint controller:

public class CommunicationController : ApiController
{
    [HttpGet]
    [Route("RetrieveLargeAttachments")]
    public HttpResponseMessage RetrieveLargeAttachments(string tokenId, string fileName)
    {
        return Request.CreateResponse(HttpStatusCode.OK, CommunicationAdapter.GetFileCachePath(tokenId, fileName));
    }
}

The method called by this controller:

public static HttpResponseMessage GetFileCachePath(string tokenId, string fileName)
{
    var fileCacheUrl = $"http://server:port/fc?{tokenId}&download";

    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(fileCacheUrl);
    request.Method = "GET";
    HttpResponseMessage responseMessage = new HttpResponseMessage(HttpStatusCode.OK);

    using (var responseApi = (HttpWebResponse)request.GetResponse())
    {
        if (responseApi.StatusCode != HttpStatusCode.OK)
        {
            //Handle possible errors
        }
        using (var reader = new StreamReader(responseApi.GetResponseStream()))
        {
            responseMessage.Content = new StreamContent(reader.BaseStream);
            responseMessage.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment");
            responseMessage.Content.Headers.ContentDisposition.FileName = fileName;
            responseMessage.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        }
    }
    return responseMessage;
}

My goals are:

  1. to make the whole process as transparent as possible for the end user, so that they can click the Download button and the download starts exactly as if they had clicked an anchor pointing directly to the file.
  2. that the file doesn't get loaded into server B's memory in any way, so that several users requesting a download at the same moment won't kill my server.
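
To illustrate goal 2, here is a sketch of the kind of pass-through streaming I'm aiming for (illustrative only, not working code I have: it assumes the same Web API 2 stack as above, and relies on HttpCompletionOption.ResponseHeadersRead so the call returns as soon as server A sends its headers instead of buffering the whole body on server B):

```csharp
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using System.Web.Http;

public class CommunicationController : ApiController
{
    // Reuse a single HttpClient instead of creating one per request.
    private static readonly HttpClient Client = new HttpClient();

    [HttpGet]
    [Route("RetrieveLargeAttachments")]
    public async Task<HttpResponseMessage> RetrieveLargeAttachments(string tokenId, string fileName)
    {
        var fileCacheUrl = $"http://server:port/fc?{tokenId}&download";

        // ResponseHeadersRead: return once server A's headers arrive,
        // without downloading the whole body on server B first.
        var upstream = await Client.GetAsync(fileCacheUrl, HttpCompletionOption.ResponseHeadersRead);
        if (!upstream.IsSuccessStatusCode)
        {
            return Request.CreateResponse(upstream.StatusCode);
        }

        var response = Request.CreateResponse(HttpStatusCode.OK);
        // StreamContent relays the upstream body to the client as it arrives.
        // Note: the upstream response is deliberately not wrapped in a using
        // block here, so its stream stays open until the relay completes.
        response.Content = new StreamContent(await upstream.Content.ReadAsStreamAsync());
        response.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        response.Content.Headers.ContentDisposition =
            new ContentDispositionHeaderValue("attachment") { FileName = fileName };
        return response;
    }
}
```
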
  • Do the files truly have to live as files on disk on ServerA (or wherever they are on the internal network)? Can they be stored in a database? This may simplify the interactions, as they could be loaded directly through ServerB without ServerB first having to download them from ServerA – Jonathan Aug 08 '19 at 15:25
  • Short answer: yes they need to be physically there. More details: I simplified the architecture for the sake of this question but technically the files are stored within documentum (which itself stores files in filesystem if I'm not mistaken), we request documentum through a CXF API (on *server A*) and when the file is too big, instead of replying with the bytes this API copies the file from wherever documentum stores it onto server A and responds with a token so that the file can be retrieved using a given URL (used in the question). Once the file is downloaded, the API deletes it. – Laurent S. Aug 08 '19 at 15:35
  • Not a solution to the delay, but you could think of breaking the process down to 3 parts on ServerB: 1st = hit ServerA and tell it to start doing whatever it needs to do, quickly return true/false; 2nd = a method that returns whether ServerA has finished what it's doing, and you can poll this from the client on a 5 second or whatever retry. Either return false if not done or token if it is; 3rd = pass through token to get actual file. Like I say, won't speed it up, but could provide progress or status to the user. ('we are preparing your file', 'downloading') – Jonathan Aug 08 '19 at 20:19
  • Try not to make a 2nd service as it will kill your SLA and overcomplicate things. Use a file share and grant the web application user the rights to the file share, removing the latency, availability and concurrency issues from your design. – Walter Verhoeven Jan 20 '21 at 09:13
