
Sorry about the "two-way proxied download" terminology; I am not sure what this would otherwise be called (the correct terminology would be much appreciated). Anyway:

Let's assume I have a PDF file of an article, test.pdf (see below for a LaTeX example), of which I am an author. It is hosted on a conference website and is otherwise available there for free. Let's say I want to (and am allowed to) also distribute a copy of the same PDF from my website.

So, for the sake of the example, let's say:

  • a local PC has a (globally resolvable) IP address 80.80.80.80
  • my website server is at myserver.com, with IP address 90.90.90.90
    • The link to the PDF there is http://myserver.com/dl/test.php?file=./test.pdf
  • the conference website is at conference.org, IP address 100.100.100.100
    • The link to the PDF there is http://conference.org/2001/downloads/test.pdf

What I want to do is this: when a local PC requests the PDF file from my website (via http://myserver.com/dl/test.php?file=./test.pdf), the test.php script should also:

  • initiate a download of http://conference.org/2001/downloads/test.pdf, with the original header data of the client (that is, conference.org should see in its logs that it is 80.80.80.80 making the request) and with my website as the referrer (that is, myserver.com at 90.90.90.90); the idea is that the conference.org webhost would log the same clients that download from myserver.com
  • The download from conference.org should be terminated after 100 bytes or so, so as not to waste the bandwidth of conference.org -- otherwise, it is myserver.com that serves the PDF file
  • Should the download from conference.org fail (e.g. if conference.org is temporarily offline), then that should be logged in a text file, but it should NOT otherwise interfere (as in, introduce additional delays) with the process of serving the file from myserver.com.

Here is an example of test.php, which only does the serving "from myserver.com"; otherwise, the relationship between files local to myserver.com, and their location on conference.org, is simulated in the $filesRelations array:

<?php

$filesRelations = array(
  './test.pdf'   => 'http://conference.org/2001/downloads/test.pdf',
);

if(!(isset($_GET['file']))) {
  echo "<html>
  <head/>
  <body>
  <a href='?file=./test.pdf'>test.pdf</a>
  <br/> <sub>(".$filesRelations['./test.pdf'].")</sub>
  </body>
  </html>
  ";
} else {
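  # echo "-- " . $_GET['file'] . " -- "; # dbg
  $localpath = $_GET['file'];
  # Only serve paths listed in $filesRelations, so that a request such as
  # ?file=/etc/passwd cannot read arbitrary files from the server.
  if (!array_key_exists($localpath, $filesRelations)) {
    header("HTTP/1.1 404 Not Found");
    echo "Unknown file";
    exit;
  }
  $fdname = basename($localpath);
  $includeFile = file_get_contents($localpath);

  if ($includeFile === false)
  {
    echo "Error with $localpath";
  } else {
    # Take the length from the data actually read, after the read succeeded.
    $fsize = strlen($includeFile);
    header("Content-type: application/pdf");
    header("Content-Disposition: attachment; filename=\"".$fdname."\"");
    header("Content-length: $fsize");
    header("Cache-control: private");
    echo $includeFile;
  }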
}

exit;
?>

How could I modify this code, so that the script "pings" the link (by initiating and terminating a short, 100-byte download) in the respective $filesRelations entry, using the header data of the calling client, before it serves the headers and the file (by echoing $includeFile)?


For testing, this is the test.tex file (which you can compile with pdflatex test.tex to obtain a test.pdf):

\documentclass{article}
\usepackage{lipsum}

\begin{document}
\title{Lorem Ipsum}
\author{Author's Name}
\maketitle

\begin{abstract}
\lipsum[1]
\end{abstract}

\section{Introduction}

\lipsum[1-12]
\end{document}

(To test, put test.php and test.pdf in one directory, run php-5.4.10 -S localhost:8000 in that directory, then visit http://localhost:8000/test.php in the web browser).

sdaau
  • I believe the stumbling block is `the original header data of the client`. The IP address of the client is not contained in a header, but determined by the TCP/IP layer of the socket that the client connected to. That's what would be in the logs. Ergo, can't be done AFAIK. – James May 30 '14 at 22:41
  • Thanks for the comment, @James; if not the entire "original header data", could I at least gain access to the client's IP in PHP on myserver (then I could probably somehow "fake" some header, with the client's IP inside)? EDIT: ups, just saw "TCP/IP layer" and "not contained in a header"; but I could read [`$_SERVER['REMOTE_ADDR']`](http://stackoverflow.com/questions/3003145/); could I then use `$_SERVER['HTTP_X_FORWARDED_FOR']` in the call from myserver to conference server? – sdaau May 30 '14 at 22:46
  • `$_SERVER['REMOTE_ADDR']` contains the IP address of the connected client. You could send that data to the other server in a special request which they would read. You wouldn't be able to do this in a sneaky way, though: it requires the other server to implement some kind of script to receive the data. Possibly http_x_forwarded_for would work, but I doubt it will automatically go into the other server's logs as the client; give it a try. – James May 30 '14 at 22:50
  • Sounds good, many thanks @James - cheers! (PS: if you want, post this as an answer, I'll accept it) – sdaau May 30 '14 at 22:53

1 Answer


I believe the stumbling block is "the original header data of the client". The IP address of the client is not contained in a header, but determined by the TCP/IP layer of the socket that the client connected to. That's what would be in the logs. So, it can't be done quite so simply.

The client's IP can be retrieved by your server via:

$clientIP = $_SERVER['REMOTE_ADDR'];

If you were able to come up with a mechanism to let the other server know that this was a request on behalf of a client, you could certainly send them this data. As you point out, perhaps try setting the X-Forwarded-For header in your request to that server (PHP exposes an incoming X-Forwarded-For header as $_SERVER['HTTP_X_FORWARDED_FOR']).
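As a rough sketch of that idea (untested against conference.org; the function names buildPingHeaders and pingRemote, the log file path and the 2-second timeout are all my own assumptions), the "ping" could be a Range request for the first 100 bytes that carries the client's IP in an X-Forwarded-For header. Keep in mind that conference.org will still log 90.90.90.90 as the connecting address unless it is configured to read that header, and that a server which ignores Range will start sending the whole file until the connection is closed.

```php
<?php
// Hypothetical helper: builds the request headers for the "ping".
// X-Forwarded-For is only a hint; the remote server decides whether to log it.
function buildPingHeaders($clientIp, $referer, $userAgent, $maxBytes = 100)
{
    // Strip CR/LF so a crafted User-Agent cannot inject extra header lines.
    $userAgent = str_replace(array("\r", "\n"), '', $userAgent);
    $headers = array(
        'X-Forwarded-For: ' . $clientIp,
        'Referer: ' . $referer,
        'Range: bytes=0-' . ($maxBytes - 1), // ask for the first $maxBytes bytes only
        'Connection: close',
    );
    if ($userAgent !== '') {
        $headers[] = 'User-Agent: ' . $userAgent;
    }
    return $headers;
}

// Hypothetical helper: opens $url, reads at most $maxBytes and closes again.
// Returns true on success; on failure appends a line to $logFile and returns
// false, so the caller can carry on serving the local file either way.
function pingRemote($url, array $headers, $logFile, $maxBytes = 100, $timeout = 2.0)
{
    $context = stream_context_create(array('http' => array(
        'method'  => 'GET',
        'header'  => implode("\r\n", $headers),
        'timeout' => $timeout, // seconds; limits how long the ping may block
    )));
    // @ hides the warning PHP raises when the host is down or answers with an error status.
    $handle = @fopen($url, 'rb', false, $context);
    if ($handle === false) {
        $line = gmdate('c') . " ping failed: $url\n";
        file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
        return false;
    }
    fread($handle, $maxBytes); // read a small chunk and discard it
    fclose($handle);           // closing ends the transfer even if Range was ignored
    return true;
}
```

In test.php this could be called as pingRemote($filesRelations[$localpath], buildPingHeaders($_SERVER['REMOTE_ADDR'], 'http://myserver.com/dl/test.php', isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : ''), './ping-failures.log'). Calling it before the headers are sent delays the download by up to the timeout whenever conference.org is unreachable; calling it after echo $includeFile; should avoid most of that, provided the output has already been flushed to the client.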

Best of luck!

James