First time posting here.

I am experiencing the strangest problem and can honestly say I have never seen this before. I have an internal URL forwarding service where my clients can create a keyword and have that keyword redirect to a target they specify. This has been working great; however, today I was informed of an issue with a redirect to a PDF.

One of my users created a short-URL to a PDF and complained the click-through stats were way off. When I researched the issue, I noticed quite a few clients attempted what I would call a redirect loop.

Essentially, they kept requesting the short URL over and over, each time with a different byte range. This keeps happening; sometimes I see 60 to 70 of these requests in a row. I've tried changing the cache headers, and even tried changing the 302 to a 301, but nothing seems to fix it. No luck. :( Any feedback would be greatly appreciated. Thanks, guys!

Here is a snippet I captured via tcpdump:

GET /ShowcaseGuide HTTP/1.1
Accept: */*
Referer: wwww.someinternalserver.com
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)
Accept-Encoding: gzip, deflate
Host: goto.mydomain.org
DNT: 1
Connection: Keep-Alive
Cookie: s_pers=xxxxxx


HTTP/1.1 302 Found
Date: Tue, 19 Nov 2013 20:09:52 GMT
Server: Apache/2.2.3 (Red Hat)
X-Powered-By: PHP/5.3.3
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: private, must-revalidate
Pragma: no-cache
Location: http://internalserver.mydomain.org/links/Get_the_Most_Out_of_the_Showcase.pdf
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 20
Keep-Alive: timeout=5, max=200
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8

GET /ShowcaseGuide HTTP/1.1
Accept: */*
Range: bytes=2178560-2179071, 2179072-2179369
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)
Host: goto.mydomain.org
DNT: 1
Connection: Keep-Alive
Cookie: s_pers=xxxxxx

HTTP/1.1 302 Found
Date: Tue, 19 Nov 2013 20:09:56 GMT
Server: Apache/2.2.3 (Red Hat)
X-Powered-By: PHP/5.3.3
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: private, must-revalidate
Pragma: no-cache
Location: http://internalserver.mydomain.org/links/Get_the_Most_Out_of_the_Showcase.pdf
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 20
Keep-Alive: timeout=5, max=199
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8

GET /ShowcaseGuide HTTP/1.1
Accept: */*
Range: bytes=1867776-1884159
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)
Host: goto.mydomain.org
DNT: 1
Connection: Keep-Alive
Cookie: s_pers=xxxxxx

HTTP/1.1 302 Found
Date: Tue, 19 Nov 2013 20:09:57 GMT
Server: Apache/2.2.3 (Red Hat)
X-Powered-By: PHP/5.3.3
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: private, must-revalidate
Pragma: no-cache
Location: http://internalserver.mydomain.org/links/Get_the_Most_Out_of_the_Showcase.pdf
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 20
Keep-Alive: timeout=5, max=198
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
  • This question does not appear to be about programming within the scope defined in the [help center](http://stackoverflow.com/help/on-topic). – kjhughes Nov 19 '13 at 21:48
  • Can you post code (if any) for how users get to the PDF? Is it just a link? Also, post any rewrite code (.htaccess / httpd.conf) or other code used to send users to the PDF from the shorter link. – Mat Carlson Nov 19 '13 at 22:02
  • Thanks for the responses guys. So essentially the PDF link is on a different server, and within my PHP page, after querying MySQL for the target URL, I do a header('Location: ' . $target, TRUE,301); followed by an exit; – GreenMotion Nov 19 '13 at 23:04

1 Answer

Maybe serve the file directly instead of using a redirect? Something like this should display it in the browser:

/ShowcaseGuide/index.php:

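<?php
// Serve the PDF directly instead of redirecting to it.
// application/pdf lets the browser render it inline; octet-stream usually forces a download.
header('Content-Type: application/pdf');
header('Content-Disposition: inline; filename="Get_the_Most_Out_of_the_Showcase.pdf"');
readfile('../links/Get_the_Most_Out_of_the_Showcase.pdf');
exit;
?>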

It could be that the browser starts downloading the PDF, then requests the initial page (/ShowcaseGuide/index.php) again and is redirected to the same file each time. Whether this happens is probably browser-dependent.

Change the second header line to header('Content-Disposition: attachment; filename="Get_the_Most_Out_of_the_Showcase.pdf"'); if you want it to download instead of view in browser.
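
On the click-through stats themselves: in the capture from the question, the first request has no Range header, while every repeated request does. A rough sketch of one way to use that, assuming the redirect script runs under Apache (where PHP exposes the header as $_SERVER['HTTP_RANGE']). The function name is_countable_click and the record_click call in the usage comment are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical helper: decide whether a request should count as a click-through.
// Assumption: the repeated fetches carry a Range header, as in the tcpdump
// capture, and the first request for the short URL does not.
function is_countable_click(array $server)
{
    // A missing or empty Range header means a plain, full request.
    return empty($server['HTTP_RANGE']);
}

// Usage inside the redirect script (record_click, $keyword and $target are placeholders):
// if (is_countable_click($_SERVER)) { record_click($keyword); }
// header('Location: ' . $target, true, 302);
// exit;
```

This only keeps the partial requests out of the stats; it does not stop the client from sending them.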

Mat Carlson
  • Thank you for your response; I appreciate the feedback. So the PDF file is actually on a remote server. My code is running on server A, and the PDF is on server B. Hence why I am using header('Location: ' . $target, TRUE,301); So I don't think I will be able to use your suggestion :( – GreenMotion Nov 19 '13 at 23:05
  • In that case, maybe you should try displaying it in a blank web page (which can include JavaScript analytics) - http://stackoverflow.com/questions/291813/recommended-way-to-embed-pdf-in-html. You can also fetch the file through your server (using `readfile` or `fopen`) if you think bandwidth and file size won't be an issue; otherwise you could build a simple cache in PHP on your server: cache the file once, then check the modified time on every request (via HTTP headers). One more option: check whether your own script's location shows up in `$_SERVER['HTTP_REFERER']`. That alone may prevent the loops. – Mat Carlson Nov 20 '13 at 15:39
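
The "check the modified time" step from the last comment could look roughly like this. It is only a sketch under assumptions: cache_is_stale is a hypothetical name, the cached copy is a local file whose mtime you already have, and the remote server sends a Last-Modified header that strtotime can parse.

```php
<?php
// Hypothetical helper: is the locally cached copy older than the remote file?
// $cachedMtime  - Unix timestamp of the cached file (e.g. from filemtime()).
// $lastModified - raw Last-Modified header value from the remote server.
function cache_is_stale($cachedMtime, $lastModified)
{
    $remoteTime = strtotime($lastModified);
    if ($remoteTime === false) {
        // Missing or unparseable header: refetch to be safe.
        return true;
    }
    return $cachedMtime < $remoteTime;
}

// Possible usage (placeholders; get_headers() needs allow_url_fopen enabled,
// and a header value can be an array if the remote URL itself redirects):
// $headers = get_headers($target, 1);
// if (cache_is_stale(filemtime($cacheFile), $headers['Last-Modified'])) { /* refetch */ }
```

Serving from the cache with readfile would then keep every request on server A, at the cost of the bandwidth mentioned above.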