
I get several marketing emails with url links that get redirected from site to site to site. I'd like to write a program to track each URL redirect using Delphi and Indy. I'd like to traverse each URL, record the full QueryString and any Cookies that may have been set during the process.

How do I do this using the Indy components that come with D2010?

  • If TIdHTTP doesn't expose an appropriate event for this, you'll have to make sure that it doesn't handle redirects automatically. Means you will have to respond to the "redirect" response code yourself and issue a new get for the given url. Sorry, no code, haven't done this myself yet. – Marjan Venema Jun 09 '12 at 16:25
  • `TIdHTTP` does have an `OnRedirect` event that gets triggered whenever it detects an HTTP-level redirection (redirects can also be accomplished using client-side scripting, which `TIdHTTP` cannot track). – Remy Lebeau Jun 09 '12 at 18:39

1 Answer

First of all you need an HTTP client, which in Indy is TIdHTTP.

Now you will need a data structure that will hold your results:

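  // Requires: uses Classes, IdHTTP, IdCookie, IdCookieManager;
  type
    TRedirection = record
      queryString: String;  // URL of this hop, query string included
      cookies: TStrings;    // CookieText of every cookie recorded for this hop
    end;

    TRedirectionArray = array of TRedirection;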

Create a class that does the work (a class is required, because the event functions are defined as procedure of object):

  TRedirectionTester = class
    private
      FRedirData: TRedirectionArray;
      procedure redirectEvent(Sender: TObject; var dest: string;
        var NumRedirect: Integer; var Handled: boolean; var VMethod: TIdHTTPMethod);
      procedure newCookie(ASender: TObject; ACookie: TIdCookie; var VAccept: Boolean);
    public
      function traverseURL(url: String): TRedirectionArray;
      property RedirData: TRedirectionArray read FRedirData;
  end;

This provides the basic functionality: you call traverseURL with a URL, and it returns a TRedirectionArray holding the URL (with its query string) and the cookies recorded for each step.
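The queryString field ends up holding the whole URL of each step, not only the part after the `?`. If only the query string is wanted, Indy's `TIdURI` class can split a URL into its parts. The following is a minimal sketch under that assumption; `ExtractQueryString` and the example URL are illustrative names that do not appear in the answer.

```pascal
program QueryStringDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, IdURI;

// Hypothetical helper: returns the part of AUrl after the '?'
// (TIdURI exposes it as the Params property).
function ExtractQueryString(const AUrl: string): string;
var
  uri: TIdURI;
begin
  uri := TIdURI.Create(AUrl);
  try
    Result := uri.Params;
  finally
    uri.Free;
  end;
end;

begin
  // Illustrative URL, not taken from the original post.
  Writeln(ExtractQueryString('http://example.com/landing?campaign=42&src=email'));
end.
```

The same helper could be applied to each `queryString` entry after traverseURL returns, which keeps the event handlers themselves unchanged.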

Then implement the OnRedirect event:

procedure TRedirectionTester.redirectEvent(Sender: TObject; var dest: string;
  var NumRedirect: Integer; var Handled: boolean; var VMethod: TIdHTTPMethod);
var
  redirDataLength: Integer;
begin
  // Tell TIdHTTP to follow this redirect.
  Handled := True;

  // Append a new entry for the redirect target.
  redirDataLength := Length(FRedirData);
  SetLength(FRedirData, redirDataLength + 1);

  // dest is the URL being redirected to, including its query string.
  FRedirData[redirDataLength].queryString := dest;
  FRedirData[redirDataLength].cookies := TStringList.Create;
end;

This adds an entry to the array and stores the target URL of the redirection, query string included. The OnRedirect event does not pass any cookie information, so cookies cannot be recorded here. Any HTTP response can set cookies, including the redirect response itself, so they have to be collected separately.

That's why you will need an OnNewCookie handler:

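// Note: on the Indy version shipped with D2010 the cookie parameter type
// may be TIdCookieRFC2109 instead of TIdCookie (see the comments below).
procedure TRedirectionTester.newCookie(ASender: TObject; ACookie: TIdCookie; var VAccept: Boolean);
var
  lastIndex: Integer;
begin
  // Accept every cookie so it is sent along with later requests.
  VAccept := True;

  // Attach the cookie to the most recent entry.
  // High() returns -1 for an empty array, so guard against that.
  lastIndex := High(FRedirData);
  if (lastIndex >= 0) and Assigned(FRedirData[lastIndex].cookies) then
    FRedirData[lastIndex].cookies.Add(ACookie.CookieText);
end;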

This does nothing but add the CookieText to the data set. That field contains a 'summary' of the cookie: it is the actual string data that is sent when requesting a page.

Finally, put it together by implementing the traverseURL function:

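function TRedirectionTester.traverseURL(url: String): TRedirectionArray;
var
  traverser: TIdHTTP;
  cookieMgr: TIdCookieManager;
begin
  traverser := TIdHTTP.Create();
  cookieMgr := TIdCookieManager.Create();
  try
    traverser.HandleRedirects := True;
    traverser.OnRedirect := redirectEvent;
    traverser.CookieManager := cookieMgr;
    cookieMgr.OnNewCookie := newCookie;

    // Record the starting URL as the first entry.
    SetLength(FRedirData, 1);
    FRedirData[0].queryString := url;
    FRedirData[0].cookies := TStringList.Create;

    traverser.Get(url);
  finally
    // Free both objects even if Get raises, so nothing leaks.
    traverser.Free;
    cookieMgr.Free;
  end;

  // The TStringList objects in the result stay allocated;
  // the caller is responsible for freeing them.
  Result := FRedirData;
end;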

It doesn't do much: it creates the required objects and assigns the event handlers. Then it adds the first URL as the first entry (even though it's not a real redirection, I added it for completeness). The call to Get then sends the requests. It returns after the final page has been located and returned by the web server.
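As a rough illustration of how the class might be called, here is a sketch that prints each recorded step and its cookies. It assumes the `TRedirectionTester` declaration and implementation above are visible in the same unit; `PrintRedirects` is a hypothetical name, and the same body could be placed in a button's OnClick handler with `Writeln` swapped for a memo.

```pascal
// Hypothetical helper built on the TRedirectionTester class above.
procedure PrintRedirects(const AUrl: string);
var
  tester: TRedirectionTester;
  hops: TRedirectionArray;
  i, j: Integer;
begin
  tester := TRedirectionTester.Create;
  try
    hops := tester.traverseURL(AUrl);
    for i := 0 to High(hops) do
    begin
      // One line per step, followed by the cookies recorded for it.
      Writeln(i, ': ', hops[i].queryString);
      for j := 0 to hops[i].cookies.Count - 1 do
        Writeln('    cookie: ', hops[i].cookies[j]);
      // The string lists are created by the tester but never freed by it,
      // so release them here once they have been printed.
      hops[i].cookies.Free;
    end;
  finally
    tester.Free;
  end;
end;
```

Because `Get` raises an exception on network or protocol errors, a real caller would probably also want a `try ... except` around the call.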

I tested it with http://bit.ly/Lb2Vho.

This, however, only handles HTTP-level redirects, i.e. those signalled by a redirect status code such as 301 or 302. As far as I know it doesn't handle redirects that are done via <meta> tags or JavaScript. To add that functionality, you have to check the result of the call to Get and parse it to search for such redirects.
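To give an idea of what such parsing could look like for the <meta> case, here is a deliberately naive sketch. It uses plain string searching rather than an HTML parser, assumes the `http-equiv="refresh"` attribute is written with double quotes and appears before the `content` attribute, and does not attempt JavaScript redirects at all. `ExtractMetaRefreshUrl` is a hypothetical name and the HTML snippet is made up for the demonstration.

```pascal
program MetaRefreshDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

// Returns the target of the first <meta http-equiv="refresh"> tag in AHtml,
// or '' if none is found.
function ExtractMetaRefreshUrl(const AHtml: string): string;
var
  lower: string;
  tagStart, tagEnd, urlPos, quotePos: Integer;
begin
  Result := '';
  // Search a lower-cased copy; the character positions stay the same.
  lower := LowerCase(AHtml);

  tagStart := Pos('http-equiv="refresh"', lower);
  if tagStart = 0 then
    Exit;
  tagEnd := PosEx('>', lower, tagStart);
  if tagEnd = 0 then
    Exit;

  // The target follows "url=" inside the content attribute of the same tag.
  urlPos := PosEx('url=', lower, tagStart);
  if (urlPos = 0) or (urlPos > tagEnd) then
    Exit;
  urlPos := urlPos + Length('url=');

  // Copy from the original text to keep the URL's case, then cut at the
  // closing quote of the content attribute.
  Result := Copy(AHtml, urlPos, tagEnd - urlPos);
  quotePos := Pos('"', Result);
  if quotePos > 0 then
    Result := Copy(Result, 1, quotePos - 1);
  Result := Trim(Result);
end;

begin
  // prints http://example.com/Next?a=1
  Writeln(ExtractMetaRefreshUrl(
    '<meta http-equiv="refresh" content="0; URL=http://example.com/Next?a=1">'));
end.
```

If the function returns a non-empty URL for the body returned by `Get`, that URL could be fed back into traverseURL to continue following the chain.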

Chris
  • There is nothing in the HTTP protocol that prevents a redirect response from setting a cookie at the time of the redirect. Any HTTP response can set cookies. – Remy Lebeau Jun 09 '12 at 18:41
  • @Remy Lebeau: You are of course right. That's what the code does: It records cookies, regardless of the type of HTTP response. It's just that when you send a request to domain x.com, and it redirects you to domain y.com, then the cookie for domain y.com is set as a response to the request for domain y.com (not in the redirect before). – Chris Jun 09 '12 at 21:25
  • @Chris - I am unable to get this to work. I cannot use TIdCookie (undeclared identifier) so I used TIdCookieRFC2109 instead and it compiles fine. Now I'm getting an "AV at address" error when I try calling traverseURL. – Michael Riley - AKA Gunny Aug 19 '12 at 16:52
  • @CapeCodGunny: I'm using Delphi XE2, which also uses Indy 10. That's why I assumed that the cookie implementations remained the same. Unfortunately my version of Indy has only TIdCookie but not TIdCookieRFC2109. However, they should be more or less the same. In which function and line do you get the AV? – Chris Aug 24 '12 at 14:14
  • @RemyLebeau Hi, How to use this class with my Delphi Form? Can you write a 3 lines of code when I click the Button1, thanks. – XXXXXXXXXXXXXX Jan 24 '14 at 22:46
  • @XXXXXXXXXXXXXX: I am not the person who wrote this answer or the `TRedirectionTester` class it describes. – Remy Lebeau Jan 24 '14 at 22:56
  • @chris: there is a memory leak in `TRedirectionTester.traverseURL()`. The `TIdHTTP` and `TIdCookieManager` objects are not being freed. – Remy Lebeau Jan 24 '14 at 22:57
  • @Chris Hi, Can you write a 3 lines of Code when clicking a button would redirect from x.com to y.com, thanks. – XXXXXXXXXXXXXX Jan 25 '14 at 08:23