
I have this URL, http://www.zara.com/qr/1260020210042, and I am trying to get the final URL it redirects to:

    String url = "http://www.zara.com/qr/1260020210042";
    Response response = Jsoup.connect(url).followRedirects(true).execute();     
    String url2 = response.url().toString();
    Response response2 = Jsoup.connect(url2).followRedirects(true).execute();
    System.out.println(response2.url());

But it doesn't print the final redirected URL. What should I change? Thanks.

EDIT:

I also tried HtmlUnit, but it doesn't give me the final link I need either:

        WebClient webClient = new WebClient(BrowserVersion.FIREFOX_45);
        webClient.getOptions().setJavaScriptEnabled(true);
        webClient.getOptions().setRedirectEnabled(true);
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.getOptions().setCssEnabled(true);     
        HtmlPage page = (HtmlPage) webClient.getPage("http://www.zara.com/qr/1260020210042");
        WebResponse response = page.getWebResponse();
        String content = response.getContentAsString();
        System.out.println(page.getUrl());
kriegaex
Khalil M
  • It seems to me that the http://www.zara.com/qr/1260020210042 is not redirected at all. It returns 200 OK. – Bechyňák Petr Feb 06 '17 at 15:17
  • yes but if you click on the link it will – Khalil M Feb 06 '17 at 15:27
  • Then it is probably js related. Try it with HtmlUnit, then use redirected url with jsoup. – Frederic Klein Feb 07 '17 at 17:55
  • Possible duplicate of [JSoup + Link extraction + redirect URL](http://stackoverflow.com/questions/41853684/jsoup-link-extraction-redirect-url) – Frederic Klein Feb 07 '17 at 17:56
  • @FredericKlein thanks for your answer, I tried the code there and it is throwing me net.sourceforge.htmlunit.corejs.javascript.EvaluatorException: JAvascriptvalue is a type com.gargoylesoftware.htmlunit.ScriptException – Khalil M Feb 07 '17 at 21:42

1 Answer

The HtmlUnit solution suggested by Frederic Klein actually works nicely, but there is a cookie-related caveat, see "update" comment below.

First add this dependency to your Maven configuration:

    <dependency>
      <groupId>net.sourceforge.htmlunit</groupId>
      <artifactId>htmlunit</artifactId>
      <version>2.25</version>
    </dependency>

Then use it like this:

    package de.scrum_master.stackoverflow;

    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.WebClientOptions;
    import org.jsoup.Connection.Response;
    import org.jsoup.Jsoup;

    import java.io.IOException;
    import java.net.MalformedURLException;
    import java.net.URL;

    import static com.gargoylesoftware.htmlunit.BrowserVersion.CHROME;
    import static java.util.logging.Level.OFF;
    import static java.util.logging.Logger.getLogger;

    public class Application {
      public static void main(String[] args) throws IOException {
        WebClient webClient = createWebClient();
        String originalURL = "http://www.zara.com/qr/1260020210042";
        String redirectedURL = webClient.getPage(originalURL).getUrl().toString();
        Response response = Jsoup.connect(redirectedURL).execute();
        System.out.println(response.url());
      }

      private static WebClient createWebClient() throws MalformedURLException {
        getLogger("com.gargoylesoftware").setLevel(OFF);
        WebClient webClient = new WebClient(CHROME);
        WebClientOptions options = webClient.getOptions();
        options.setJavaScriptEnabled(true);
        options.setRedirectEnabled(true);
        // IMPORTANT: Without the country/language selection cookie the redirection does not work!
        webClient.addCookie("storepath=us/en", new URL("http://www.zara.com/"), null);
        return webClient;
      }
    }

The console log says:

    http://www.zara.com/us/en/man/shoes/leather/brown-braided-leather-ankle-boots-c0p4065286.html

Update: Okay, I found the root cause of your problem. It is not HtmlUnit: redirection on zara.com simply does not work until the user has manually selected a country and language during their first visit, in any browser. That choice is stored in a cookie named storepath (the one set in the sample code), without which every browser session lands on the front page with the country selection dialogue again. I have updated my sample code to set that cookie to USA + English. Then it works.
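To make the cookie dependency easier to see in isolation, here is a minimal sketch that needs neither HtmlUnit nor network access. It starts a local stub server that only redirects to a product page when a `storepath=us/en` cookie is sent, and otherwise redirects to a country selection page. Everything in it is an assumption for illustration: the class name `CookieRedirectDemo`, the paths `/select-country` and `/us/en/product-….html`, and the use of a plain HTTP 302. The real zara.com apparently answers with 200 OK and redirects via JavaScript, which is why HtmlUnit is still needed there.

```java
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;

public class CookieRedirectDemo {

  /** Starts a hypothetical stub server on a free local port (not the real site). */
  static HttpServer startStubServer() throws IOException {
    HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
    // The short link redirects to the product only if the country cookie is present.
    server.createContext("/qr/", exchange -> {
      String cookie = exchange.getRequestHeaders().getFirst("Cookie");
      String id = exchange.getRequestURI().getPath().substring("/qr/".length());
      String target = (cookie != null && cookie.contains("storepath=us/en"))
          ? "/us/en/product-" + id + ".html"
          : "/select-country";
      exchange.getResponseHeaders().add("Location", target);
      exchange.sendResponseHeaders(302, -1);
      exchange.close();
    });
    // Every other path simply answers 200 with an empty body.
    server.createContext("/", exchange -> {
      exchange.sendResponseHeaders(200, -1);
      exchange.close();
    });
    server.start();
    return server;
  }

  /** Follows HTTP redirects by hand, sending the given Cookie header (may be null) on every hop. */
  static String resolveFinalUrl(String startUrl, String cookieHeader) throws IOException {
    URI current = URI.create(startUrl);
    for (int hop = 0; hop < 10; hop++) {
      HttpURLConnection conn = (HttpURLConnection) current.toURL().openConnection();
      conn.setInstanceFollowRedirects(false);
      if (cookieHeader != null) {
        conn.setRequestProperty("Cookie", cookieHeader);
      }
      int status = conn.getResponseCode();
      String location = conn.getHeaderField("Location");
      conn.disconnect();
      if (status < 300 || status >= 400 || location == null) {
        return current.toString(); // no further redirect: this is the final URL
      }
      current = current.resolve(location); // handles relative Location headers
    }
    throw new IOException("Too many redirects");
  }

  public static void main(String[] args) throws IOException {
    HttpServer server = startStubServer();
    String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/qr/1260020210042";
    try {
      System.out.println(resolveFinalUrl(url, null));              // new user, no cookie
      System.out.println(resolveFinalUrl(url, "storepath=us/en")); // returning user
    } finally {
      server.stop(0);
    }
  }
}
```

Without the cookie the first call ends on the stub's `/select-country` page; with it, the second call ends on the product path. That mirrors what a fresh HtmlUnit session sees compared with one that has the cookie set up front.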

Enjoy!

kriegaex
  • The problem is that a real browser behaves the same way. Try a browser with deleted cookies and cache: When you open the URL you first have to choose a country and click OK. Then you get redirected to an error page, which is a problem on the Zara homepage itself. Only then, if you open the same URL the next time, it works. A browser like HtmlUnit which always starts with a new session does not have those cookies, so it simply cannot work because you simulate a new user. HtmlUnit just behaves like a normal browser would for a new user, try for yourself! – kriegaex Feb 24 '17 at 00:06
  • Okay, I have updated the answer, now it works. You can see that you need a certain cookie and how to set it in HtmlUnit. – kriegaex Feb 24 '17 at 00:30