This is my first Stack Overflow post, so I'll try to describe my problem as well as I can.
I want to create a program that retrieves the reviews from TripAdvisor pages. I tried to do it via their API, but they didn't respond when I requested an API key, so my alternative is to do it with a web crawler.
To do so, I have a Spring project that uses HtmlUnit, a tool I have never used before. To test it, my first exercise is to retrieve the title of a web page, so I implemented the following code:
    @PostConstruct
    public void init() throws Exception {
        TimeZone.setDefault(TimeZone.getTimeZone("Europe/Madrid"));
        getRequest.getPageName();
    }
That calls the following method:
    @Test
    public void getPageName() throws Exception {
        try (final WebClient webClient = new WebClient()) {
            final HtmlPage page = webClient.getPage("https://www.tripadvisor.com");
            System.out.println(page.getTitleText());
        }
        catch (Exception e) {
            System.out.println("ERROR " + e);
        }
    }
When I run the code with https://www.google.com I get the response "Google" as expected, but when I try it with https://www.tripadvisor.com or https://www.youtube.com I get an error that I can't understand:
Caused by: net.sourceforge.htmlunit.corejs.javascript.EvaluatorException: syntax error (https://static.tacdn.com/assets/DDGchX.17d9b05f.js#1)
I did some quick research to find out what the error means and found a couple of posts about similar cases, but I still can't understand the cause. Is it a JavaScript problem, or a permissions problem?
If more information is needed to analyze the problem, do not hesitate to ask for it. Thanks in advance for your time, and sorry if I broke any of the Stack Overflow rules or conventions.