I want to load an RTL page with UTF-8 character encoding using the Java 11 HttpClient.
This is my example code:
    import java.net.CookieHandler;
    import java.net.CookieManager;
    import java.net.CookiePolicy;
    import java.net.http.HttpClient;
    import java.time.Duration;

    public class HttpClientFactory {
        public static final HttpClientFactory client = new HttpClientFactory();
        private CookieManager cookieManager;

        private HttpClientFactory() {
            initializeCookieManager();
        }

        // Accept all cookies and register the manager as the JVM-wide default.
        private void initializeCookieManager() {
            cookieManager = new CookieManager();
            cookieManager.setCookiePolicy(CookiePolicy.ACCEPT_ALL);
            CookieHandler.setDefault(cookieManager);
        }

        public HttpClient produceHttpClient() {
            return HttpClient.newBuilder()
                    .cookieHandler(cookieManager)
                    .version(HttpClient.Version.HTTP_2)
                    .connectTimeout(Duration.ofSeconds(10))
                    .build();
        }
    }
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.nio.charset.StandardCharsets;

    public class TsetmcBrowser {
        private static HttpClient client;
        public static TsetmcBrowser instance = new TsetmcBrowser();

        private TsetmcBrowser() {
            client = HttpClientFactory.client.produceHttpClient();
        }

        public void testConnection() {
            System.out.println("[*] Request Load Page");
            HttpRequest request = HttpRequest.newBuilder()
                    .GET()
                    .uri(URI.create("http://www.tsetmc.com/Loader.aspx"))
                    .setHeader("User-Agent", "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0")
                    .setHeader("Accept-Language", "en-US,en;q=0.5")
                    .setHeader("Accept", "text/plain")
                    .build();
            try {
                // The body is read as a String, then re-encoded and decoded as UTF-8.
                HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println(new String(response.body().getBytes(), StandardCharsets.UTF_8));
                System.out.println("--------------------------------------------------------------------");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
But it returns the wrong response body!
Response:
�t^��m�%���o��;;���0�<�������~��o_Z������5��kA�ڮ��z0����8�h��*�;�r�N���
How can I fix this problem?
UPDATE
I checked this code with other RTL pages, such as:
https://www.farsnews.ir
https://sepehr.irib.ir
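
One way to narrow this down, sketched here under an assumption that has not been confirmed: the garbled output looks like a compressed body that was decoded as text. The Java 11 HttpClient does not decompress response bodies on its own, so if the server sends Content-Encoding: gzip, the bytes have to be gunzipped before they are decoded as UTF-8. The class name GzipBodyDecoder and its methods decode, fetch, and gzip are hypothetical helpers introduced only for this illustration; the response headers of the pages above were not inspected.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of the code in the question.
class GzipBodyDecoder {

    // Decodes raw body bytes as UTF-8, gunzipping first when the
    // Content-Encoding value is "gzip". Other encodings (deflate, br)
    // are NOT handled by this sketch and are decoded as-is.
    static String decode(byte[] body, String contentEncoding) {
        try {
            byte[] plain = body;
            if (contentEncoding != null && contentEncoding.trim().equalsIgnoreCase("gzip")) {
                try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
                    plain = in.readAllBytes();
                }
            }
            return new String(plain, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Fetches the URL as raw bytes (instead of BodyHandlers.ofString()),
    // prints the Content-Encoding header for diagnosis, then decodes.
    static String fetch(HttpClient client, String url) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder()
                .GET()
                .uri(URI.create(url))
                .build();
        HttpResponse<byte[]> response = client.send(request, HttpResponse.BodyHandlers.ofByteArray());
        String encoding = response.headers().firstValue("Content-Encoding").orElse("");
        System.out.println("Content-Encoding: " + encoding);
        return decode(response.body(), encoding);
    }

    // Gzips a string as UTF-8; only used to try decode() without a network call.
    static byte[] gzip(String text) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling GzipBodyDecoder.fetch(client, "http://www.tsetmc.com/Loader.aspx") with the client from produceHttpClient() would print the Content-Encoding header first. If that header is empty or names something other than gzip, the assumption does not hold and the cause lies elsewhere.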