I am using Selenium 3.141.59 with Java to read the content of a page, with ChromeDriver emulating a web browser. My problem occurs when the requested URL contains special characters, even with --charset=UTF-8 set: the character gets decoded differently, and the result of the web request is not what I expect. Here is a code example:

System.setProperty(
      "webdriver.chrome.driver",
      "C:\\Program Files\\Controllers\\chromedriver-win64\\chromedriver.exe"
 ); // -> chromedriver version 115.0.5790.102 
ChromeOptions options = new ChromeOptions();
options.addArguments("--charset=UTF-8");
WebDriver driver = new ChromeDriver(options);
driver.get("úrl/example/with/especial-character/"); // -> url: https://www.úrl/example/with/especial-character/

I have tried encoding the url using:

String encodedUrl = URLEncoder.encode("úrl/example/with/especial-character/", "UTF-8");
driver.get(encodedUrl);

But this modifies the URL, which is already encoded, so I'm out of ideas. Is there a method or library that solves this?
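To illustrate why the URLEncoder attempt changes the URL, here is a minimal sketch. URLEncoder produces form encoding, so it escapes the slashes as well as the special character, whereas building a java.net.URI from its parts and calling toASCIIString() only percent-encodes the non-ASCII characters. The class name, the helper names and the host www.example.com are placeholders of mine, not part of the real project, and the ú is written as \u00FA so the result does not depend on the source file encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

class UrlEncodingSketch {

    // Form-encodes the whole string, as in the attempt above (slashes are escaped too).
    static String formEncode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Builds an https URL from its parts; only non-ASCII characters get percent-encoded.
    // The host is assumed to be plain ASCII in this sketch.
    static String toAsciiUrl(String host, String path) {
        try {
            return new URI("https", host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String path = "/\u00FArl/example/"; // "/úrl/example/"
        System.out.println(formEncode(path));                    // %2F%C3%BArl%2Fexample%2F
        System.out.println(toAsciiUrl("www.example.com", path)); // https://www.example.com/%C3%BArl/example/
    }
}
```

The second form keeps the path structure intact, which may be closer to what driver.get() needs, though I have not verified it against this particular site.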

dependencies {
    testImplementation platform('org.junit:junit-bom:5.9.1')
    testImplementation 'org.junit.jupiter:junit-jupiter'
    // https://mvnrepository.com/artifact/org.seleniumhq.selenium/selenium-chrome-driver
    implementation group: 'org.seleniumhq.selenium', name: 'selenium-chrome-driver', version: '3.141.59'
    // https://mvnrepository.com/artifact/org.seleniumhq.selenium/selenium-support
    implementation group: 'org.seleniumhq.selenium', name: 'selenium-support', version: '3.4.0'
}

Toms_Hd3z
  • *`// -> url: https://www.úrl/example/with/especial-character/`* Forgive me, I don't do Selenium but I take it that's what you see? If so, where - in the console? If so, which console? – g00se Aug 01 '23 at 20:57
  • This is how the ChromeDriver handler queries the web page, causing the content it returns to be wrong. – Toms_Hd3z Aug 01 '23 at 22:11
  • I believe the get() method has a built-in validator for the URL. But a URL can only contain standard ASCII characters methinks, so the whole question seems to be based on a URL that can't exist. – pcalkins Aug 01 '23 at 22:19
  • *This is how the ChromeDriver handler queries the web page, causing the content it returns to be wrong.* That doesn't tell me *where* you see that incorrect output – g00se Aug 01 '23 at 22:22
  • @g00se When my program runs, the chrome browser opens and displays the content of the page, the url does not take the "ú" character that I mentioned in the example, and replaces it with "ú". – Toms_Hd3z Aug 01 '23 at 22:28
  • @pcalkins: [https://фрезеровка.москва/](https://фрезеровка.москва/) – g00se Aug 01 '23 at 22:32
  • *When my program runs, the chrome browser opens and displays...* OK, so I understand, then, the answer to my question is "the address bar" (?) – g00se Aug 01 '23 at 22:33
  • @pcalkins I ran the code in debug mode with a breakpoint on my driver.get(url) line. Before the wrong URL was used, I manually set the URL to the one containing the character that gets altered, resumed the program, and the site loaded successfully with its content, so I doubt that URLs must contain only ASCII characters. – Toms_Hd3z Aug 01 '23 at 22:34
  • @g00se Exactly, the bar with the url... – Toms_Hd3z Aug 01 '23 at 22:36
  • Sorry couldn't help. I thought it just *might* be in the console – g00se Aug 01 '23 at 23:48
  • My thoughts on that is not so much special character as "illegal character" ! in the url. If the character belongs to a specific language script it may be possible to set locale on it and it's official iso charsets but that would be some work to collude the two together into a program coherently. However outside of a working national language script, url characters, and Latin 1 I would say it is illegal! – Samuel Marchant Aug 02 '23 at 01:49

2 Answers

Selenium v3.141.59 is quite old now.

However, using WebDriver v4.10.0, I find the special characters are handled just as they are when accessing the URL manually.

URL

https://www.úrl/example/with/especial-character/

Manual browser snapshot:

manual

Code block:

ChromeOptions options = new ChromeOptions();
options.addArguments("--start-maximized");
WebDriver driver = new ChromeDriver(options);
driver.get("https://www.úrl/example/with/especial-character/");

Browser snapshot:

SeleniumDriven

undetected Selenium

I managed to solve this problem. ChromeDriver was not changing my character encoding internally. The URL I am using comes from an enum class, and the IDE I am using (IntelliJ IDEA) was not encoding my files in UTF-8, so the characters of the string were already replaced by the time the value was read from the class. The fix was as simple as:

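     // getBytes() with no argument uses the platform default charset.
     byte[] asciiURL = url.getBytes();
     // Reinterpret those bytes as UTF-8.
     String utf8URL = new String(asciiURL, StandardCharsets.UTF_8);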

And that was enough.
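For anyone who wants to see the effect in isolation, here is a small sketch of the round trip, under my own assumptions: the class and method names are made up for illustration, and ISO-8859-1 stands in for whatever charset the file was wrongly read with (on Windows it may well be windows-1252 instead). Unlike the snippet above, it names the charset explicitly rather than relying on the platform default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeSketch {

    // Simulates the bug: UTF-8 bytes are decoded with the wrong charset.
    static String misread(String text, Charset wrong) {
        return new String(text.getBytes(StandardCharsets.UTF_8), wrong);
    }

    // Reverses it: recover the raw bytes with the same wrong charset,
    // then decode them as UTF-8.
    static String repair(String garbled, Charset wrong) {
        return new String(garbled.getBytes(wrong), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Charset wrong = StandardCharsets.ISO_8859_1; // assumed, see note above
        String original = "\u00FArl";                // "úrl"
        String garbled = misread(original, wrong);   // "\u00C3\u00BArl", shown as "Ãºrl"
        String fixed = repair(garbled, wrong);
        System.out.println(fixed.equals(original));  // true
    }
}
```

The repair only works when the same charset is used for both steps and that charset can represent every byte, so setting the file encoding to UTF-8 in the IDE is probably the more robust fix.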

Toms_Hd3z