I am developing an application for my school website and I'm using jsoup for parsing the html.

I'm having a problem with the captcha image. I found this question and implemented its solution, but I am not getting the same image as the one shown on the website.

How can I get the same captcha image? The website uses BotDetect CAPTCHA, and I am a little confused about how to do this for my site specifically.

School Website


Community
Jonathan Axel
  • You probably need cookies. – SLaks Dec 27 '15 at 20:27
  • I am using this piece of code and I get cookies, but how does that help me? `Connection.Response response = Jsoup.connect(url) .timeout(300000) .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0") .method(Connection.Method.GET).execute(); cookies = response.cookies();` – Jonathan Axel Dec 28 '15 at 04:20
  • @JonathanAxel I think it's the philosophy of captchas to change every time you load it `:)` – frogatto Dec 28 '15 at 10:19

2 Answers

As stated in SLaks's comment, you may be missing some cookies.

Here is a working example with the provided url:

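// Required imports
import java.io.ByteArrayInputStream;
import javax.imageio.ImageIO;
import javax.swing.ImageIcon;
import javax.swing.JOptionPane;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Load the initial page to get the required cookies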
Connection conn = Jsoup.connect("https://www.saes.upiicsa.ipn.mx/");
Document d = conn.get();

Element captcha = d.select("#c_default_ctl00_leftcolumn_loginuser_logincaptcha_CaptchaImage").first();
if (captcha == null) {
    throw new RuntimeException("Unable to find captcha...");
}

// Fetch the captcha image
Connection.Response response = Jsoup //
        .connect(captcha.absUrl("src")) // Extract image absolute URL
        .cookies(conn.response().cookies()) // Grab cookies
        .ignoreContentType(true) // Needed for fetching image
        .execute();

// Load image from Jsoup response
ImageIcon image = new ImageIcon(ImageIO.read(new ByteArrayInputStream(response.bodyAsBytes())));

// Show image
JOptionPane.showMessageDialog(null, image, "Captcha image", JOptionPane.PLAIN_MESSAGE);

OUTPUT

[screenshot of the resulting dialog]

Tested on jsoup 1.8.3.
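One detail worth guarding against in the example above: `ImageIO.read` returns `null` when the bytes are not a recognisable image, which can happen if the server answers with an HTML error or login page instead of the captcha. The following is a minimal sketch of a defensive decoder; the class `CaptchaImages` and its method `decode` are hypothetical names introduced here for illustration and are not part of jsoup or of the code above.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

/** Hypothetical helper for illustration; not part of jsoup. */
final class CaptchaImages {

    private CaptchaImages() {
    }

    /**
     * Decodes raw response bytes into an image.
     * ImageIO.read returns null when no registered reader recognises the data,
     * for example when the body is an HTML page rather than an image.
     */
    static BufferedImage decode(byte[] body) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(body));
            if (image == null) {
                throw new IllegalArgumentException(
                        "Response body is not a decodable image (" + body.length + " bytes)");
            }
            return image;
        } catch (IOException e) {
            // Wrap the checked exception so callers are not forced to handle it
            throw new UncheckedIOException(e);
        }
    }
}
```

With this in place, the image could be built with `new ImageIcon(CaptchaImages.decode(response.bodyAsBytes()))`, so a non-image response fails with a clear message rather than a `NullPointerException`.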

Stephan
You said that you don't get the same image that you see on the website. That's normal: every time you refresh the page, a different image is generated.
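Because a fresh captcha is generated on each load, the page request, the image request, and any later form submission presumably need to share one session. jsoup exposes cookies as a `Map<String, String>` (`Connection.Response.cookies()`) and accepts them through `Connection.cookies(Map)`. Here is a minimal sketch of a small cookie jar that accumulates cookies across responses; the class name `SessionCookies` is hypothetical, and the assumption that the captcha is bound to session cookies is not confirmed for this particular site.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical cookie jar for illustration; not part of jsoup. */
final class SessionCookies {

    private final Map<String, String> cookies = new LinkedHashMap<>();

    /** Merges cookies from a response; newer values replace older ones. */
    void update(Map<String, String> fromResponse) {
        cookies.putAll(fromResponse);
    }

    /** Returns a copy that can be passed to the next request. */
    Map<String, String> asMap() {
        return new LinkedHashMap<>(cookies);
    }
}
```

A possible usage: call `jar.update(response.cookies())` after every jsoup response and pass `jar.asMap()` to `.cookies(...)` on the next connection, so the image you download belongs to the same session as the form you later submit.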

Marcello Davi
  • Yeah, but when I do the GET request on Android, I get an image with 4 elements in it, whereas on the website it is a 3-element image every time you refresh, and I don't know why. – Jonathan Axel Dec 28 '15 at 04:19