On many different older pages, I have seen captchas like this, i.e. static JPG/PNG images:

[Three example captcha images]

All of these have a grey-to-white gradient and wobbly lines in the background, and the captcha text consists of distorted sans-serif characters (only a-z and 0-9), so I assume they all come from the same library. The pages probably use that library's default settings, which would explain the similar appearance across different sites.

The bottom two images come from this dataset, part of the paper A Comparison of Supervised Learning Algorithms to Solve CAPTCHAs. Unfortunately, the paper does not mention which captcha generation algorithm was used. Notably, the Keras example page OCR model for reading Captchas also uses this dataset. The PWNtcha page does not list this kind of captcha either.

So, can someone tell me which library it actually is?

sigalor

1 Answer

I found it: the library is called kaptcha (Internet Archive version of the Google Code page here), which seems to be a fork of the SimpleCaptcha Java library. kaptcha was created by Jon Stevens (see here), who cloned the Google Code repo to GitHub here. GitHub user penggle has also cloned that repository and made it available on Maven.

To try it out without setting up a full Maven project, e.g. to generate a lot of training data for a machine learning model, do the following:

  1. Download kaptcha-2.3.2.zip from kaptcha's downloads page.
  2. From that ZIP file, extract kaptcha-2.3.2.jar and put it into a new directory.
  3. In the directory of the JAR file, create a new file called Main.java and put the following code into it. As expected, all these pages just use the default settings from the file src/java/com/google/code/kaptcha/util/Config.java in the ZIP. You can of course replace the text produced by DefaultTextCreator (which by default only draws from the characters abcde2345678gfynmnpwx) with any string you want.
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Properties;
import javax.imageio.ImageIO;

import com.google.code.kaptcha.util.Config;
import com.google.code.kaptcha.impl.DefaultKaptcha;
import com.google.code.kaptcha.text.impl.DefaultTextCreator;

public class Main {
    public static void main(String[] args) throws IOException {
        // An empty Properties object makes Config fall back to kaptcha's defaults.
        Properties properties = new Properties();
        Config config = new Config(properties);

        DefaultKaptcha kaptcha = new DefaultKaptcha();
        kaptcha.setConfig(config);
        DefaultTextCreator textCreator = new DefaultTextCreator();
        textCreator.setConfig(config);

        // Render a random text; pass any other string to createImage to use your own.
        BufferedImage image = kaptcha.createImage(textCreator.getText());

        File output = new File("out.jpg");
        ImageIO.write(image, "jpg", output);
    }
}
  4. Run this Java code using the following two shell commands (on Windows, replace the : in the second command's classpath with ;):
javac -cp kaptcha-2.3.2.jar Main.java
java -cp kaptcha-2.3.2.jar:. Main
  5. The resulting JPG will look like this:

example output

  6. To change the configuration, add the line import com.google.code.kaptcha.Constants; at the top of Main.java and set the desired properties before creating the Config object. For example, properties.setProperty(Constants.KAPTCHA_BORDER, "no"); turns off the border:

[Image: example output without the border]
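
For generating training data in bulk, the single-image program above can be wrapped in a loop. The following is a minimal sketch and not part of kaptcha: CaptchaBatch, randomText and generate are hypothetical names, sampling characters uniformly is an assumption, and the renderer is passed in as a function so that the class compiles without the kaptcha JAR.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Random;
import java.util.function.Function;
import javax.imageio.ImageIO;

// Hypothetical helper (not part of kaptcha): writes labelled captcha images.
final class CaptchaBatch {
    // Character set quoted above as the DefaultTextCreator default.
    static final String DEFAULT_CHARS = "abcde2345678gfynmnpwx";

    // Draws `length` characters uniformly at random from `chars` (assumed behaviour).
    static String randomText(Random rng, String chars, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(chars.charAt(rng.nextInt(chars.length())));
        }
        return sb.toString();
    }

    // Renders `count` random texts and saves each as <index>_<text>.png in `dir`.
    // `renderer` maps a text to an image, e.g. kaptcha::createImage.
    static void generate(Function<String, BufferedImage> renderer, File dir,
                         int count, int length, long seed) throws IOException {
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("Cannot create directory " + dir);
        }
        Random rng = new Random(seed);
        for (int i = 0; i < count; i++) {
            String text = randomText(rng, DEFAULT_CHARS, length);
            File out = new File(dir, String.format("%05d_%s.png", i, text));
            if (!ImageIO.write(renderer.apply(text), "png", out)) {
                throw new IOException("No PNG writer available for " + out);
            }
        }
    }
}
```

With the kaptcha JAR on the classpath, the kaptcha object from Main.java can serve as the renderer, e.g. CaptchaBatch.generate(kaptcha::createImage, new File("captchas"), 1000, 5, 42L); where the directory name, the count, the text length of 5 and the seed are arbitrary choices. Putting the label into the file name keeps each image together with its ground truth, and PNG is lossless, so no JPEG compression artifacts end up in the training set.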

sigalor