The task is kind of simple, yet complicated.

I'm trying to:

1. get the captcha image from a specific web page;
2. load it into a picture box inside a form;
3. have the user type the characters, which are then sent back to the corresponding input on that page.

The site is: http://www.receita.fazenda.gov.br/PessoaJuridica/CNPJ/cnpjreva/Cnpjreva_Solicitacao2.asp

The captcha image is downloaded from a static address; there is no unique id in the URL, nor any "src" attribute inside its DOM element.

I have no problem downloading the image and showing it in a form (even though it is a PNG).

The issue is that each time the image link is accessed it generates a new random captcha, so when the code downloads it, it gets a different image from the one on the page, and its characters do not match the ones the page expects.

I ran a manual test: I loaded the direct link of the captcha in another tab of the browser, which obviously returned a different image from the one on the page. To my surprise, typing its characters worked even though they were different. So, if a new image is generated in the same environment (in this example Internet Explorer), it will match the sequence expected by the input on the page. I think a workaround would have to involve cookies or something similar, but I don't know how...

The direct link of the Captcha Image is: http://www.receita.fazenda.gov.br/PessoaJuridica/CNPJ/cnpjreva/captcha/gerarCaptcha.asp

The relevant piece of code is:

    IMGurl = "http://www.receita.fazenda.gov.br/PessoaJuridica/CNPJ/cnpjreva/captcha/gerarCaptcha.asp" 'Direct link of the captcha image
    filepath = ThisWorkbook.Path & "\" & "captcha.png"
    fileR = URLDownloadToFile(0, IMGurl, filepath, 0, 0) 'API function that downloads the image file

    Load FrmCaptcha 'form to show the image downloaded
    FrmCaptcha.Show


    docweb.getElementsByTagName("input")(2).Value = FrmCaptcha.TextBox1.Value    'send the characters typed in the form to the input at the page.
    docweb.getElementsByTagName("input")(3).click    'clicks the button
    Unload FrmCaptcha     'unloads the form
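
A minimal, untested sketch of the cookie idea mentioned above. It assumes the captcha is tied to a session cookie and that this cookie is readable through `document.cookie` of the page loaded in Internet Explorer (an HttpOnly cookie would not be visible there, and the sketch would then not help). `DownloadWithPageCookies` is a hypothetical helper name, not part of the original macro; `docweb` is assumed to be the HTML document of the page, as in the code above.

```vba
' Hypothetical helper: downloads IMGurl while presenting the cookies
' of the page already loaded in the browser (docweb).
' Assumption: the session cookie is exposed through document.cookie.
Function DownloadWithPageCookies(ByVal docweb As Object, _
                                 ByVal IMGurl As String, _
                                 ByVal filepath As String) As Boolean
    Dim http As Object
    Dim stm As Object

    ' Request the image, sending the page's cookies if there are any
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
    http.Open "GET", IMGurl, False
    If Len(docweb.cookie) > 0 Then
        http.setRequestHeader "Cookie", docweb.cookie
    End If
    http.send
    If http.Status <> 200 Then Exit Function   ' returns False

    ' Write the binary response body to disk, overwriting any old file
    Set stm = CreateObject("ADODB.Stream")
    stm.Type = 1                    ' adTypeBinary
    stm.Open
    stm.Write http.responseBody
    stm.SaveToFile filepath, 2      ' adSaveCreateOverWrite
    stm.Close

    DownloadWithPageCookies = True
End Function
```

It would replace the `URLDownloadToFile` line, for example `If DownloadWithPageCookies(docweb, IMGurl, filepath) Then FrmCaptcha.Show`. The image saved this way would still differ from the one displayed on the page, but if the manual test above generalises, it should belong to the same session and so be accepted.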

Hope someone can help!

Musicodelic
  • Let me guess, you have an Excel column of CNPJ numbers and you're trying to screen scrape data out of this website. That's exactly what the captcha is designed to prevent, and it's doing a good job. – Paul Abbott Jan 04 '17 at 19:39
  • Human interaction is preserved: the characters will still have to be typed. The captcha will still do its job, which is to prevent bots, even if I succeed. So I do not understand your comment, which isn't useful or helpful in any way at all. I will only automate the "copy" and "paste" of data... – Musicodelic Jan 04 '17 at 19:45
  • See [this question](http://stackoverflow.com/q/36636255/4088852). – Comintern Jan 04 '17 at 23:20
  • Thanks @comintern. Before posting I had stumbled on that post, but since I'm not familiar with Python, I skipped it and still hope there is a VBA solution or workaround. There is a lot more code than this already working in the macro, and the effort to port it entirely to another language would be a pain in the a**... but thx!!! – Musicodelic Jan 04 '17 at 23:28
  • Use the same method, not Python - take a screenshot, then crop it. – Comintern Jan 04 '17 at 23:29

0 Answers