I have looked at the other similar posts but nothing obvious is jumping out. I am sure someone will point me in the right direction if I missed it!

The issue is that this code in my app used to work but no longer does, so I assume something has changed on the website. I use exactly the same code for three other websites in the same app and they work fine. Logcat shows the following error:

org.jsoup.HttpStatusException: HTTP error fetching URL. Status=403, URL=http://notamweb.aviation-civile.gouv.fr/Script/IHM/Bul_Aerodrome.php

I made this simple webpage, which I can launch from a local drive, and it works (if you try it yourself, you need to adjust the date and time to the current UTC time):

<form method="post" action="http://notamweb.aviation-civile.gouv.fr/Script/IHM/Bul_Aerodrome.php">

 Enter aerodrome ID(s)

 <input type="text" name="AERO_Tab_Aero[0]"> 

        <input type="hidden" name="AERO_Date_DATE" value="2016/01/25">
        <input type="hidden" name="AERO_Date_HEURE" value="07:12">

        <input type="hidden" name="bResultat" value="true">
        <input type="hidden" name="ModeAffichage" value="COMPLET">

        <input type="hidden" name="AERO_Duree" value="96">
        <input type="hidden" name="AERO_CM_REGLE" value="1">
        <input type="hidden" name="AERO_CM_GPS" value="2">
        <input type="hidden" name="AERO_CM_INFO_COMP" value="1"> 
     <p>
        <input type="Submit" value="Get the bulletins">
     </p>

</form>

This code returns the error:

doc = Jsoup.connect("http://notamweb.aviation-civile.gouv.fr/Script/IHM/Bul_Aerodrome.php")
                    .data("bResultat", "true").data("ModeAffichage", "COMPLET")
                    .data("AERO_Date_DATE", date).data("AERO_Date_HEURE", time).data("AERO_Duree", "96").data("AERO_CM_REGLE", "1").data("AERO_CM_GPS", "2")
                    .data("AERO_CM_INFO_COMP", "1").data("AERO_Tab_Aero[0]", params[0].substring(0, params[0].length() - 1))
                    .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36")
                    .timeout(6000).post();

Thoughts?

EDIT #1: The headers that I see when I use my mini webpage are:

REQUEST HEADERS

Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8

Accept-Encoding:gzip, deflate

Accept-Language:en-US,en;q=0.8,en-AU;q=0.6

Cache-Control:max-age=0

Connection:keep-alive

Content-Length:180

Content-Type:application/x-www-form-urlencoded

Host:notamweb.aviation-civile.gouv.fr

Origin:null

Upgrade-Insecure-Requests:1

User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36

FORM DATA

AERO_Tab_Aero[0]:KLAX

AERO_Date_DATE:2016/01/25

AERO_Date_HEURE:11:21

bResultat:true

ModeAffichage:COMPLET

AERO_Duree:96

AERO_CM_REGLE:1

AERO_CM_GPS:2

AERO_CM_INFO_COMP:1
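
One way to compare what the app sends with what the browser sent is to rebuild the `application/x-www-form-urlencoded` body from the FORM DATA values above and check its size against the `Content-Length` header. The following is a minimal sketch using only the Java standard library; the class name `FormBody` and its methods are hypothetical helpers, not part of Jsoup, and the field order is assumed to follow the order of the inputs in the form.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: rebuilds the POST body the browser sent.
final class FormBody {

    /** Serializes the fields as application/x-www-form-urlencoded, in insertion order. */
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available, so this should not happen.
            throw new IllegalStateException(ex);
        }
        return body.toString();
    }

    /** The values listed under FORM DATA, in the order the form declares them. */
    static Map<String, String> browserFields() {
        Map<String, String> f = new LinkedHashMap<>();
        f.put("AERO_Tab_Aero[0]", "KLAX");
        f.put("AERO_Date_DATE", "2016/01/25");
        f.put("AERO_Date_HEURE", "11:21");
        f.put("bResultat", "true");
        f.put("ModeAffichage", "COMPLET");
        f.put("AERO_Duree", "96");
        f.put("AERO_CM_REGLE", "1");
        f.put("AERO_CM_GPS", "2");
        f.put("AERO_CM_INFO_COMP", "1");
        return f;
    }

    public static void main(String[] args) {
        String body = encode(browserFields());
        System.out.println(body);
        // The body is pure ASCII, so its character count equals its byte count.
        System.out.println(body.length() + " bytes");
    }
}
```

For the values shown, the encoded body comes to 180 bytes, which matches the `Content-Length:180` header. Running the same helper on the values the app passes to `.data(...)` makes it easy to spot a field whose value differs from the browser's.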

Gavin
  • what type of authorization do you use? – Bhargav Jan 25 '16 at 07:42
  • http://stackoverflow.com/questions/14467459/403-error-while-getting-the-google-result-using-jsoup – Nouman Ghaffar Jan 25 '16 at 07:42
  • http://stackoverflow.com/questions/24471382/jsoup-error-403-when-trying-to-read-the-contents-of-a-directory-on-my-website – Nouman Ghaffar Jan 25 '16 at 07:43
  • Nouman your first link says to use a useragent which you will see I am already doing. The second link is about accessing a directory which I am not trying to do. – Gavin Jan 25 '16 at 08:45
  • Bhargav. I am using no authentication as the webpage does not require it. You can see from my simple HTML document that I just pass a bunch of inputs and I get back a webpage? – Gavin Jan 25 '16 at 08:46
  • Can you provide the values for the variables `date`, `time` and the array `params`? – Stephan Jan 25 '16 at 09:49
  • A way to fix this would be to load the page in your desktop browser, and look at the network tab of the developer tools to see what exactly it's sending, especially the cookies and headers. My guess is that you need to send other / additional cookies, or maybe a Referer header, as the website may be checking for this, and then send same or similar headers / cookies with your request. – Jonas Czech Jan 25 '16 at 10:11
  • Stephan - params you could use KLAX the date and time are the current UTC time in the format yyyy/mm/dd and hh:mm – Gavin Jan 25 '16 at 11:29
  • @JonasCz - The only cookie I see is a PHPSESSIONID which is returned with the webpage. For the outgoing headers I see the edit above. – Gavin Jan 25 '16 at 11:36
  • Hmm, don't see anything wrong - will have a try when I get back to my computer later today, and see if I can get it going. – Jonas Czech Jan 25 '16 at 11:55
  • @JonasCz - SOLVED! You got me thinking more about the information passed to the webpage. I noticed the only thing that wasn't hard-coded was the date and time. Then I noticed the time in the emulator was wrong! So I hard-coded it temporarily and voila (as the French say!) it is working. Thanks so much for the ideas and offer to look into it further. Cheers! – Gavin Jan 25 '16 at 11:58
  • That's great you solved it :-) – Jonas Czech Jan 25 '16 at 13:16
  • @Gavin can you post as an answer your working code? – Stephan Jan 25 '16 at 13:32
  • @Stephan my original code works as I first posted. The problem was the emulator clock is wrong so the time passed to the webpage was wrong and thus the webpage was rejecting it. – Gavin Jan 25 '16 at 23:12

2 Answers

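The problem is solved. The emulator clock was wrong, so the date and time sent to the webpage were wrong and the site rejected the request with a 403. The code as originally posted works once the clock is correct.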
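
To rule the clock in or out quickly, it can help to build the two time fields from an explicit `Date` so that a known-good value can be substituted for the device time. This is only a sketch: the class `UtcFields` is a hypothetical helper, and the patterns `yyyy/MM/dd` and `HH:mm` are assumptions inferred from the form values `2016/01/25` and `11:21` (note that in `SimpleDateFormat`, lowercase `mm` means minutes and lowercase `hh` means 12-hour time).

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper: formats the AERO_Date_DATE and AERO_Date_HEURE values in UTC.
final class UtcFields {

    /** Value for AERO_Date_DATE, e.g. 2016/01/25. */
    static String date(Date now) {
        return format("yyyy/MM/dd", now);
    }

    /** Value for AERO_Date_HEURE, e.g. 11:21 (24-hour clock). */
    static String time(Date now) {
        return format("HH:mm", now);
    }

    private static String format(String pattern, Date now) {
        // Locale.US keeps the digits ASCII; the explicit zone avoids the device's local time zone.
        SimpleDateFormat f = new SimpleDateFormat(pattern, Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f.format(now);
    }

    public static void main(String[] args) {
        // Uses the device clock; if that clock is wrong, these values are wrong too.
        Date now = new Date();
        System.out.println(date(now) + " " + time(now));
    }
}
```

Logging these two values next to the real UTC time is enough to show whether the emulator clock is the culprit. The formatting only controls the time zone and layout; it cannot correct a device clock that is set wrongly.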

Gavin

A helpful idea from JonasCz:

A way to fix this would be to load the page in your desktop browser, and look at the network tab of the developer tools to see what exactly it's sending, especially the cookies and headers. My guess is that you need to send other / additional cookies, or maybe a Referer header, as the website may be checking for this, and then send same or similar headers / cookies with your request.
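
As a rough illustration of that suggestion, the sketch below prepares a POST request that carries the same kind of headers a browser sends. It deliberately uses the standard library's `HttpURLConnection` instead of Jsoup so that it has no dependencies; with Jsoup, the equivalent calls would be `.header("Referer", ...)` or `.referrer(...)` and `.cookie(name, value)` on the connection. The class `BrowserLikeRequest`, the Referer value and the cookie string are all hypothetical placeholders, and in this thread the headers turned out not to be the cause.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper: configures a request with browser-like headers without sending it.
final class BrowserLikeRequest {

    static HttpURLConnection prepare(String url, String referer, String userAgent, String cookie) {
        try {
            // openConnection() only creates the object; nothing is sent until connect() or I/O.
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            conn.setRequestProperty("User-Agent", userAgent);
            conn.setRequestProperty("Referer", referer);
            if (cookie != null) {
                // Copy the value from the browser's network tab, e.g. a session cookie.
                conn.setRequestProperty("Cookie", cookie);
            }
            return conn;
        } catch (IOException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = prepare(
                "http://notamweb.aviation-civile.gouv.fr/Script/IHM/Bul_Aerodrome.php",
                "http://notamweb.aviation-civile.gouv.fr/", // assumed Referer, for illustration only
                "Mozilla/5.0",
                null);
        System.out.println(conn.getRequestMethod() + " " + conn.getURL());
        System.out.println("Referer: " + conn.getRequestProperty("Referer"));
    }
}
```

The form body would then be written to `conn.getOutputStream()` before reading the response.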

Community
Stephan