
I'm trying to scrape all news items from this website, but they don't appear in the page source: http://finansdanmark.dk/nyheder/

I've tried using Firefox's Live HTTP Headers and Chrome's developer tools, but I still can't figure out what goes on behind the scenes.

This is my code so far:

import requests

r = requests.post("http://finansdanmark.dk/nyheder/proxy.gba")
text = r.text
print(text)

Can anyone help?

bib
  • The POST request requires a body. Check it out in Chrome's developer tools, Network tab. – Curro May 09 '17 at 16:19
  • Thanks Curro! I've taken another look at Chrome's network tab and tried this (among other things):
        url = 'http://finansdanmark.dk/nyheder/'
        headers = {
            'Content-type': 'application/json',
            'Accept': 'text/javascript',
            'Connection': 'keep-alive',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
        }
        r = requests.get(url, headers=headers)
        text = r.text
        print(text)
    It still doesn't work. When you say 'body' - what does that mean? – bib May 09 '17 at 17:24
  • Sorry, I don't know how to write the code so that it is readable on this site. – bib May 09 '17 at 17:31
  • Problem solved :-) It turned out I wasn't passing the 'payload' in my POST request. The payload is shown at the bottom of Chrome's 'Network' -> 'Headers' view. I also had to wrap it in json.dumps in my request: `r = requests.post(url, data=json.dumps(payload))`. Inspiration: http://stackoverflow.com/questions/15694120/why-does-http-post-request-body-need-to-be-json-enconded-in-python – bib May 11 '17 at 08:31
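
For anyone landing here later, the fix described in the last comment could be sketched roughly as follows. This is a minimal sketch, not the asker's exact code: the `payload` dict is a hypothetical placeholder (the real one has to be copied from the request payload shown in Chrome's Network tab), the helper names `build_request` and `fetch_news` are made up for illustration, and both the `proxy.gba` endpoint taken from the question and the `Content-Type` header are assumptions about what the server expects.

```python
import json

import requests

# Endpoint taken from the question; assumed to be the one the page calls.
URL = "http://finansdanmark.dk/nyheder/proxy.gba"

# Hypothetical placeholder. Replace it with the real request payload
# copied from Chrome's developer tools (Network -> Headers).
payload = {"page": 1}


def build_request(url, payload):
    """Prepare a POST request whose body is the JSON-encoded payload."""
    body = json.dumps(payload)  # the POST body must be a JSON string
    request = requests.Request(
        "POST",
        url,
        data=body,
        # Assumption: the server expects a JSON content type.
        headers={"Content-Type": "application/json"},
    )
    return request.prepare()


def fetch_news(url, payload):
    """Send the prepared request and return the response text."""
    with requests.Session() as session:
        response = session.send(build_request(url, payload), timeout=10)
    response.raise_for_status()
    return response.text


# Building the request does not touch the network, so the body can be
# inspected before anything is sent.
prepared = build_request(URL, payload)
```

Calling `fetch_news(URL, payload)` would then send the request. Inspecting `prepared.body` first is a cheap way to compare what the script sends against what the browser sends.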
