
I'm trying to write code that sends an HTTP GET request to a site and saves the response content in an HTML file, using Scapy.

So if this is my code:

r = Ether() / IP(dst=cache_ip) / TCP() / "GET /index.html HTTP/1.0 \n\n"
b = str(r)
http.write(b)
print "html page is ready, check your html files folder to check"
http.close()

and cache_ip is the IP of Google, then the output should be the HTML page of Google saved in an HTML file on my computer. Instead I get this:

ÿÿÿÿÿÿ<¨*â-LEC@ Þ QÚûPP PÛGET /index.html HTTP/1.0 

Now, I know there are much better modules for this job, like requests, but my task is to do it with Scapy only. If anyone can point out the problem and offer better code for the job written using Scapy, that would be great.

Edit: I have tried using the TCP handshake, but now I'm trying to write the content into my HTML file and for some reason I cannot do it:

http = open(os.path.dirname(
    os.path.abspath(inspect.getfile(inspect.currentframe()))) + "/html_saved_pages/" + url + ".html", "w")

update_history(url)

syn = IP(dst=cache_ip) / TCP(dport=80, flags='S')
syn_ack = sr1(syn)
syn_ack
getStr = 'GET / HTTP/1.1\r\nHost:'+cache_ip+'\r\n\r\n'
request = IP(dst=cache_ip) / TCP(dport=80, sport=syn_ack[TCP].dport,
seq=syn_ack[TCP].ack, ack=syn_ack[TCP].seq + 1, flags='A') / getStr
reply = sr(request)
http.write(reply)
print "html page is ready, check your html files folder to check"

http.close()
The baron
  • Compressed? Check the content-type header. – tripleee Jun 26 '16 at 15:05
  • https://en.wikipedia.org/wiki/HTTP_compression – tripleee Jun 26 '16 at 15:28
  • Well, I don't know how to extract the content and I didn't find any solution; that's why I came here – The baron Jun 26 '16 at 17:02
  • @Thebaron: Consider what the return type of sr() is in Scapy. I can tell you what it is not: it is not HTML in a string that you can simply write to a file. You would also have to make sure that the resulting HTML from Google isn't spread across multiple packets. Use Wireshark to confirm. If you find that the HTML page is spread across multiple packets, perhaps ask a new question about how to assemble them together? – wookie919 Jun 27 '16 at 21:10

1 Answer


In your given code, it isn't clear what http is, and as a result, there is no way of knowing exactly what http.write() and http.close() would do.

But in any case, if your aim is to retrieve HTML from the web using ONLY scapy, you must start with the TCP 3-way handshake, followed by the HTTP GET request:

How to create HTTP GET request Scapy?
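
For illustration, here is a rough sketch of how the handshake, the GET request, and the extraction of the HTML could fit together. It is not a definitive implementation. It is written for Python 3 and a reasonably recent Scapy (it uses AsyncSniffer), whereas the question uses Python 2. The names build_get_request, reassemble, split_response, fetch, and save_page are hypothetical helpers made up for this example. It assumes ip is a dotted IPv4 address string, that the script runs with root privileges, and that the local kernel is prevented from resetting the connection: the kernel knows nothing about a handshake done by Scapy and will typically answer the server's SYN-ACK with a RST, so a firewall rule dropping outgoing RSTs is usually needed.

```python
import random
import time


def build_get_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.0 GET request."""
    # HTTP/1.0 makes the server close after one response and rules out
    # chunked transfer encoding, which keeps the body easy to extract.
    return "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, host).encode("ascii")


def reassemble(segments):
    """Join (seq, payload) pairs in sequence order, ignoring retransmissions.

    Simplification: sequence wraparound and partial overlaps are not handled.
    """
    by_seq = {}
    for seq, payload in segments:
        if payload:
            by_seq.setdefault(seq, payload)  # keep the first copy of each segment
    return b"".join(by_seq[seq] for seq in sorted(by_seq))


def split_response(raw):
    """Split a raw HTTP response into (header_bytes, body_bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head, body


def fetch(ip, host, path="/", wait=5):
    """Hypothetical helper: handshake, GET, and capture of the reply segments."""
    # Imported here so the pure helpers above work without Scapy installed.
    from scapy.all import IP, TCP, Raw, AsyncSniffer, send, sr1

    sport = random.randint(1024, 65535)
    syn = IP(dst=ip) / TCP(sport=sport, dport=80, flags="S",
                           seq=random.randint(0, 2**32 - 1))
    syn_ack = sr1(syn, timeout=wait, verbose=False)
    # 0x12 is SYN (0x02) plus ACK (0x10)
    if (syn_ack is None or not syn_ack.haslayer(TCP)
            or (int(syn_ack[TCP].flags) & 0x12) != 0x12):
        raise RuntimeError("no SYN-ACK received")

    seq = syn_ack[TCP].ack
    ack = (syn_ack[TCP].seq + 1) % 2**32

    # Start capturing before the request goes out, so no segment is missed.
    sniffer = AsyncSniffer(
        lfilter=lambda p: p.haslayer(IP) and p.haslayer(TCP)
        and p[IP].src == ip and p[TCP].dport == sport,
        timeout=wait,
    )
    sniffer.start()
    time.sleep(0.5)  # give the sniffer a moment to come up

    # Final ACK of the handshake, then the request itself (PSH+ACK).
    send(IP(dst=ip) / TCP(sport=sport, dport=80, flags="A", seq=seq, ack=ack),
         verbose=False)
    send(IP(dst=ip) / TCP(sport=sport, dport=80, flags="PA", seq=seq, ack=ack)
         / Raw(load=build_get_request(host, path)), verbose=False)

    sniffer.join()  # returns once the sniff timeout expires
    segments = [(p[TCP].seq, p[Raw].load)
                for p in sniffer.results if p.haslayer(Raw)]
    return reassemble(segments)


def save_page(ip, host, out_path):
    """Fetch a page and write only the body (not the headers) to out_path."""
    _headers, body = split_response(fetch(ip, host))
    with open(out_path, "wb") as f:  # binary mode: the payload is bytes
        f.write(body)
```

Calling save_page(cache_ip, "www.google.com", "page.html") would then write whatever body was captured. Note that this sketch never ACKs the data it receives, so the server will only send its initial window of segments and then retransmit; a large page may come back truncated. The question's first snippet shows the other failure mode: str(r) is the raw bytes of the outgoing packet (Ethernet, IP and TCP headers plus the GET line), which is why the file contained binary garbage followed by the request text rather than any HTML.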

wookie919