
A GET request returns the following output (I checked the response with Chrome DevTools):

HTML output

<style>
.xdebug-error {
  display:none;
}
</style>
<link rel="stylesheet" href="assets/css/xyz.css">

some numbers

<script>
$(function(){
    $('#a_gen div').html(
        '<a href="https://example.com/uid" target="_blank">click to make an account</a><div id="gen_warn">Do Not Leave This Page Until Account Is Made!</div>'
    );
});
</script>

Output via response.content

When I print response.content to the console or write it to a file, I get something like this:

b'\x833\x01\x00\xe4R\xa7sh\xd8\x15P\x80\x0c\n \x95\xa6\x9a\xde\xd0\xe8\xa4\x9aZ\xb6\xdc\x81\xdb\x07\xad2I\xbb5\x7f\n\xb38\xb0\xb4\x15h[\xe0\x05\xdc\x02\x0b,s\xd7\x8f|\x95:\xd7\x90<6\xb7\xb7=?\xf8\xa60\xa8~\x19\xa0\x85V\x05<\x8f{\xbft\xc4n\xe1D1\xb6\xd9\x1e\x98\xfd\x94\xea\xfb\x10\xf8\x82\xee\xfb\x02\x05\xf5\xee\x07\x9eZ\xd5}?5\x88\xcaR[\x94Zb]j\xb4\xebb[ \r\xb9NH\xb4\xe7\x07\xc3O\x07\x89h\xc6\xcd\r~\x13\xd1H&\x9fK{_\\x\xb5\x80!\xc3\xf9\xc8\x15t\x11\x04\xf3\xb9\x07\x04\xf8\x1dA`!\xa5\xa1\xd2\xbd\x0c\xfe\xf5p\xa8\xfa\xf9\xf9Y\xa5\x0e\xbb\x83\xe1\xb0F\x96\xe9T(\xb7\x1c&X\\Xp5\x9c\xef\xa8\xdf&\xf5z\xb3\xd6nf=\x10\xe4*\xfb\x88\xa5\x98\x8c\xb1\xbc\xc0\xf2\x027\x82\xe5E]\x82\xe5\x85\xceY\xe0\xb0\x100Y\xa8a`?<\xacg\x98\xcc\xae\x07g\xba~x!\x97\xb7\xc6\x0fY\xeac\xf1\x85}\x9e\xc6X\xae\x93Y\xc7n\x98\xdc\xd3a\xcc\x061\x99\x99*\xf5!\x0b\xc3\xe9\x97\x1c\x99a2\xbb\xca\xbf}\xf0\x01\x8d5\xfe\x01h\xe7h\x9e\xea}\xa3\xa9\xea\x19\xc2i}@\x154\xa5\x8e~G6x\x80\xb2\x8dt\xee\x80\xbey\xe8!K\x98\xa4\xb2Y:\x7f\x83\x16\xb0\xd7O\xd5c\xa9\xc1\x8c\xa3\x03\x0f\xd0\x0e\xd4\x0f\xf8,\xa0uR-@\x0f,p(\xe2>\x85\xd6>\xda\xab\x06$s\x85n"\xfa_\xe8&\xa2=\xc1\xd7=\xe7=\x18\x18\x03'

Output via response.text

With response.text I got this (as depicted in the image):

[image: response as text]

Original Code

All variables are already defined:

s = requests.Session()
r = s.get(url,headers = headers)
print(r.text)
            
if (r.status_code == 200):
    print("Generated Successfully")
    with open("Alt.txt", 'a') as f:
        f.write(str(r.text) + '\n')
else:
    print("BAD Request " + str(r.status_code))
    s.cookies.clear()

How can the plain-text response be written to a text file or to the console?

hc_dev

1 Answer


When evaluating the response to an arbitrary GET request, you should always inspect response.headers first.

The Content-Type header tells you the MIME type of the response (for example text/html or application/json) and, usually, its character encoding (for example UTF-8).

In your case, response.headers['Content-Type'] would probably be "text/html; charset=UTF-8".
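As a rough sketch of how such a header value breaks down, the standard library can split it into MIME type and charset. The helper name parse_content_type is hypothetical, and the header string is the assumed value from above, not one captured from the real response.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type header value into (MIME type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


# Assumed header value; with requests it would come from
# response.headers["Content-Type"].
mime_type, charset = parse_content_type("text/html; charset=UTF-8")
print(mime_type, charset)
```

If the header carries no charset parameter, the second value is None, which is a hint that you have to pick the encoding yourself.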

So you know that you need to decode the response as UTF-8, for example with r.content.decode('utf-8'), as Parvat. R commented.

Here we can

  • either use response.text, which requests decodes for us based on response.encoding
  • or use response.content to get the raw bytes (e.g. b'\x833\x01') and decode them ourselves
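The second option can be sketched without any network access. The helper decode_body below is a hypothetical name, and the byte string is a stand-in for response.content; falling back to UTF-8 and replacing undecodable bytes are assumptions made for this example, not something requests prescribes.

```python
def decode_body(content, encoding=None, fallback="utf-8"):
    """Decode raw response bytes, using the fallback when no encoding is given."""
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising
    return content.decode(encoding or fallback, errors="replace")


raw = "Grüße".encode("utf-8")  # stand-in for response.content
print(decode_body(raw, "utf-8"))  # encoding declared by the server
print(decode_body(raw))           # no encoding declared, fallback is used
```

With a real response this would be called as decode_body(r.content, r.encoding). If the result is full of replacement characters, the bytes are probably not text in that encoding at all.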

Since you say the response was text/html (as seen in the browser), you can decode the raw bytes yourself and append the resulting text to the file:

import requests

s = requests.Session()
r = s.get(url, headers=headers)
print(r.text)

if r.status_code == 200:
    print("Generated Successfully")

    # r.text is already a str and has no decode(); decode the raw bytes instead
    encoding = r.encoding or "utf-8"  # fall back to UTF-8 if no charset was declared
    print("Response encoding", encoding)
    body_text = r.content.decode(encoding)
    with open("Alt.txt", "a", encoding="utf-8") as f:
        f.write(body_text + "\n")  # append the body as text to the file
else:
    print("BAD Request " + str(r.status_code))
    s.cookies.clear()

See also: python requests.get() returns improperly decoded text instead of UTF-8?

hc_dev