Check whether response.status_code == 429 and, if it is, look in the response for a value telling you how long to wait (typically the Retry-After header), then wait that many seconds.
I reproduced the issue here.
Neither the content nor the headers told me how long to wait: the only candidate, the Retry-After header, was 0.
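For servers that do send a usable value, the check described above might look like this minimal sketch. The helper name `retry_after_seconds` and its `now` parameter are illustrative assumptions, not part of requests. It reads the standard `Retry-After` header, which may hold either a number of seconds or an HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(headers, now=None):
    """Return the wait in seconds requested by Retry-After, or None.

    `headers` is any mapping of header names to values, such as
    response.headers from requests. Hypothetical helper for illustration.
    """
    value = headers.get("Retry-After")
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        # Delta-seconds form, e.g. "120".
        return int(value)
    try:
        # HTTP-date form, e.g. "Tue, 20 Nov 2018 21:11:55 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Not a recognisable date either, so there is nothing usable.
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0, int((when - now).total_seconds()))
```

In a retry loop this could be used as `time.sleep(retry_after_seconds(response.headers) or fallback)`, where `fallback` is a wait you choose yourself; the `or` makes both a missing header and a value of 0 fall through to your own wait.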
I suggest putting in some throttles and adjusting them until you're happy with the results.
See https://data.stackexchange.com/stackoverflow/query/952/top-500-answerers-on-the-site for an example of getting user reputations from the Stack Exchange Data Explorer.
An example script follows.
#!/usr/bin/env python
import time

import requests
from bs4 import BeautifulSoup

df = {}
df['target'] = [ ... ]  # user IDs; see https://data.stackexchange.com/stackoverflow/query/952/top-500-answerers-on-the-site

throttle = 2  # seconds to pause before every request
whoa = 450    # seconds to back off after a 429 response

with open('results.txt', 'w') as file_handler:
    file_handler.write('url\treputation\n')
    for user_id in df['target']:
        time.sleep(throttle)
        url = 'https://stackoverflow.com/users/' + str(user_id)
        print(url)
        response = requests.get(url)
        # While rate limited: log what the server sent, back off, then retry.
        while response.status_code == 429:
            print(response.content)
            print(response.headers)
            time.sleep(whoa)
            response = requests.get(url)
        html_soup = BeautifulSoup(response.text, 'html.parser')
        site_title = html_soup.find("title").contents[0]
        if "Page Not Found - Stack Overflow" in site_title:
            # Deleted or non-existent user: record a placeholder.
            reputation = "NA"
        else:
            # Reputation is shown with thousands separators; strip them.
            reputation = (html_soup.find(class_='grid--cell fs-title fc-dark')).contents[0].replace(',', "")
        print('reputation: %s' % reputation)
        file_handler.write('%s\t%s\n' % (url, reputation))
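One way to tune the back-off without hand-picking a single long pause is to start with a short wait and lengthen it after each consecutive 429. The sketch below is an assumption-laden illustration, not something Stack Overflow documents: the function name `fetch_with_backoff`, its parameters, and the default numbers are all made up for the example. The `get` and `sleep` arguments are injectable so the logic can be tried without touching the network.

```python
import time


def fetch_with_backoff(get, url, first_wait=30, factor=2, max_wait=900,
                       max_tries=6, sleep=time.sleep):
    """Call get(url) until the status is not 429, waiting longer each time.

    Returns the last response, which is still a 429 if max_tries ran out.
    Hypothetical helper; the defaults are placeholders to adjust.
    """
    wait = first_wait
    response = get(url)
    for _ in range(max_tries - 1):
        if response.status_code != 429:
            break
        sleep(wait)
        # Grow the wait for the next attempt, up to the cap.
        wait = min(wait * factor, max_wait)
        response = get(url)
    return response
```

With requests this would be called as `fetch_with_backoff(requests.get, url)` in place of the fixed `while` loop; the caller should still check `response.status_code` afterwards, because the function gives up after `max_tries` attempts.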
Example error content.
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Too Many Requests - Stack Exchange</title>
<style type="text/css">
body
{
color: #333;
font-family: 'Helvetica Neue', Arial, sans-serif;
font-size: 14px;
background: #fff url('img/bg-noise.png') repeat left top;
line-height: 1.4;
}
h1
{
font-size: 170%;
line-height: 34px;
font-weight: normal;
}
a { color: #366fb3; }
a:visited { color: #12457c; }
.wrapper {
width:960px;
margin: 100px auto;
text-align:left;
}
.msg {
float: left;
width: 700px;
padding-top: 18px;
margin-left: 18px;
}
</style>
</head>
<body>
<div class="wrapper">
<div style="float: left;">
<img src="https://cdn.sstatic.net/stackexchange/img/apple-touch-icon.png" alt="Stack Exchange" />
</div>
<div class="msg">
<h1>Too many requests</h1>
<p>This IP address (nnn.nnn.nnn.nnn) has performed an unusual high number of requests and has been temporarily rate limited. If you believe this to be in error, please contact us at <a href="mailto:team@stackexchange.com?Subject=Rate%20limiting%20of%20nnn.nnn.nnn.nnn%20(Request%20ID%3A%202158483152-SYD)">team@stackexchange.com</a>.</p>
<p>When contacting us, please include the following information in the email:</p>
<p>Method: rate limit</p>
<p>XID: 2158483152-SYD</p>
<p>IP: nnn.nnn.nnn.nnn</p>
<p>X-Forwarded-For: nnn.nnn.nnn.nnn</p>
<p>User-Agent: python-requests/2.20.1</p>
<p>Reason: Request rate.</p>
<p>Time: Tue, 20 Nov 2018 21:10:55 GMT</p>
<p>URL: stackoverflow.com/users/nnnnnnn</p>
<p>Browser Location: <span id="jslocation">(not loaded)</span></p>
</div>
</div>
<script>document.getElementById('jslocation').innerHTML = window.location.href;</script>
</body>
</html>
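Rather than printing the whole page on every 429, the "Label: value" paragraphs in the error content above can be pulled out for a shorter log line. This is a rough sketch that assumes the page keeps the simple `<p>Label: value</p>` layout shown here; the names `FIELD_RE` and `error_page_fields` are hypothetical, and a regular expression is used only because these lines are so regular.

```python
import re

# Matches simple "<p>Label: value</p>" paragraphs with no nested tags.
FIELD_RE = re.compile(r"<p>([^:<]+): ([^<]*)</p>")


def error_page_fields(html):
    """Return the 'Label: value' paragraphs of the 429 page as a dict."""
    if isinstance(html, bytes):
        # response.content is bytes; decode before matching.
        html = html.decode("utf-8", errors="replace")
    return dict(FIELD_RE.findall(html))


# A few lines copied from the example error content.
sample = (
    "<p>Method: rate limit</p>\n"
    "<p>XID: 2158483152-SYD</p>\n"
    "<p>Reason: Request rate.</p>\n"
    "<p>Time: Tue, 20 Nov 2018 21:10:55 GMT</p>\n"
)
fields = error_page_fields(sample)
print(fields["Reason"], fields["XID"])
```

Paragraphs that contain nested tags, such as the "Browser Location" line, are skipped by this pattern.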
Example error headers.
{
"Content-Length": "2054",
"Via": "1.1 varnish",
"X-Cache": "MISS",
"X-DNS-Prefetch-Control": "off",
"Accept-Ranges": "bytes",
"X-Timer": "S1542748255.394076,VS0,VE0",
"Server": "Varnish",
"Retry-After": "0",
"Connection": "close",
"X-Served-By": "cache-syd18924-SYD",
"X-Cache-Hits": "0",
"Date": "Tue, 20 Nov 2018 21:10:55 GMT",
"Content-Type": "text/html"
}