I have code that prints out data for each user from an XML file obtained from a website. The XML updates as more users interact with it throughout the day, and my code currently loops to download this data every 5 minutes.
Every time the code is run it generates a list of users and their statistics. The first 5 mins it prints users: a,b,c
second 5 mins it prints users : a,b,c,d,e
third 5 mins it prints users : a,b,c,d,e,f,g
What I need the code to do is print, first 5 mins: a,b,c; second 5 mins: d,e; third 5 mins: f,g
somehow recognising that some of the users have already been printed. Each user does have a unique user id, which I guess could be matched?
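To make the behaviour I am after concrete, here is a minimal sketch, assuming the ids that have already been printed can simply be remembered in a set between downloads. The names `seen_ids` and `new_users` and the sample ids are hypothetical and are not part of my real script; this only illustrates the idea, it is not a finished solution.

```python
# Hypothetical helper, not part of the real script: it only illustrates
# "remember which user ids were already printed".
seen_ids = set()  # lives outside the function so it survives between polls


def new_users(users, seen):
    """Return the (user_id, username) pairs whose id is not yet in `seen`,
    and record those ids so they are skipped on later polls."""
    fresh = []
    for user_id, username in users:
        if user_id in seen:
            continue  # already printed in an earlier poll
        seen.add(user_id)
        fresh.append((user_id, username))
    return fresh


# Simulated downloads with made-up ids, standing in for the real data.
first_poll = [("1", "a"), ("2", "b"), ("3", "c")]
second_poll = first_poll + [("4", "d"), ("5", "e")]
third_poll = second_poll + [("6", "f"), ("7", "g")]

first = new_users(first_poll, seen_ids)
second = new_users(second_poll, seen_ids)
third = new_users(third_poll, seen_ids)

print([name for _, name in first])   # ['a', 'b', 'c']
print([name for _, name in second])  # ['d', 'e']
print([name for _, name in third])   # ['f', 'g']
```

The set has to be created once, outside whatever function runs every 5 minutes, otherwise it would be emptied on each run and every user would be printed again.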
I enclose an example of my code, in case that helps.
import mechanize
import urllib
import json
import re
import random
import datetime
from sched import scheduler
from time import time, sleep
######Code to loop the script and set up scheduling time
s = scheduler(time, sleep)
random.seed()
def run_periodically(start, end, interval, func):
    event_time = start
    while event_time < end:
        s.enterabs(event_time, 0, func, ())
        event_time += interval + random.randrange(-5, 45)
    s.run()
###### Code to get the data required from the URL desired
def getData():
    post_url = "URL OF INTEREST"
    browser = mechanize.Browser()
    browser.set_handle_robots(False)
    browser.addheaders = [('User-agent', 'Firefox')]
    ###### These are the parameters you've got from checking with the aforementioned tools
    parameters = {'page' : '1',
                  'rp' : '250',
                  'sortname' : 'roi',
                  'sortorder' : 'desc'
                 }
    ##### Encode the parameters
    data = urllib.urlencode(parameters)
    trans_array = browser.open(post_url,data).read().decode('UTF-8')
    xmlload1 = json.loads(trans_array)
    pattern1 = re.compile('> (.*)<')
    pattern2 = re.compile('/control/profile/view/(.*)\' title=')
    pattern3 = re.compile('<span style=\'font-size:12px;\'>(.*)<\/span>')
    #########################################################################
    ##### The request sent from here all the way down including comments#####
    #########################################################################
    ##### Making the code identify each row, removing the need to numerically quantify the number of rows in the xmlfile,
    ##### thus making number of rows dynamic (change as the list grows, required for looping function to work uninterrupted)
    for row in xmlload1['rows']:
        cell = row["cell"]
        ##### defining the Keys (key is the area from which data is pulled in the XML) for use in the pattern finding/regex
        user_delimiter = cell['username']
        selection_delimiter = cell['race_horse']
        if strikeratecalc2 < 12:
            continue
        ##### REMAINDER OF THE REGEX DELIMITATIONS
        username_delimiter_results = re.findall(pattern1, user_delimiter)[0]
        userid_delimiter_results = (re.findall(pattern2, user_delimiter)[0])
        user_selection = re.findall(pattern3, selection_delimiter)[0]
        ##### Printing the results of the code at hand
        print "user id = ",userid_delimiter_results
        print "username = ",username_delimiter_results
        print "user selection = ",user_selection
        print ""
getData()
run_periodically(time()+5, time()+1000000, 3000, getData)
Please be nice with comments; I have been coding for a cumulative 11 days now, so please also excuse any major errors in the code I am using, although it is working so far.
Kind regards
AEA