I am working on a blog and learning web development at the same time. I want to learn more about JSON so I am trying to implement a way to export the entire contents of my blog to JSON and later XML. I am hitting a lot of problems on the way, the biggest one being getting the url of the page which I want to render as JSON/XML dynamically. The code for my website can be found here. I still need to comment more and I have to implement a lot of functionalities. The main class which is responsible for exporting the contents to JSON is as follows :
class JSONHandler(BaseHandler):
#TODO: get a way to gt the url from the request
def get(self):
self.response.headers['Content-Type'] = 'application/json'
url = "http://www.bigb-myapp.appspot.com/blog"
#url = self.request.path_url
logging.info(url)
page = urllib2.urlopen(url).read()
soup = BeautifulSoup(page)
subject_list = []
day_list = []
content_list = []
subjects = soup.findAll('div', {'class' : 'subject-title'})
days = soup.findAll('div', {'class' : 'day'})
contents = soup.findAll('div', {'class' : 'post'})
for subject in subjects:
subject_list.append(subject.findAll(text = True))
for day in days:
day_list.append(day.findAll(text = True))
for content in contents:
content_list.append(content.findAll(text = True))
i = 0
for s, d, c in subject_list, day_list, content_list:
json_text = json.dumps({'subject': s[i][i],'day': d[i][i], 'content': c[i][i]})
i += 1
self.write(json_text)
I am also sure that the printing function is erroneous, but that is the easy part. As I said getting the url is proving to be a major difficulty.
I have tried to get the url form the environment variable and I also have tired webapp2's request handlers such as self.request.path_url
to no avail.
I am working with Google App engine and use the jinja2 template engine.
Thanks.