I'm trying to create a program that will go through a bunch of Tumblr photos and extract the username of the person who uploaded each one.
http://www.tumblr.com/tagged/food
If you look here, you can see multiple pictures of food from multiple different uploaders. If you scroll down, more pictures from more uploaders appear. However, if you right-click in your browser to view the source and search for "username", it yields only 10 results every time, no matter how far down you scroll.
Is there any way to get around this and instead have it display the entire source for all images, for X images, or for however far you scrolled?
Here is my code to show what I'm doing:
# Imports
import requests
from bs4 import BeautifulSoup
import re

# Start of code
r = requests.get('http://www.tumblr.com/tagged/skateboard')
page = r.content
soup = BeautifulSoup(page, 'html.parser')
soup.prettify()

arrayDiv = []
for anchor in soup.findAll("div", { "class" : "post_info" }):
    # Strip the surrounding markup so only the link text (the username) is left
    anchor = str(anchor)
    tempString = anchor.replace('</a>:', '')
    tempString = tempString.replace('<div class="post_info">', '')
    tempString = tempString.replace('</div>', '')
    # After the split, element 1 is the text that follows the opening <a ...> tag
    tempString = tempString.split('>')
    newString = tempString[1]
    newString = newString.strip()
    arrayDiv.append(newString)

print(arrayDiv)
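
For reference, here is a minimal sketch of the same extraction step written with an HTML parser instead of string replacement. It is only an illustration under some assumptions: it targets Python 3 and the standard library's `html.parser`, the names `PostInfoParser`, `extract_usernames`, and `SAMPLE_HTML` are made up for this example, and the sample markup is my guess at the shape of a `post_info` block based on the `replace` calls above, not Tumblr's actual HTML. It has the same limitation as my code: it only sees whatever HTML it is given.

```python
from html.parser import HTMLParser


class PostInfoParser(HTMLParser):
    """Collect the text of the first <a> inside each <div class="post_info">."""

    def __init__(self):
        super().__init__()
        self.usernames = []
        self._div_depth = 0       # > 0 while inside a post_info div
        self._in_anchor = False   # True while inside the anchor being captured
        self._taken = False       # True once this div's first anchor is recorded
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self._div_depth > 0:
                # Nested div inside a post_info block: track depth only
                self._div_depth += 1
            elif "post_info" in classes:
                self._div_depth = 1
                self._taken = False
        elif tag == "a" and self._div_depth > 0 and not self._taken:
            self._in_anchor = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_anchor:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            self._in_anchor = False
            self._taken = True
            name = "".join(self._buffer).strip()
            if name:
                self.usernames.append(name)
        elif tag == "div" and self._div_depth > 0:
            self._div_depth -= 1


def extract_usernames(html):
    """Return the link text found in every post_info block of `html`."""
    parser = PostInfoParser()
    parser.feed(html)
    parser.close()
    return parser.usernames


# Hypothetical markup, assumed from the replace() calls in the code above
SAMPLE_HTML = """
<div class="post_info"><a href="http://alice.tumblr.com">alice</a>:</div>
<div class="post_body">no username here</div>
<div class="post_info">
    <a href="http://bob.tumblr.com"> bob </a>:
</div>
"""

print(extract_usernames(SAMPLE_HTML))
```

With the real page, the string passed to `extract_usernames` would be the decoded response body (for example `r.text` from `requests`), so the number of names returned is still capped by how many posts are in the initial source.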