I've successfully used BeautifulSoup to iterate through a few hundred pages of the bandsintown webpage, viewed here: https://www.bandsintown.com/?came_from=257&page=102
I'm able to iterate through each page to create an array of all event dates, called "uniqueDatesBucket". Printing the array gives the following (there are many results, so I've included only a sample):
print uniqueDatesBucket
Result:
[[<div class="event-b58f7990"><div class="event-ad736269">JAN</div><div class="event-d7a00339">08</div></div>, <div class="event-b58f7990"><div class="event-ad736269">JAN</div><div class="event-d7a00339">08</div></div>, ............................<div class="event-b58f7990"><div class="event-ad736269">JAN</div><div class="event-d7a00339">31</div></div>]]
This is as expected. I then want to place the Month and Day in separate arrays, in order to start building a database of dates. Here's the code:
# Build empty arrays for month and day
uniqueMonth = []
uniqueDay = []
for i in uniqueDatesBucket[0]:
    uniqueMonthDay = i.find_all('div')
    uniqueMonth.append(uniqueMonthDay[0].text)
    uniqueDay.append(uniqueMonthDay[1].text)
print uniqueDay
The result is:
[u'08', u'08', u'08', u'08', u'08', u'08', u'08', u'08', u'08', u'09', u'09', u'09', u'09', u'09', u'09', u'09', u'09', u'09']
My question is: why does this return only 18 results? There are 18 events on the landing page of the bandsintown site, but I thought I had solved this with the page iterator described previously. uniqueDatesBucket, which the uniqueMonth and uniqueDay arrays are built from, clearly contains more than 18 results.
Also, what is the "u" before each date in the results?
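One possibility I have not ruled out: if uniqueDatesBucket holds one inner list per page, then uniqueDatesBucket[0] would be only the first page. Here is a minimal sketch of that situation, using made-up data. The name "pages" and the (month, day) tuples are hypothetical stand-ins for uniqueDatesBucket and the BeautifulSoup div tags, so this only illustrates the indexing, not my actual scraper:

```python
# Hypothetical stand-in data: one inner list per scraped page.
# Plain (month, day) tuples replace the BeautifulSoup <div> tags.
pages = [
    [("JAN", "08"), ("JAN", "09")],  # page 1
    [("JAN", "10"), ("JAN", "11")],  # page 2
]

# Indexing with [0] visits only the first inner list (the first page).
first_page_days = [day for month, day in pages[0]]

# A nested loop visits every inner list, so every page is included.
all_days = [day for page in pages for month, day in page]

print(first_page_days)
print(all_days)
```

If my real data is shaped like this, looping over every inner list instead of only index 0 should pick up the remaining dates.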