I am working on a mini-project to collect data from a popular League of Legends website, www.op.gg. For example, if you go to this page, you will see 10 games' worth of data on the right. If you keep scrolling down, there is a "Show More" button at the bottom that loads the next 20 results, and so on. When I inspect the "Show More" element with Chrome DevTools, I see the following entry:
<a href="#" onclick="$.OP.GG.matches.list.loadMore($(this)); return false;" class="Button">Show More</a>
I am currently using Scrapy to grab several data points from pages like this. I can successfully grab the first 10 games that show up, but I need help to keep loading more results up to a set cutoff, i.e. to keep showing more results until the "data-datetime" attribute inside a GameItemWrap element is more than 30 days before runtime.
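To make the cutoff concrete, here is a minimal sketch of the check I have in mind. It assumes that "data-datetime" holds a Unix timestamp in seconds, which I have not confirmed; the helper names `cutoff_epoch` and `is_within_window` are made up for illustration.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def cutoff_epoch(days=30, now=None):
    """Return the Unix timestamp `days` days before `now` (default: current time)."""
    if now is None:
        now = time.time()
    return now - days * SECONDS_PER_DAY


def is_within_window(data_datetime, cutoff):
    """Return True if a game's data-datetime value is at or after the cutoff.

    Assumption: data_datetime is epoch seconds, given as a string or an int.
    """
    return int(data_datetime) >= cutoff
```

The idea would be to compute the cutoff once at startup and stop requesting more results as soon as a game fails this check.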
My code is below:
import scrapy


class PostsSpider(scrapy.Spider):
    name = "posts"
    start_urls = [
        'https://na.op.gg/summoner/userName=C9+Zven',
        'https://na.op.gg/summoner/userName=From+Iron'
    ]

    def parse(self, response):
        # Profile-level fields, scraped once per summoner page
        summoner_name = response.css('.SummonerLayout>.Header>.Profile>.Information>.Name::text').get()
        rank_type = response.css('.TierRankInfo .RankType::text').get()
        tier_rank = response.css('.TierRankInfo .TierRank::text').get()

        game_lists = []
        for game in response.css('div.GameItemWrap'):
            # Build a fresh dict for every game shown on the page
            dict_per_game = {}
            dict_per_game['summoner_id'] = game.css('.GameItem::attr(data-summoner-id)').get()
            dict_per_game['data_game_time'] = game.css('.GameItem::attr(data-game-time)').get()
            # get(default='') avoids a TypeError from strip() when the selector matches nothing
            dict_per_game['game_type'] = game.css('.Content .GameStats .GameType::text').get(default='').strip()
            dict_per_game['date_time_epoch'] = game.css('.Content .GameStats .TimeStamp ._timeago::attr(data-datetime)').get()
            dict_per_game['game_result'] = game.css('.Content .GameStats .GameResult::text').get(default='').strip()
            dict_per_game['champ_name'] = game.css('.Content .GameSettingInfo .ChampionName a::text').get()
            dict_per_game['kill'] = game.css('.Content .KDA .KDA .Kill::text').get()
            dict_per_game['death'] = game.css('.Content .KDA .KDA .Death::text').get()
            dict_per_game['assist'] = game.css('.Content .KDA .KDA .Assist::text').get()
            game_lists.append(dict_per_game)

        yield {
            'summoner_name': summoner_name,
            'rank_type': rank_type,
            'tier_rank': tier_rank,
            'games': game_lists
        }
        # This is where I would like to add some code to retrieve more results
        # for this profile going back to 30 days ago from runtime
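For reference, the control flow I am after looks roughly like the sketch below. It is plain Python rather than Scrapy code, and it rests on several assumptions: `fetch_page` is a hypothetical callable standing in for whatever request the "Show More" button triggers (I do not know that URL), it returns a list of game dicts plus a cursor for the next page, games arrive newest first, and `date_time_epoch` is a Unix timestamp in seconds.

```python
def collect_recent_games(fetch_page, cutoff):
    """Gather games page by page until one is older than `cutoff`.

    fetch_page(cursor) is a hypothetical callable returning (games, next_cursor).
    It is called with None for the first page, and next_cursor is None when
    there are no more pages. Games are assumed to be ordered newest first.
    """
    collected = []
    cursor = None  # None means "first page"
    while True:
        games, cursor = fetch_page(cursor)
        for game in games:
            if int(game['date_time_epoch']) < cutoff:
                # First game outside the window: stop paging entirely
                return collected
            collected.append(game)
        if cursor is None or not games:
            # No further pages to request
            return collected
```

In a Scrapy spider the same stop condition would presumably live in the callback: yield another `scrapy.Request` for the next page only while every game on the current page is still inside the 30-day window, and yield the final item otherwise.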