I ran into a problem while trying to scrape the review text from ratemyprofessors.com (http://www.ratemyprofessors.com/ShowRatings.jsp?tid=860968#). I am currently using BeautifulSoup and requests.

I would like to get all the review contents, but the reviews that only appear after clicking "Load More" are inaccessible. I have tried several approaches posted on StackOverflow and Reddit; unfortunately, none of them works for me.

The "Load More" button, as shown in the element inspector: onclick="javascript:mtvn.btg.Controller.sendLinkEvent({ linkName:'PROF:LoadMore', linkType:'o' } );"

I would greatly appreciate it if anyone could help me with this problem. Thank you.

Chloe
  • Is that data JS generated? – SuperStew Nov 19 '18 at 21:15
  • @SuperStew I'm not sure... This is in the element inspect : *onclick="javascript:mtvn.btg.Controller.sendLinkEvent({ linkName:'PROF:LoadMore', linkType:'o' } );"* – Chloe Nov 19 '18 at 21:49
  • I'm afraid this violates the terms of service: http://www.ratemyprofessors.com/TermsOfUse_us.jsp#section6 unless you have prior permission. – QHarr Nov 19 '18 at 22:22

2 Answers

This appears to be a JavaScript-driven website. I think you'll need to use something like Selenium to scrape it. With Selenium you could direct the web browser to scroll to the end of the page and capture all the data you are looking for that way.

Owais Arshad

You need to use the Chrome network tab to see what request is made when you click "Load More". In this case it's:

http://www.ratemyprofessors.com/paginate/professors/ratings?tid=860968&filter=&courseCode=&page=2
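A minimal sketch of how that request could be repeated page by page, assuming the `page` parameter simply counts upward. The helper names (`build_page_url`, `fetch_text`, `collect_pages`), the starting page, and the "stop on an empty body" rule are all assumptions made for illustration; the thread does not say what the response looks like, so the sketch only returns the raw text for you to parse.

```python
from urllib.parse import urlencode

# Endpoint taken from the network tab; the query keys mirror the URL above.
BASE_URL = "http://www.ratemyprofessors.com/paginate/professors/ratings"


def build_page_url(tid, page):
    """Return the paginated ratings URL for one professor ID and page number."""
    query = urlencode({"tid": tid, "filter": "", "courseCode": "", "page": page})
    return f"{BASE_URL}?{query}"


def fetch_text(url):
    """Default fetcher: GET the URL with requests and return the body text."""
    import requests  # imported lazily so the URL helper works without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def collect_pages(tid, max_pages, first_page=1, fetch=fetch_text):
    """Fetch up to max_pages pages starting at first_page.

    Stops early when a body is empty; that stop rule is an assumption,
    not something confirmed about this endpoint.
    """
    bodies = []
    for page in range(first_page, first_page + max_pages):
        body = fetch(build_page_url(tid, page))
        if not body.strip():
            break
        bodies.append(body)
    return bodies
```

Something like `collect_pages(860968, max_pages=5)` would then return a list of raw response bodies, which can be handed to whichever parser fits the format that actually comes back.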

pguardiario
  • Just make the request and load the response with lxml/bs4 or whatever other HTML parser you use. – pguardiario Nov 21 '18 at 06:49
  • I am using the following code. I managed to load the page, but did not get the contents in your link: `read_mores = driver.find_elements_by_xpath('//*[@data-teach-id='+ tid + ']') for read_more in read_mores: driver.execute_script("arguments[0].scrollIntoView();", read_more) driver.execute_script("$(arguments[0]).click();", read_more) soup = BeautifulSoup(driver.page_source, 'html.parser') ` – Chloe Nov 21 '18 at 06:54
  • Are you using `selenium`? You tagged your question with `beautiful soup`. – pguardiario Nov 21 '18 at 06:56
  • Oh yeah. I didn't know how to load pages with BS, therefore I used Selenium to load pages and use BS to parse it. Is there any way I can do it with BS only? I am very new to this area and have a lot of questions. Thanks for your patience! – Chloe Nov 21 '18 at 07:01
  • I think you should pick one and then post some code. Possibly in a new question if it's changed much. – pguardiario Nov 21 '18 at 08:36
  • Hey. I just want to let you know it works finally. Thank you so much for the answer!!! – Chloe Nov 28 '18 at 17:52