I'd like to scrape some data from the page below.

http://ticket.cgv.co.kr/Reservation/Reservation.aspx?MOVIE_CD=&MOVIE_CD_GROUP=&PLAY_YMD=&THEATER_CD=&PLAY_NUM=&PLAY_START_TM=&AREA_CD=&SCREEN_CD=&THIRD_ITEM=#

To be exact, I want the contents of every li tag inside the div with class "movie-list nano has-scrollbar-y":

<div class="movie-select">
   <div class="movie-list nano has-scrollbar-y" id="movie_list">
      <li class="rating-15" data-index="0" movie_cd_group="20018753" movie_idx="81626">
          *************************
          **the data that i want!**
          *************************

      <li class="rating-15" data-index="1" movie_cd_group="20018753" movie_idx="81626">
          *************************
          **the data that i want!**
          *************************
...
...

      <li class="rating-15" data-index="100" movie_cd_group="20018753" movie_idx="81626">
          *************************
          **the data that i want!**
          *************************

However, when I use the code below to crawl the page, I cannot get the data inside that particular tag (the div with class 'movie-list').

import requests
from bs4 import BeautifulSoup

url = 'http://ticket.cgv.co.kr/Reservation/Reservation.aspx?MOVIE_CD=&MOVIE_CD_GROUP=&PLAY_YMD=&THEATER_CD=&PLAY_NUM=&PLAY_START_TM=&AREA_CD=&SCREEN_CD=&THIRD_ITEM=#'
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')  # name the parser explicitly

When I check the HTML text returned by requests.get, the list is empty; there are no li tags inside the ul:

</div>
<div class="movie-list nano has-scrollbar-y"  id="movie_list">
<ul class="content scroll-y" onscroll="movieSectionScrollEvent();"></ul>
</div>

But when I inspect the page in Chrome, all the information is there!

<div class="movie-list nano has-scrollbar-y" id="movie_list">
 <ul class="content scroll-y" onscroll="movieSectionScrollEvent();"         tabindex="-1">
  <li class="rating-15" data-index="0" movie_cd_group="20018753" movie_idx="81626">
   <a href="#" onclick="return false;">
   <span class="icon">&nbsp;</span>
   <span class="text">바이스</span><span class="sreader"></span></a></li> 

  <li class="rating-15" data-index="1" movie_cd_group="20019110" movie_idx="81721">
   <a href="#" onclick="return false;">
   <span class="icon">&nbsp;</span><span class="text">미성년</

   ...

So this is my question:

How can I get all of this data from the page?

QHarr
Soulduck

2 Answers

The data is loaded via JavaScript, so it is not present in the HTML that requests.get returns.

1) Either use a tool like Selenium, which lets the page render before you attempt to access the list,

2) Or use dev tools to examine the POST XHR to http://ticket.cgv.co.kr/CGV2011/RIA/CJ000.aspx/CJ_HP_SCHEDULE_TOTAL_DEFAULT and see whether it provides the info you want and can be replicated with requests.
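
For the second option, a rough sketch of replaying that XHR might look like the following. It uses only the standard library; the same body and headers could be passed to requests.post instead. Several things here are assumptions, not facts about the site: that the endpoint accepts a JSON body (common for ASP.NET page methods), that it may check the Referer header, and the helper names build_schedule_request and fetch_schedule, which are made up for illustration. The payload fields are not shown on purpose: copy them from the request body you see in dev tools.

```python
import json
import urllib.request

# Endpoint named in the answer above.
ENDPOINT = "http://ticket.cgv.co.kr/CGV2011/RIA/CJ000.aspx/CJ_HP_SCHEDULE_TOTAL_DEFAULT"
# Page whose JavaScript issues the XHR (assumption: the server may check the Referer).
PAGE_URL = "http://ticket.cgv.co.kr/Reservation/Reservation.aspx"


def build_schedule_request(payload):
    """Build, but do not send, a POST request that mimics the browser's XHR."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        # Assumption: the endpoint expects a JSON body.
        "Content-Type": "application/json; charset=utf-8",
        "Referer": PAGE_URL,
        "User-Agent": "Mozilla/5.0",
    }
    return urllib.request.Request(ENDPOINT, data=body, headers=headers, method="POST")


def fetch_schedule(payload, timeout=10):
    """Send the request and return the raw response text for inspection."""
    request = build_schedule_request(payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch_schedule with the payload copied from dev tools and printing the result should show whether the movie list is in the response. If the server rejects the request, compare the headers sent here with the ones the browser sends.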

QHarr

Your issue is the onclick event. You need to interact with the JavaScript on that page before Beautiful Soup can parse it. See this previous answer: https://stackoverflow.com/a/29385645/10981724
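
As a minimal sketch of that idea, the code below lets Selenium render the page, waits until the li elements exist, and only then parses the HTML. It assumes Selenium 4 and a local Chrome install. The names render_page, parse_movies and MovieListParser are made up for this example. Parsing is done with the standard-library HTMLParser so the sketch has no dependency beyond Selenium; passing the same page_source string to BeautifulSoup would work just as well.

```python
from html.parser import HTMLParser


class MovieListParser(HTMLParser):
    """Collect attributes and text of each <li> inside the div with id="movie_list"."""

    def __init__(self):
        super().__init__()
        self.movies = []
        self._div_depth = 0   # greater than 0 while inside the #movie_list div
        self._current = None  # dict for the <li> currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self._div_depth or attrs.get("id") == "movie_list":
                self._div_depth += 1
        elif tag == "li" and self._div_depth:
            self._current = {
                "movie_idx": attrs.get("movie_idx"),
                "movie_cd_group": attrs.get("movie_cd_group"),
                "title": "",
            }
            self.movies.append(self._current)

    def handle_endtag(self, tag):
        if tag == "div" and self._div_depth:
            self._div_depth -= 1
        elif tag == "li":
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data.strip()


def parse_movies(html):
    """Return a list of dicts, one per movie <li> found in the rendered HTML."""
    parser = MovieListParser()
    parser.feed(html)
    return parser.movies


def render_page(url, wait_seconds=10):
    """Open the page in Chrome and return the HTML after the list has been filled in."""
    # Imported here so parse_movies can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Wait until JavaScript has inserted at least one <li> into the list.
        WebDriverWait(driver, wait_seconds).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "#movie_list li"))
        )
        return driver.page_source
    finally:
        driver.quit()
```

Usage would be parse_movies(render_page(url)) with the url from the question. The explicit wait matters: reading page_source immediately after driver.get can return the same empty ul that requests.get sees.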

Feernot