I'm using Selenium to write online letters to my friends in the army. The website offers no API whatsoever. Quite a few of my friends are in the army, and I want to choose who to send the letter to. Let's say my friend's name is Howard from now on.

The selection page looks like this:

  • Each friend has his own card-styled div. All of them share the same class (cafe-card-box) and have no id or name.

  • All the divs are in a slider that is horribly coded. For some reason, the divs are duplicated several times invisibly. There are 2-3 divs for Howard alone.

  • The order of the divs is not the same across users.

  • The soldier's name is in cafe-card-box (class) -> flex (class) -> profile-wrap (class) -> id (class) -> span (tag only). All the divs are the same except for the content of the span.

  • Some elements with class="id" contain only blank text. The span contains not only the name but also how long the soldier has been in the army, like this:

    Jacob (Been in for 2 weeks)

Initial approach

Initially, I wrote this code:

    cafes = self.driver.find_elements_by_class_name("cafe-card-box")
    for cafe in cafes:
        cf_name = cafe.find_element_by_class_name("id").text[0:3]  # Almost all Korean names are 3 characters.
        if cf_name == soldier_name:
            print("found.")
            cafe.find_element_by_link_text("위문편지").click()
            break
        else:
            print("It's not the one. Moving to the next ID class.")

This worked as expected, provided that the name is somewhere in one of the divs. The problem is that the program needs to work even when the name is wrong. I later tried this code:

    while n <= len(cafes):
        n = n + 1
        try:
            for cafe in cafes:
                cf_name = cafe.find_element_by_class_name("id").text[0:3]
                if cf_name == soldier_name:
                    print("Found!")
                    cafe.find_element_by_link_text("위문편지").click()
                    ps(3)
                    break
        except:
            print("Can't find anyone.")
            self.driver.quit()
            quit()

This didn't work at all. And in retrospect, the first code, which actually worked, doesn't look very sound either. I now want to loop through each card div, check whether the name matches, switch to that div if it does, and click the button in that specific div.

Is this possible? If so, how? I feel like I've tried everything.

Side Question

Is there a better way to extract the name from the span text?

cafe.find_element_by_class_name("id").text[0:3]

This doesn't seem very professional. In every card, the name is separated from the rest of the text by a single space.
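One way to avoid the fixed three-character slice might be to cut the text at the opening parenthesis instead. This is only a sketch: `extract_name` is a hypothetical helper, and it assumes every label has the form `name (details)` as in the example above, with blank labels returning None.

```python
def extract_name(span_text):
    """Return the part of a card label before "(", or None if it is blank."""
    # partition() splits at the first "(" and returns the text before it;
    # if there is no "(", the whole string is returned unchanged.
    name = span_text.partition("(")[0].strip()
    # Some elements with class "id" are blank, so report those as None.
    return name or None
```

It could then replace the slice, for example `extract_name(cafe.find_element_by_class_name("id").text) == soldier_name`, which also works for names that are not three characters long.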

Edit

Adding HTML code.

                <div class="group">
                <div class="section-title bd_gradation">
                    <strong class="title">내 카페 <em>(2)</em></strong>
                </div>
                <div class="swiper-container cafe-slide-wrap swiper-container-horizontal" id="divSlide1">
                    <div class="swiper-wrapper" style="transition-duration: 0ms; transform: translate3d(-1140px, 0px, 0px);"><div class="swiper-slide swiper-slide-duplicate swiper-slide-duplicate-active swiper-slide-prev" data-swiper-slide-index="0">
                            
                                <!-- cafe-card-box -->
                                <div class="cafe-card-box">
                                    <div class="flex">
                                        <div class="photo-wrap" onclick="javascript:fn_selectListPost(1,&#39;20121590200&#39;,&#39;4737&#39;,&#39;0000140002&#39;);" style="cursor: pointer;">
                                            
                                                
                                                
                                                    <script>
                                                    var filedata = {
                                                         fileTypeCd : "0000210002"
                                                        ,thumb : thumbSizeMgr.unitMark
                                                        ,filePath : "/images/upload/20191122/nb3705@naver.com/"
                                                        ,savedFileNm : "20191122092608029_ge1"
                                                        ,extNm : "jpg"
                                                    };
                                                    document.write('<img src="'+fn_getFileSrcUrl(filedata)+'" alt="">');
                                                    </script><img src="./카페 메인_files/20191122092608029_ge1.jpg" alt="">
                                                
                                            
                                        </div>
                                        <div class="profile-wrap" onclick="javascript:fn_compMain(&#39;4737&#39;,&#39;20121590200&#39;);" style="cursor: pointer;">
                                            <div class="id"><!-- 최대 2줄 -->
                                                
                                                    <span>{NAME CENSORED} (입영 2주차)</span>
                                                
                                            </div>
                                            <div class="cafe-sh-txt"><!-- 최대 2줄 -->
                                                {PRIVATE INFO CENSORED}
                                            </div>
                                            <div class="cafe-sh-date"><!-- 최대 2줄 -->
                                                
                                                <span>입영일 <em> 2020.07.06 </em></span>
                                                
                                                <span>수료일 <em> 2020.08.12 </em></span>
                                            </div>
                                        </div>
                                    </div>
                                    <div class="btn-wrap">
                                        <a href="javascript:fn_consolLetter(&#39;4737&#39;,&#39;20121590200&#39;);" class="btn-green">위문편지</a>
                                        <a href="javascript:fn_compMain(&#39;4737&#39;,&#39;20121590200&#39;);" class="btn-blue">카페바로가기</a>
                                    </div>
                                </div>
                                <!-- //cafe-card-box -->
                                
                                <div class="cafe-card-box">
                                    <div class="flex">
                                        <div class="photo-wrap" onclick="javascript:fn_selectListPost(1,&#39;20020191700&#39;,&#39;4727&#39;,&#39;0000140001&#39;);" style="cursor: pointer;">
                                            
                                                
                                                
                                                    <script>
                                                    var filedata = {
                                                         fileTypeCd : "0000210002"
                                                        ,thumb : thumbSizeMgr.unitMark
                                                        ,filePath : "/images/upload/20200227/1234/"
                                                        ,savedFileNm : "20200227104858343_ge1"
                                                        ,extNm : "png"
                                                    };
                                                    document.write('<img src="'+fn_getFileSrcUrl(filedata)+'" alt="">');
                                                    </script><img src="./카페 메인_files/20200227104858343_ge1.png" alt="">
                                                
                                            
                                        </div>
                                        <div class="profile-wrap" onclick="javascript:fn_compMain(&#39;4727&#39;,&#39;20020191700&#39;);" style="cursor: pointer;">
                                            <div class="id"><!-- 최대 2줄 -->
                                                    <span>{NAME CENSORED} (입영 2주차)</span>
                                            </div>
                                            <div class="cafe-sh-txt"><!-- 최대 2줄 -->
                                                {PRIVATE INFO CENSORED}
                                            </div>
                                            <div class="cafe-sh-date"><!-- 최대 2줄 -->
                                                
                                                <span>입영일 <em> 2020.07.06 </em></span>
                                                
                                                <span>수료일 <em> 2020.08.11 </em></span>
                                            </div>
                                        </div>
                                    </div>
                                    <div class="btn-wrap">
                                        <a href="javascript:fn_consolLetter(&#39;4727&#39;,&#39;20020191700&#39;);" class="btn-green">위문편지</a>
                                        <a href="javascript:fn_compMain(&#39;4727&#39;,&#39;20020191700&#39;);" class="btn-blue">카페바로가기</a>
                                    </div>
                                </div>
                                <!-- //cafe-card-box -->
                            
                                
                                    
                                        </div>
                        
                            
                                <div class="swiper-slide swiper-slide-active swiper-slide-duplicate-next swiper-slide-duplicate-prev" data-swiper-slide-index="0">
                            
                                <!-- cafe-card-box -->
                                <div class="cafe-card-box">
                                    <div class="flex">
                                        <div class="photo-wrap" onclick="javascript:fn_selectListPost(1,&#39;20121590200&#39;,&#39;4737&#39;,&#39;0000140002&#39;);" style="cursor: pointer;">
                                            
                                                
                                                
                                                    <script>
                                                    var filedata = {
                                                         fileTypeCd : "0000210002"
                                                        ,thumb : thumbSizeMgr.unitMark
                                                        ,filePath : "/images/upload/20191122/nb3705@naver.com/"
                                                        ,savedFileNm : "20191122092608029_ge1"
                                                        ,extNm : "jpg"
                                                    };
                                                    document.write('<img src="'+fn_getFileSrcUrl(filedata)+'" alt="">');
                                                    </script><img src="./카페 메인_files/20191122092608029_ge1.jpg" alt="">
                                                
                                            
                                        </div>
                                        <div class="profile-wrap" onclick="javascript:fn_compMain(&#39;4737&#39;,&#39;20121590200&#39;);" style="cursor: pointer;">
                                            <div class="id"><!-- 최대 2줄 -->
                                                
                                                    <span>{NAME CENSORED} (입영 2주차)</span>
                                                
                                            </div>
                                            <div class="cafe-sh-txt"><!-- 최대 2줄 -->
                                                {PRIVATE INFO CENSORED}
                                            </div>
                                            <div class="cafe-sh-date"><!-- 최대 2줄 -->
                                                
                                                <span>입영일 <em> 2020.07.06 </em></span>
                                                
                                                <span>수료일 <em> 2020.08.12 </em></span>
                                            </div>
                                        </div>
                                    </div>
                                    <div class="btn-wrap">
                                        <a href="javascript:fn_consolLetter(&#39;4737&#39;,&#39;20121590200&#39;);" class="btn-green">위문편지</a>
                                        <a href="javascript:fn_compMain(&#39;4737&#39;,&#39;20121590200&#39;);" class="btn-blue">카페바로가기</a>
                                    </div>
                                </div>
                                <!-- //cafe-card-box -->
                            
                                
                                    
                                
                                
                            
                        
                            
                                <!-- cafe-card-box -->
                                <div class="cafe-card-box">
                                    <div class="flex">
                                        <div class="photo-wrap" onclick="javascript:fn_selectListPost(1,&#39;20020191700&#39;,&#39;4727&#39;,&#39;0000140001&#39;);" style="cursor: pointer;">
                                            
                                                
                                                
                                                    <script>
                                                    var filedata = {
                                                         fileTypeCd : "0000210002"
                                                        ,thumb : thumbSizeMgr.unitMark
                                                        ,filePath : "/images/upload/20200227/1234/"
                                                        ,savedFileNm : "20200227104858343_ge1"
                                                        ,extNm : "png"
                                                    };
                                                    document.write('<img src="'+fn_getFileSrcUrl(filedata)+'" alt="">');
                                                    </script><img src="./카페 메인_files/20200227104858343_ge1.png" alt="">
                                                
                                            
                                        </div>
                                        <div class="profile-wrap" onclick="javascript:fn_compMain(&#39;4727&#39;,&#39;20020191700&#39;);" style="cursor: pointer;">
                                            <div class="id"><!-- 최대 2줄 -->
                                                
                                                    <span>{NAME CENSORED} (입영 2주차)</span>
                                                
                                            </div>
                                            <div class="cafe-sh-txt"><!-- 최대 2줄 -->
                                                {PRIVATE INFO CENSORED}
                                            </div>
                                            <div class="cafe-sh-date"><!-- 최대 2줄 -->
                                                
                                                <span>입영일 <em> 2020.07.06 </em></span>
                                                
                                                <span>수료일 <em> 2020.08.11 </em></span>
                                            </div>
                                        </div>
                                    </div>
                                    <div class="btn-wrap">
                                        <a href="javascript:fn_consolLetter(&#39;4727&#39;,&#39;20020191700&#39;);" class="btn-green">위문편지</a>
                                        <a href="javascript:fn_compMain(&#39;4727&#39;,&#39;20020191700&#39;);" class="btn-blue">카페바로가기</a>
                                    </div>
                                </div>
                                <!-- //cafe-card-box -->
                            
                                
                                    
                                        </div>

2 Answers

You can find all div elements containing certain text:

from selenium import webdriver

driver = webdriver.Chrome()
# Some code...
divList = [div for div in driver.find_elements_by_tag_name('div') if 'The text to find' in div.get_attribute('innerText')]
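One thing to watch with this filter: the innerText of every enclosing div also contains the text, so the list will include the slider and wrapper divs as well as the card itself. A possible refinement, sketched below, is to search only the cards. The function name `find_cards_with_text` is hypothetical, the class name cafe-card-box comes from the question's HTML, and `"class name"` is the string value of Selenium's `By.CLASS_NAME`, passed directly so the sketch needs no import.

```python
def find_cards_with_text(driver, needle):
    """Return the card elements whose innerText contains `needle`."""
    # find_elements(by, value) is the generic locator call; "class name"
    # is the same string that By.CLASS_NAME holds.
    cards = driver.find_elements("class name", "cafe-card-box")
    # get_attribute() may return None, so fall back to an empty string.
    return [
        card for card in cards
        if needle in (card.get_attribute("innerText") or "")
    ]
```

An empty list means no card matched, which the caller can check before clicking anything.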
Ali Sajjad

I don't know whether this will be useful for you, but you could also use XPath, like this:

from selenium import webdriver

driver = webdriver.Chrome()
# Some code...
elementList = driver.find_elements_by_xpath('//div[contains(@class,"profile-wrap")]/div[@class="id"]/span[contains(text(),"NAME")]')

Please be aware of these facts:

  • XPath is generally considered the slowest of the "find_" methods; use it if you have no other choice or if performance is not a big concern for the project
  • XPath 1.0 has no built-in case-insensitive search, so you have to apply a translation with translate() (see https://stackoverflow.com/a/8474109/3228768), which may not be suitable for the characters you need to find
  • XPath can easily select an ancestor by appending /.. to the query (once per level)
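As a rough sketch of how the card and its button could be reached in one query: instead of walking up with /.., the version below puts the name condition in a predicate on the card div and then descends to the link. The helper names (`xpath_literal`, `letter_link_xpath`, `click_letter_link`) are hypothetical, the class names and the link text 위문편지 are taken from the question's HTML, and `"xpath"` is the string value of Selenium's `By.XPATH`. It assumes the name is always followed by a space, as the question states.

```python
def xpath_literal(value):
    """Quote a string for use inside an XPath 1.0 expression."""
    if '"' not in value:
        return '"' + value + '"'
    if "'" not in value:
        return "'" + value + "'"
    raise ValueError("value contains both quote types")


def letter_link_xpath(name, link_text="위문편지"):
    """Build an XPath for the letter link inside the card labelled `name`."""
    # The trailing space keeps "Howard" from matching "Howardson (...)".
    label = xpath_literal(name + " ")
    return (
        '//div[contains(@class, "cafe-card-box")]'
        '[.//div[@class="id"]/span[starts-with(normalize-space(.), ' + label + ')]]'
        '//a[normalize-space(.)=' + xpath_literal(link_text) + ']'
    )


def click_letter_link(driver, name):
    """Click the first matching link; return False if no card matches."""
    links = driver.find_elements("xpath", letter_link_xpath(name))
    if not links:
        return False  # wrong or unknown name: nothing is clicked
    links[0].click()
    return True
```

Because the slider duplicates each card, the query will usually return several links. The sketch clicks the first one, as the asker's original loop did; whether a hidden duplicate is always clickable on this page is an assumption.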

NOTE:

  1. As you can see, I used two different conditions to target a div by class: one with contains() and one without. The difference is that the second form matches only if the class name is the entire value of the "class" attribute.
  2. You can extract the text from the elements returned by the XPath query to obtain a reliable text extraction.
Mattia Galati