All -

For a Selenium web scraper in Python 3.x, I am trying to print output that depends on the length of each inner list in a jagged (2D) list. The outer list is named masterViewsList, and each list it contains is one iteration's viewsList. Here is how masterViewsList is built:

from selenium import webdriver
import os

masterLinkArray = []
masterViewsList = []

# a bunch of code here that I cut out for simplicity's sake

for y in range(0, len(masterLinkArray)):
    browser = webdriver.Chrome(chromePath)
    viewsList = []
    browser.get(masterLinkArray[y])
    productViews = browser.find_elements_by_xpath("// *[ @ id = 'lightSlider'] / li / img")
    counter = - 1
    for a in productViews:
        counter = counter + 1
        viewsList.append(a.get_attribute('src'))
        print(viewsList[counter])
        print(len(viewsList))
    masterViewsList.append(viewsList)
    if y == 10:
        print(masterViewsList[y])
        print(len(masterViewsList[y]))
    del viewsList[:]

print(len(masterLinkArray))
print(len(masterViewsList))
print(len(masterViewsList[0]))
print(len(masterViewsList[1]))
print(len(masterViewsList[10]))

The printout is this:

["https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/544_PM2_Front%20view.jpg?wid=140&hei=140","https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/544_PM1_Side%20view.jpg?wid=140&hei=140","https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/544_PM1_Interior%20view.jpg?wid=140&hei=140","https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/544_PM1_Other%20view.jpg?wid=140&hei=140","https://us.testcompany.com/images/is/image/lv/1/PP_VP_L/544_PM1_Other%20view2.jpg?wid=140&hei=140"]
5
79
79
0
0
0

As you can see, neither masterLinkArray nor masterViewsList is empty: each has 79 elements. Also, print(masterViewsList[y]) inside the loop prints an actual non-empty list with a length of 5. Oddly, once I leave the for y loop, len(masterViewsList[*any integer*]) prints 0. These similar questions, "Find the dimensions of a multidimensional Python array" and "Find length of 2D array Python", both indicate that len(array[*integer*]) is the proper way to get the length of a list within a list, but in my case it does not appear to work consistently.

OrangeOwner
  • I think the addresses will help to solve your problem. With this information, more people will be interested in solving it. – Frank Apr 12 '18 at 06:53
  • @Frank, thanks for the tip - makes sense to me. Some relevant links are now used in an example above. – OrangeOwner Apr 12 '18 at 14:23
  • Do you think this is only a python problem or does Selenium also play a role in this question? Adding the rest of the code can help me find out what is wrong with it. – Frank Apr 13 '18 at 08:06
  • @Frank, sure thing! The code is in there now, at the bottom. I appreciate the offer! This is all a work in progress (AND my first program of this sort in Python), so please excuse all the 'extra'! Happy to clarify/answer any questions - and suggestions of any constructive sort are welcome! – OrangeOwner Apr 13 '18 at 23:09

1 Answer

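The lists inside masterViewsList are empty at the point where you call len on them. That's why len(masterViewsList[0]), len(masterViewsList[1]) and len(masterViewsList[10]) all return zero, even though masterViewsList itself still holds 79 entries.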

In the first short version of your code this is not clear because the following line of code is missing in that version:

del viewsList[:]

masterViewsList.append(viewsList) stores a reference to the same list object rather than a copy. The del viewsList[:] line then removes every element from that object in place, so the entry already held by masterViewsList becomes empty too. It does not unbind the name or remove the entry from masterViewsList, which is why len(masterViewsList) is still 79. You can simply remove the del viewsList[:] line, because viewsList = [] at the top of the outer for loop already gives you a fresh list on every iteration.
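To see the effect without Selenium, here is a minimal sketch that reproduces the same pattern with plain lists. The function names collect_aliased and collect_fixed and the sample file names are made up for illustration; they are not from your code.

```python
def collect_aliased(batches):
    """Mimic the question's loop: append the list, then clear it in place."""
    master = []
    for batch in batches:
        views = []
        for item in batch:
            views.append(item)
        master.append(views)  # stores a reference to the same list object
        del views[:]          # empties that object, so master's entry empties too
    return master


def collect_fixed(batches):
    """Same loop without the slice deletion: each iteration keeps its own list."""
    master = []
    for batch in batches:
        views = []            # a fresh list object on every iteration
        for item in batch:
            views.append(item)
        master.append(views)
    return master


# Hypothetical sample data standing in for the scraped image URLs.
batches = [["front.jpg", "side.jpg"], ["interior.jpg"]]

print([len(v) for v in collect_aliased(batches)])  # [0, 0]
print([len(v) for v in collect_fixed(batches)])    # [2, 1]
```

If you ever do need to reuse and clear the same list, appending a copy, for example master.append(list(views)), would keep the stored entry independent of later changes.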

Frank
  • removing `del viewsList[:]` line worked. Thanks! The documentation for `del` says: "...Deletion of a name removes the binding of that name from the local or global namespace, depending on whether the name occurs in a global statement in the same code block. If the name is unbound, a NameError exception will be raised..." So you're saying this translates to 'all instances of the variable (even past) are deleted, not only the current/most recent instance,' ya? I tested this interpretation elsewhere in my code, and it seems to be the case, but I wanna make sure I'm getting this right. – OrangeOwner Apr 14 '18 at 17:20
  • Also, should I change the title of this post to detail the issue was with the `del` statement? Is it policy to do this if the ultimate answer/topic ends up different than the original question/post expected? – OrangeOwner Apr 14 '18 at 17:46