
All, I am trying to create a jagged list in Python 3.x. Specifically, I am pulling a number of elements from a list of webpages using Selenium. Each row of my jagged list ("matrix") represents the contents of one of these webpages. Each row should have as many columns as there are elements pulled from its webpage; this number varies from page to page.

e.g.

webpage1 has 3 elements: a,b,c
webpage2 has 6 elements: d,e,f,g,h,i
webpage3 has 4 elements: j,k,l,m
...

would look like:

[[a,b,c],
[d,e,f,g,h,i],
[j,k,l,m],...]

Here's my code, thus far:

from selenium import webdriver

chromePath = "/Users/me/Documents/2018/chromedriver"
browser = webdriver.Chrome(chromePath)

url = 'https://us.testcompany.com/eng-us/women/handbags/_/N-r4xtxc/to-1'
browser.get(url)

hrefLinkArray = []

hrefElements = browser.find_elements_by_class_name("product-item")

for eachOne in hrefElements:
    hrefLinkArray.append(eachOne.get_attribute('href'))

pics = [[]]

for y in range(0, len(hrefLinkArray)): # or type in "range(0, 1)" to debug
    browser.get(hrefLinkArray[y])
    productViews = browser.find_elements_by_xpath("// *[ @ id = 'lightSlider'] / li")
    b = -1
    for a in productViews:
        b = b + 1
        # print(y) for debugging
        # print(b) for debugging
        pics[y][b] = a.get_attribute('src') # <------------ ERROR!
        # pics[y][b].append(a.get_attribute('src') GIVES SAME ERROR AS ABOVE
    del productViews[:]

browser.quit()

Whenever I run this, I get an error on the first iteration of the a in productViews loop:

line 64, in <module>
    pics[y][b] = a.get_attribute('src')
IndexError: list assignment index out of range

From what I can tell, the integer references are correct (see my debugging lines in the for a in productViews loop), so pics[0][0] should be a proper way to reference the jagged list. That said, I have a feeling pics[0][0] does not yet exist? Or maybe only pics[0] does? I've seen similar posts about this error, but the only solution I've understood seems to be using .append(), and even then only on a 1D list. As you can see in my code, I've used .append() for the hrefLinkArray successfully, whereas it appears unsuccessful on line 64/65. I'm stumped as to why this might be.

Please let me know:

  1. Why my lines .append() and [][]=... are throwing this error.

  2. If there is a more efficient way to accomplish my goal, I'd like to learn!

UPDATE: using @User4343502's answer, in conjunction with @StephenRauch's input, the error is resolved and I am now getting the intended-sized jagged list! My amended code is:

listOfLists = []

for y in range(0, len(hrefLinkArray)):
    browser.get(hrefLinkArray[y])

    productViews = browser.find_elements_by_xpath("// *[ @ id = 'lightSlider'] / li")
    otherList = []
    for other in productViews:
        otherList.append(other.get_attribute('src'))
        # print(otherList)
    listOfLists.append(otherList)
    del otherList[:]
    del productViews[:]

print(listOfLists)

Note, this code prints a jagged list of totally empty indices e.g. [[][],[][][][],[],[][][],[][],[][][][][]...], but that is a separate issue - I believe related to my productViews object and how it retrieves by xpath... What's important, though, is that my original question was answered. Thanks!
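A likely contributor to the empty rows, independent of the XPath: `listOfLists.append(otherList)` stores a reference to `otherList`, not a copy, so the following `del otherList[:]` also empties the row that was just appended. The sketch below reproduces this without Selenium; the `pages` data and the names `aliased` and `fixed` are hypothetical stand-ins, not part of the original code.

```python
# Stand-in data instead of Selenium results (hypothetical values).
pages = [["a", "b", "c"], ["d", "e"]]

aliased = []
for srcs in pages:
    row = []
    for s in srcs:
        row.append(s)
    aliased.append(row)   # stores a reference to row, not a copy
    del row[:]            # empties the very list that was just appended

fixed = []
for srcs in pages:
    row = []              # a fresh list each iteration; no clearing needed
    for s in srcs:
        row.append(s)
    fixed.append(row)

print(aliased)  # [[], []]
print(fixed)    # [['a', 'b', 'c'], ['d', 'e']]
```

Because `row` is rebound to a new list at the top of every iteration, dropping the `del` line should be enough to keep the collected values.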

OrangeOwner
  • Possible duplicate of [Two dimensional array in python](https://stackoverflow.com/questions/8183146/two-dimensional-array-in-python) – Stephen Rauch Mar 30 '18 at 20:27
  • This is a lot of code, it would help if you stripped it down to only the relevant parts (the sections regarding lists) – touch my body Mar 30 '18 at 20:34
  • @StephenRauch, your attached 'solution' does not answer why my code on line 65 (commented out, below the error line) does not work. I will try the gist of the work-around it suggests, but even if that works (I'll report back on this), I don't understand why my previous attempts don't work... it may be a solution, but not an answer. Does this make sense? – OrangeOwner Mar 30 '18 at 20:39
  • The answer explains the error, which is that you cannot address a list element that is not already there. – Stephen Rauch Mar 30 '18 at 20:41
  • @StephenRauch. So `pics[0][0]` does not yet exist, even though I declared `pics[[]]` ? If this is the case, then what is `pics[[]]`? If this is the case, then using something like `pics.append(someOtherArrayContainingThePageLinks)`, is what your attached link is suggesting? – OrangeOwner Mar 30 '18 at 20:48
  • `[[]]` is a list which contains an empty list. The inner empty list has no element 0. Try `pics[0].append(...)` – Stephen Rauch Mar 30 '18 at 20:49
  • @StephenRauch alright - thanks for clarifying! I – OrangeOwner Mar 30 '18 at 21:12
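
To make the point from the comments concrete, here is a minimal sketch of what `[[]]` contains and why indexing into it fails. The values `"x"` and `"y"` and the names `first_error` and `second_error` are placeholders for illustration only.

```python
pics = [[]]           # outer list with one element: an empty inner list

print(len(pics))      # 1
print(len(pics[0]))   # 0

try:
    pics[0][0] = "x"  # assignment needs an existing slot; there is none
except IndexError as exc:
    first_error = str(exc)

pics[0].append("x")   # append grows the inner list instead

try:
    pics[1].append("y")  # there is no second row yet
except IndexError as exc:
    second_error = str(exc)

pics.append([])       # add a new row first, then fill it
pics[1].append("y")
print(pics)           # [['x'], ['y']]
```

So index assignment can only overwrite a slot that already exists, while `append` creates one, and each new row has to be appended to the outer list before it can be filled.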

1 Answer


list.append will add an element into a list. This works regardless of what the element is.

a = [1, 2, 3]
b = [float, {}]
c = [[[None]]]

## We will append to this empty list
list_of_lists = []

for x in (a, b, c):
    list_of_lists.append(x)

## Prints (Python 3): [[1, 2, 3], [<class 'float'>, {}], [[[None]]]]
print(list_of_lists)
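
Applied to the question's example, the same idea can be written as a nested list comprehension, which is one possible answer to the "more efficient way" part. This is only a sketch: the `scraped` dictionary is a hypothetical stand-in for whatever Selenium returns per page, and `jagged` and `row_lengths` are illustrative names.

```python
# Hypothetical stand-in for the scraped pages: page name -> values found.
scraped = {
    "webpage1": ["a", "b", "c"],
    "webpage2": ["d", "e", "f", "g", "h", "i"],
    "webpage3": ["j", "k", "l", "m"],
}

# One inner list per page; each inner comprehension creates a new list object.
jagged = [[value for value in values] for values in scraped.values()]

row_lengths = [len(row) for row in jagged]
print(jagged)
print(row_lengths)  # [3, 6, 4]
```

In the real script the inner comprehension would presumably call `get_attribute` on each element found for that page, so no index bookkeeping such as `b = b + 1` is needed.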


touch my body