I'm looking for a solution.

I'm trying to get the image names from:
<pics>
<pic>240006.jpg</pic>
<pic>240006_2.jpg</pic>
</pics>

with this code:

for x in root.iter('product'):
    pics =x.findall('pics/pic')

    images = "https://cdn.edc-internet.nl/800/" + pics[0].text + ";" + "https://cdn.edc-internet.nl/800/" + pics[1].text + ";" + "https://cdn.edc-internet.nl/800/" + pics[2].text
print(images)

Because some products have only 2 images, this raises a "list index out of range" error. I want to check whether each value exists and, if it doesn't, output only the 2 (or even 1) image links that are there.

I tried an if statement, but that failed. Then I tried a try: block, but that only gives me the values for products with 3 pictures.

MattDMo
  • Is there a reason you wrote your link construction code three times? One easy solution would be to iterate over `pics` and build each URL as you go, combining everything with `";".join(iterable)` at the end – Wonskcalb Dec 10 '20 at 15:00

2 Answers

Since you don't know in advance how many elements `pics` contains, it is better to iterate over it and build the URLs on the fly. This avoids fixed indexes, which, as you saw, break when `pics` has fewer than 3 items, and it also returns every item if a product has more than 3.

for x in root.iter('product'):
    pics = x.findall('pics/pic')

    URL = "https://cdn.edc-internet.nl/800/%s"
    # Use .text: each item in pics is an Element, not a string
    images = ";".join(URL % picture.text for picture in pics)

    print(images)
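
For reference, here is a self-contained sketch of the same idea that can be run without the real feed. The sample XML, the `100001.jpg` file name, and the `image_links` helper are assumptions made up for illustration; only the `<pics>`/`<pic>` layout and the base URL come from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed: only the <pics>/<pic> layout comes from the question.
SAMPLE_XML = """
<products>
  <product>
    <pics>
      <pic>240006.jpg</pic>
      <pic>240006_2.jpg</pic>
    </pics>
  </product>
  <product>
    <pics>
      <pic>100001.jpg</pic>
    </pics>
  </product>
  <product>
  </product>
</products>
"""

BASE_URL = "https://cdn.edc-internet.nl/800/"


def image_links(product):
    """Return the product's image URLs joined by ';' (empty string if none)."""
    # Skip <pic> elements without text so the concatenation never sees None.
    names = [pic.text for pic in product.findall("pics/pic") if pic.text]
    return ";".join(BASE_URL + name for name in names)


root = ET.fromstring(SAMPLE_XML)
all_links = [image_links(product) for product in root.iter("product")]
for links in all_links:
    print(links)
```

Because nothing is indexed by position, a product with two pictures, one picture, or none at all goes through the same code path.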
Wonskcalb

Try this:

for x in root.iter('product'):
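    pics = x.findall('pics/pic')

    images = ""  # reset for each product; otherwise images is undefined on first use
    for i in range(len(pics)):
        images = images + "https://cdn.edc-internet.nl/800/" + pics[i].text + ";"
    images = images.rstrip(";")  # drop the trailing separator
    print(images)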
Mike
  • Joining strings like this should be avoided, as it generally performs worse than building the result in one pass with `join`. Check my answer – Wonskcalb Dec 10 '20 at 15:05
  • I found https://stackoverflow.com/questions/12169839/which-is-the-preferred-way-to-concatenate-a-string-in-python where they say that using `"".join` isn't faster than `+=`. Has this changed in Python 3? – Mike Dec 10 '20 at 15:11
  • The difference looks pretty stark to me. `timeit("';'.join(str_list)", setup="str_list = map(str, range(10_000))") # 0.16372650700213853` vs. `timeit("res = ""; for x in str_list: res += x", setup="str_list = map(str, range(10_000))") # 0.030358272000739817` From what I remember, you gain both in raw performance and in RAM usage – Wonskcalb Dec 10 '20 at 15:36
  • The difference in performance comes from the fact that using `+` will create many strings, one per loop actually (since those are immutable). You would then have something like `(((a+b)+c)+d)`. This would re-read the first item n-1 times, the second, n-2 times, etc – Wonskcalb Dec 10 '20 at 15:44
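
To check this on your own machine, a timing sketch along the following lines may help. It uses a real list rather than a `map` object, since an iterator would be exhausted after the first timed run. The helper names `with_join` and `with_plus` and the repeat count are assumptions chosen for illustration. No timings are quoted here because they depend on the interpreter and version; CPython can sometimes extend a string in place, which narrows the gap.

```python
from timeit import timeit

# A real list, not a map object: an iterator would be empty after one pass.
str_list = [str(n) for n in range(10_000)]


def with_join(parts):
    """Build the result in one pass with str.join."""
    return ";".join(parts)


def with_plus(parts):
    """Build the result by repeated concatenation."""
    result = ""
    for part in parts:
        result += part + ";"
    return result[:-1]  # drop the trailing separator


# Both helpers must produce the same string for the comparison to be fair.
same_output = with_join(str_list) == with_plus(str_list)

# Passing callables avoids quoting problems inside a statement string.
join_time = timeit(lambda: with_join(str_list), number=200)
plus_time = timeit(lambda: with_plus(str_list), number=200)

print(f"same output: {same_output}")
print(f"join: {join_time:.4f}s  +=: {plus_time:.4f}s")
```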