I recently started learning web scraping with Python 3 and BeautifulSoup. I am having trouble printing only the row I want.

Below is the code I use.

product_sizes = view_product.find('dl', id='dl_1')
for product_size in product_sizes.find_all('li'):
    product_size = product_size.span.text
    print(product_size)

When I run this, I get the following output:

35
36
37
38
39
40

Say I want to print only the 2nd row, the "36". How do I do that? I tried indexing with [] like this:

    product_size = product_size.span.text[0]

but what I got is

3
3
3
3
3
4

I expect to get something like this when I print:

36

Thanks. I have a feeling this is a newbie question, but I googled around without success.

    No loop needed: `print(product_sizes.find_all('li')[0].text)` prints the first result (use index `[1]` for the second). Test that it delivers enough results so you do not get an IndexError. Your code prints the first character of each result's text, where you want the `.text` of a single result. – Patrick Artner Jan 19 '19 at 12:06
    Possible duplicate of [Understanding slice notation](https://stackoverflow.com/questions/509211/understanding-slice-notation) – Patrick Artner Jan 19 '19 at 12:08

4 Answers

Do this:

product_sizes = view_product.find('dl', id='dl_1')

# enumerate() supplies the counter; index 1 is the second <li>
for c, product_size in enumerate(product_sizes.find_all('li')):
    if c == 1:
        print(product_size.span.text)

This gives you the output you're looking for:

36
Employee

product_size = product_size.span.text[0] returns the character in the 1st position of each string, which is why you are getting 3, 3, 3, 3, 3, 4 instead of 35, 36, 37, 38, 39, 40.

There is no need for a for loop. find_all() returns a list-like result, so if you want the 2nd element, you can index it directly with product_sizes.find_all('li')[1]

You can do this in fewer lines of code, but the steps below show the logic:

# Find the <dl> element with id='dl_1' inside view_product
product_sizes = view_product.find('dl', id='dl_1')

# From product_sizes, find all the 'li' tags and choose the 2nd element
product_size = product_sizes.find_all('li')[1]

# Get the text
product_size = product_size.span.text

# print the text
print(product_size)
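
To make the difference between indexing the string and indexing the list concrete, here is a small self-contained sketch. The HTML is hypothetical, since the question does not show the real page; only the `dl`/`li`/`span` structure is assumed from the code above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page's HTML is not shown in the question.
html = """
<dl id="dl_1">
  <dd><ul>
    <li><span>35</span></li>
    <li><span>36</span></li>
    <li><span>37</span></li>
  </ul></dd>
</dl>
"""

view_product = BeautifulSoup(html, "html.parser")
items = view_product.find("dl", id="dl_1").find_all("li")

# Indexing the string: the first character of every size
first_chars = [li.span.text[0] for li in items]

# Indexing the list: the whole text of the second <li>
second_size = items[1].span.text

print(first_chars)
print(second_size)
```

The `[0]` in the first expression is applied to each string, while the `[1]` in the second is applied to the list returned by `find_all`.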
chitown88

You probably don't need a loop to achieve what you are looking for.

find_all()  # returns a list

You can just call

product_sizes.find_all('li')

This returns a list, which you can then index or slice as needed. For instance, to get the 2nd element:

print(product_sizes.find_all('li')[1].text)

Finally, your code will look like the following:

product_sizes = view_product.find('dl', id='dl_1')
print(product_sizes.find_all('li')[1].text)  # prints the second element

Output:

36
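
Indexing directly raises an IndexError when the page has fewer items than expected. One way to guard against that is a small helper, sketched below. The name `text_at` is made up for illustration, and `SimpleNamespace` objects stand in for the tags returned by `find_all` so the snippet runs without a real page.

```python
from types import SimpleNamespace


def text_at(items, index, default=""):
    """Return the stripped .text of items[index], or default if out of range."""
    try:
        return items[index].text.strip()
    except IndexError:
        return default


# Stand-ins for the <li> tags; real code would pass product_sizes.find_all('li')
fake_items = [SimpleNamespace(text=" 35\n"), SimpleNamespace(text=" 36\n")]

second = text_at(fake_items, 1)   # element exists
missing = text_at(fake_items, 9)  # out of range, falls back to the default

print(repr(second))
print(repr(missing))
```

The helper only relies on indexing and a `.text` attribute, so it should accept the result of `find_all` unchanged.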

Thanks, all, for your input. I tried every suggestion and they all work. It seems simple enough. The reason I want this is that I want to write the sizes to a CSV file as a single row, and whenever there is an error I want the cell left blank so the other data keeps its place, spreadsheet-style. But that is a different problem for a different day. I want to study first; if I am still stuck later, I will ask in a new thread.

By the way, below is the code I wrote using what I learned from the answers here.

product_sizes = view_product.find('dl', id='dl_1')
product_size01 = product_sizes.find_all('li')[0].text.replace('\r', '').replace('\n', '').replace(" ","")
product_size02 = product_sizes.find_all('li')[1].text.replace('\r', '').replace('\n', '').replace(" ","")
product_size03 = product_sizes.find_all('li')[2].text.replace('\r', '').replace('\n', '').replace(" ","")
product_size04 = product_sizes.find_all('li')[3].text.replace('\r', '').replace('\n', '').replace(" ","")
product_size05 = product_sizes.find_all('li')[4].text.replace('\r', '').replace('\n', '').replace(" ","")
product_size06 = product_sizes.find_all('li')[5].text.replace('\r', '').replace('\n', '').replace(" ","")
product_size07 = product_sizes.find_all('li')[6].text.replace('\r', '').replace('\n', '').replace(" ","")
product_size08 = product_sizes.find_all('li')[7].text.replace('\r', '').replace('\n', '').replace(" ","")
product_size09 = product_sizes.find_all('li')[8].text.replace('\r', '').replace('\n', '').replace(" ","")
product_size10 = product_sizes.find_all('li')[9].text.replace('\r', '').replace('\n', '').replace(" ","")
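
For the "one CSV row with blanks" idea, a rough sketch of one possible approach is to collect the cleaned texts in a list and pad it to a fixed width instead of using ten separate variables. The helper names `clean` and `pad_row` are made up for illustration, the width of 10 mirrors the ten variables above, and the sample strings stand in for the `.text` of each `li`.

```python
import csv
import io


def clean(text):
    # Same effect as the chained .replace() calls: drop \r, \n and spaces
    return text.replace("\r", "").replace("\n", "").replace(" ", "")


def pad_row(values, width, blank=""):
    # Truncate or pad so the row always has exactly `width` cells
    values = list(values)[:width]
    return values + [blank] * (width - len(values))


# Stand-in for [li.text for li in product_sizes.find_all('li')]
raw_texts = ["\r\n 35 \n", "\r\n 36 \n", "\r\n 37 \n"]

row = pad_row([clean(t) for t in raw_texts], width=10)

# Write to an in-memory buffer; a real script would open a file instead
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
print(buffer.getvalue())
```

Because missing sizes become empty cells, no IndexError is raised when a product has fewer than ten sizes, and the columns stay aligned.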

Thank you all for the fast answers and the awesome community.