
I'm scraping web pages with Python's BeautifulSoup, requests, and pandas libraries, collecting information on many items across many pages with a for loop. When I run this code, I only get a separate list for each page, so I want to change the code to combine them into one list.

  • To be clear, my problem is not concatenating lists as such; I already know how to do that. The problem is that the function returns one list per call, so how can I change the code to produce a single list with everything concatenated, or at least a [[list],[list],[list]] structure that I can easily concatenate afterwards?

Environment: Windows, Jupyter Notebook, Python

import bs4
import requests

def get_pdinfo(content):
    # Extract the title and sale price from one product block.
    ptag_title = content.find("p", {"class": "title"})
    ptag_price = content.find("p", {"class": "price-sale"})
    return {"title": ptag_title.text, "price": ptag_price.text}

def get_pd_page(url):
    # Fetch one page and return a list of product dicts.
    result = requests.get(url)
    bs_obj = bs4.BeautifulSoup(result.content, "html.parser")
    pbl = bs_obj.find("div", {"class": "product-box-list"})
    contents = pbl.findAll("div", {"class": "content"})
    pdinfo_list = [get_pdinfo(content) for content in contents]
    return pdinfo_list

n = 10

urls = [None] * n
fix_str = "https://www.abcdef.com"

for page_num in range(0,n):
    page_str = fix_str + str(page_num+1)
    urls[page_num] = page_str
    page_products = get_pd_page(urls[page_num])
    print(page_products)

The result is a separate list for each page:

[{'title': a, 'price': b}, {'title': c, 'price': d}]
[{'title': d, 'price': e}, {'title': f, 'price': g}]

I want to combine these into one list:

[{'title': a, 'price': b}, {'title': c, 'price': d}, {'title': d, 'price': e}, {'title': f, 'price': g}]

Or, at least, a list of lists:

[[{'title': a, 'price': b}, {'title': c, 'price': d}], [{'title': d, 'price': e}, {'title': f, 'price': g}]]

1 Answer


Use the + operator to concatenate any number of lists:

In [19]: li1 = [1,2,3]

In [20]: li2 = [4,5,6]

In [21]: li1+li2
Out[21]: [1, 2, 3, 4, 5, 6]
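
When the lists arrive one at a time inside a loop, the same idea can be applied by keeping an accumulator outside the loop. The following is a minimal sketch; `batches` and `combined` are made-up names standing in for whatever each loop pass produces.

```python
# Hypothetical per-iteration results, standing in for what each loop pass returns.
batches = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

combined = []  # accumulator created once, before the loop
for batch in batches:
    combined += batch  # in-place concatenation, same effect as combined.extend(batch)

print(combined)
```

Note that `combined + batch` on its own would build a new list and discard it; `+=` or `extend` is what updates the accumulator.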

Or use a list comprehension to concatenate the sublists of a list of lists, which is also called flattening a list:

In [23]: li = [[1,2,3],[4,5,6],[7,8,9]]

In [30]: flat_list = [item for sublist in li for item in sublist]

In [31]: flat_list
Out[31]: [1, 2, 3, 4, 5, 6, 7, 8, 9]
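
As an alternative to the nested comprehension, the standard library's `itertools.chain.from_iterable` flattens one level of nesting as well. A small sketch with dicts shaped like the ones in the question; the values in `pages` are placeholders, not real data.

```python
from itertools import chain

# Placeholder data: one inner list per page, each holding product dicts.
pages = [
    [{"title": "a", "price": "b"}, {"title": "c", "price": "d"}],
    [{"title": "e", "price": "f"}],
]

# chain.from_iterable yields the items of each inner list in order.
flat = list(chain.from_iterable(pages))
print(flat)
```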

These are simpler examples than what you are trying to achieve, but a similar approach will solve your problem in the end!
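
Applied to the loop in the question, this could look roughly like the sketch below. `fake_get_pd_page` is a hypothetical stand-in for the question's `get_pd_page` so that the example runs without network access; `all_products` and `products_by_page` are also names introduced here for illustration.

```python
def fake_get_pd_page(url):
    """Hypothetical stand-in for get_pd_page: returns a list of product dicts."""
    return [{"title": f"item from {url}", "price": "0"}]

n = 3
fix_str = "https://www.abcdef.com"
urls = [fix_str + str(page_num + 1) for page_num in range(n)]

all_products = []      # one flat list of dicts across all pages
products_by_page = []  # list of lists, one inner list per page

for url in urls:
    page_products = fake_get_pd_page(url)
    all_products.extend(page_products)      # adds the items themselves
    products_by_page.append(page_products)  # adds the page's list as one element

print(all_products)
print(products_by_page)
```

`extend` gives the single flat list, while `append` gives the list-of-lists form; in the real code, swapping `fake_get_pd_page` for `get_pd_page` should be the only change needed, assuming it returns a list as shown in the question.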

Devesh Kumar Singh