I am scraping data from a website and using a variable to hold the lists of data.

For example:

x = [a,d,d,d] [a,f,f,f] [a,f,g,r]

How can I put them into a single list, like the following?

x = [[a,d,d,d],[a,f,f,f],[a,f,g,r]]

The code I used:

from urllib.request import urlopen
from bs4 import BeautifulSoup
import csv

url = "http://www.dicj.gov.mo/web/en/information/DadosEstat_mensal/2018/report_en.xml?id=5"
html = urlopen(url)
soup = BeautifulSoup(html, 'xml')

my_data = soup.find_all('RECORD')[0:]
for tds in my_data:
   x = [i.get_text() for i in tds.find_all('DATA')]
   print(x)
khelwood
  • By the way, in your code, the `[0:]` in `my_data = soup.find_all('RECORD')[0:]` does not serve any purpose so you can remove it. It essentially means select all list elements from the zero-th index to the end of the list. – Abhinav Sood Nov 29 '18 at 15:12
  • @AbhinavSood Thanks for the reminder! – Framer B Nov 29 '18 at 15:26
  • @TimCastelijns I read that question before, but I did not understand it at the time. Since I am new here, should I delete this question if it is a duplicate? – Framer B Nov 29 '18 at 15:28
  • @FramerB You are under no obligation to delete your question, especially if you didn't find the linked duplicate useful. If enough other people vote that it is a duplicate, it will be automatically marked as one. – khelwood Nov 29 '18 at 15:41
  • @khelwood Got it! Thanks – Framer B Nov 30 '18 at 04:52

1 Answer

So you have a new x each time through your for loop. You can append each x to a list.

xs = []
for tds in my_data:
    x = [i.get_text() for i in tds.find_all('DATA')]
    print(x)
    xs.append(x)

Or you can use a list comprehension and do it all in one line.

xs = [[i.get_text() for i in tds.find_all('DATA')] for tds in my_data]
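
To check that the two approaches give the same result without hitting the network, here is a minimal sketch. It is not the original scraping code: it swaps BeautifulSoup for the standard library's xml.etree.ElementTree, and SAMPLE_XML is a made-up stand-in for the real report, assuming only that it consists of RECORD elements containing DATA elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the downloaded report;
# the real document's layout may differ.
SAMPLE_XML = """
<REPORT>
  <RECORD><DATA>a</DATA><DATA>d</DATA><DATA>d</DATA><DATA>d</DATA></RECORD>
  <RECORD><DATA>a</DATA><DATA>f</DATA><DATA>f</DATA><DATA>f</DATA></RECORD>
  <RECORD><DATA>a</DATA><DATA>f</DATA><DATA>g</DATA><DATA>r</DATA></RECORD>
</REPORT>
""".strip()

root = ET.fromstring(SAMPLE_XML)
records = list(root.iter('RECORD'))

# Approach 1: build each row inside the loop and append it to an outer list.
xs_loop = []
for record in records:
    row = [data.text for data in record.iter('DATA')]
    xs_loop.append(row)

# Approach 2: the same thing as a nested list comprehension.
xs_comp = [[data.text for data in record.iter('DATA')] for record in records]

print(xs_loop)             # [['a', 'd', 'd', 'd'], ['a', 'f', 'f', 'f'], ['a', 'f', 'g', 'r']]
print(xs_loop == xs_comp)  # True
```

The loop version is easier to extend if each row needs extra processing; the comprehension is more compact when it does not.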
khelwood
  • surely there is a dupe for this – Tim Nov 29 '18 at 15:00
  • For the first suggestion, when I write `print(xs.append(x))`, the output is all `None`. But the second one works! – Framer B Nov 29 '18 at 15:07
  • They both work. They produce the same final list `xs`. There's no point trying to print `xs.append(x)` since `list.append` always returns None. – khelwood Nov 29 '18 at 15:09
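
The point about `list.append` can be seen in a short sketch (the row values here are placeholders, not data from the report): `append` changes the list in place and returns None, so print the list itself rather than the result of the call.

```python
xs = []

# append mutates xs in place; its return value is always None.
result = xs.append(['a', 'd', 'd', 'd'])

print(result)  # None
print(xs)      # [['a', 'd', 'd', 'd']]
```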