I use the code below to read tables from websites. With the first example, everything works as expected. With the second example (the commented-out variables), I only get the first column of the table. I can't find the reason for it. Can somebody help here?
A simple way to produce nicer output of the tables would also be welcome.
import urllib2
import pprint
from bs4 import BeautifulSoup
URL = 'http://www.proplanta.de/Markt-und-Preis/MATIF-Raps/'
TABLENR = 36
#URL = 'http://www1.chineseshipping.com.cn/en/indices/ccfinew.jsp'
#TABLENR = 4
req = urllib2.Request(URL, headers={'User-Agent': 'My Browser'})
con = urllib2.urlopen(req)
html = con.read()
soup = BeautifulSoup(html)
tables = soup.find_all('table')
data = []
rows = tables[TABLENR].find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])  # Get rid of empty values
pprint.pprint(data)
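For the nicer output, here is a minimal sketch of what I have in mind, assuming `data` is a list of rows where each row is a list of strings, as built by the loop above. `format_table` is a hypothetical helper name (not part of BeautifulSoup or `pprint`), and the sample rows are made up for illustration rather than taken from either site.

```python
def format_table(rows, sep="  "):
    """Return rows (lists of strings) as text with left-aligned, padded columns."""
    # Skip rows without any cells, e.g. <tr> elements that only contain <th>
    rows = [row for row in rows if row]
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of cells
    padded = [list(row) + [""] * (ncols - len(row)) for row in rows]
    # Each column is as wide as its longest cell
    widths = [max(len(row[i]) for row in padded) for i in range(ncols)]
    lines = []
    for row in padded:
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append(sep.join(cells).rstrip())
    return "\n".join(lines)


# Made-up sample rows standing in for the scraped `data`
sample = [[], ["Month", "Price"], ["May 2015", "370.25"], ["Aug", "365.00"]]
print(format_table(sample))
```

Calling `print(format_table(data))` in place of `pprint.pprint(data)` would then print one table row per line, with the columns lined up.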