
I am trying to scrape the tables from this link. I located the table content using the browser's F12 inspector.


I used the following code, but it prints None. Could someone help? Thanks.

import requests
import json
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup

url = 'http://bjjs.zjw.beijing.gov.cn/eportal/ui?pageId=308894'

website_url = requests.get(url).text
soup = BeautifulSoup(website_url, 'lxml')
table = soup.find('table', {'class': 'gridview'})
#table = soup.find('table', {'class': 'criteria'})

print(table)

Please also see the reference listed below. I want to do something similar, but the structure of this web page seems different.

Update: The following code works for one page, but I also need to loop over the other pages.

import requests
import json
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup

url = 'http://bjjs.zjw.beijing.gov.cn/eportal/ui?pageId=308894'

website_url = requests.get(url).text
soup = BeautifulSoup(website_url, 'lxml')
table = soup.find('table', {'class': 'gridview'})
#https://stackoverflow.com/questions/51090632/python-excel-export
df = pd.read_html(str(table))[0]
df.to_excel('test.xlsx', index = False)

Output:

   序号  ...      竣工备案日期
0   1  ...  2020-01-22
1   2  ...  2020-01-22
2   3  ...  2020-01-22
3   4  ...  2020-01-22
4   5  ...  2020-01-22

[5 rows x 9 columns]
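
For the remaining pages, one possible shape for the loop is sketched below. It is only a sketch under assumptions: the question does not show how this site selects a page, so `fetch_page` and `parse_rows` are hypothetical callables supplied by the caller, and the loop simply stops at the first page that yields no rows.

```python
from typing import Callable, List


def scrape_all_pages(
    fetch_page: Callable[[int], str],
    parse_rows: Callable[[str], List[List[str]]],
    max_pages: int = 100,
) -> List[List[str]]:
    """Collect rows from pages 1..max_pages, stopping at the first empty page.

    fetch_page(n) is assumed to return the HTML of page n (hypothetical helper).
    parse_rows(html) is assumed to return the table rows found in that HTML.
    """
    all_rows: List[List[str]] = []
    for page_number in range(1, max_pages + 1):
        html = fetch_page(page_number)
        rows = parse_rows(html)
        if not rows:
            # An empty page is treated as "no more data".
            break
        all_rows.extend(rows)
    return all_rows


if __name__ == "__main__":
    # Stand-in data so the sketch runs without network access.
    fake_pages = {1: "a,b", 2: "c,d"}
    rows = scrape_all_pages(
        lambda n: fake_pages.get(n, ""),
        lambda html: [html.split(",")] if html else [],
    )
    print(rows)
```

In practice `fetch_page` would wrap the `requests` call with whatever page parameter the site actually expects, and `parse_rows` would wrap the table lookup already used above.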

Related reference:

https://medium.com/analytics-vidhya/web-scraping-wiki-tables-using-beautifulsoup-and-python-6b9ea26d8722

ah bon

1 Answer


You can get the elements inside the <tr>...</tr> tags like this:

table = soup.find_all('table', {'class': 'gridview'})

for elements in table:
    # Skip the header row, then print the text of each remaining row
    inner_elements = elements.find_all('tr')[1:]

    for text_for_elements in inner_elements:
        print(text_for_elements.text)

Output:

1
朝阳区东三环北路38号院4号楼3层301室内局部装修工程
威沃克办公服务(北京)有限公司
袁永懿
上海东园建筑装饰有限公司
陈振华
0065朝竣2020(装)0053号
北京市朝阳区住房和城乡建设委员会
2020-01-22
2
北京市朝阳区新源南路3号14层04单元A1704室内装修工程
重庆金融资产交易所有限责任公司
罗珊珊
深圳安星建设集团有限公司
张惠富
0066朝竣2020(装)0054号
北京市朝阳区住房和城乡建设委员会
2020-01-22
......
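
The printed text above runs all cells together, one per line. If one list per row is more convenient, the same idea can be sketched as follows. This is a dependency-free variant that uses Python's standard-library `html.parser` rather than BeautifulSoup; the `GridRows` class, the `table_rows` helper, and the cut-down sample HTML are all made up for illustration and assume the table keeps the `gridview` class.

```python
from html.parser import HTMLParser


class GridRows(HTMLParser):
    """Collect cell text from every row of tables that carry a given class."""

    def __init__(self, table_class="gridview"):
        super().__init__()
        self.table_class = table_class
        self.depth = 0        # > 0 while inside a matching table
        self.in_cell = False  # True while inside a <td> or <th>
        self.rows = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "table" and (self.depth or self.table_class in classes):
            self.depth += 1
        elif self.depth and tag == "tr":
            self.rows.append([])
        elif self.depth and tag in ("td", "th") and self.rows:
            self.rows[-1].append("")
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "table" and self.depth:
            self.depth -= 1
        elif tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.rows[-1][-1] += data.strip()


def table_rows(html, table_class="gridview"):
    """Return the data rows as lists of cell strings, skipping the header row."""
    parser = GridRows(table_class)
    parser.feed(html)
    return parser.rows[1:]  # same [1:] slice as in the loop above


# Cut-down sample with two of the nine columns, for illustration only.
sample = """
<table class="gridview">
  <tr><th>序号</th><th>竣工备案日期</th></tr>
  <tr><td>1</td><td>2020-01-22</td></tr>
  <tr><td>2</td><td>2020-01-22</td></tr>
</table>
"""
print(table_rows(sample))
```

Each inner list can then be written to a file or handed to `pd.DataFrame` along with the header names.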
Omer Tekbiyik