I have the following code:

import requests, pandas as pd
from bs4 import BeautifulSoup
s = requests.session()
url2 = r'https://www.har.com/homedetail/6408-burgoyne-rd-157-houston-tx-77057/3380601'
r = s.get(url2)
soup = BeautifulSoup(r.text, 'html.parser')    
z2 = soup.find_all("div", {"class": 'dc_blocks_2c'})

z2 is a long list. How do I get all the variables and values into a DataFrame, i.e. gather the dc_label and dc_value pairs?

smci
Zanam
  • because it was irreproducible until you supplied the missing bits. I have retracted the close vote now. – smci May 15 '21 at 02:08
  • You need a list comprehension. Also related: [Beautiful Soup find children for particular div](https://stackoverflow.com/questions/13202087/beautiful-soup-find-children-for-particular-div) – smci May 15 '21 at 02:15

2 Answers

pd.DataFrame([el.find_all('div', {'dc_label','dc_value'}) for el in z2])

                               0                                                  1
0                        [MLS#:]                                  [30509690 (HAR) ]
1               [Listing Price:]  [$ 248,890 ($151.76/sqft.) , [], [$Convert ], ...
2              [Listing Status:]  [[\n, [\n, <span class="status_icon_1" style="...
3                     [Address:]                          [6408 Burgoyne Road #157]
4                    [Unit No.:]                                              [157]
5                        [City:]                                        [[Houston]]
6                       [State:]                                               [TX]
7                    [Zip Code:]                                          [[77057]]
8                      [County:]                                  [[Harris County]]
9                 [Subdivision:]  [ , [Briarwest T/H Condo (View subdivision pri...
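
The cells above still hold lists of tags rather than plain strings. As a possible follow-up, here is a minimal sketch that pulls the text out of each label/value pair. It assumes every dc_blocks_2c div contains one dc_label div and one dc_value div; SAMPLE_HTML is made-up markup standing in for the real page, and blocks_to_frame is a hypothetical helper name.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Made-up markup that imitates the structure implied by the question;
# the real page may nest things differently.
SAMPLE_HTML = """
<div class="dc_blocks_2c">
  <div class="dc_label">MLS#:</div>
  <div class="dc_value">30509690 (HAR) </div>
</div>
<div class="dc_blocks_2c">
  <div class="dc_label">City:</div>
  <div class="dc_value"><a href="#">Houston</a></div>
</div>
<div class="dc_blocks_2c">
  <div class="dc_label">Zip Code:</div>
  <div class="dc_value"><a href="#">77057</a></div>
</div>
"""


def blocks_to_frame(blocks):
    """Build a two-column DataFrame from dc_blocks_2c elements (hypothetical helper)."""
    rows = []
    for block in blocks:
        label = block.find("div", class_="dc_label")
        value = block.find("div", class_="dc_value")
        if label is None or value is None:
            continue  # skip blocks that lack one half of the pair
        rows.append({
            # get_text() flattens nested tags such as <a> into plain text
            "label": label.get_text(" ", strip=True).rstrip(":"),
            "value": value.get_text(" ", strip=True),
        })
    return pd.DataFrame(rows, columns=["label", "value"])


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
z2 = soup.find_all("div", {"class": "dc_blocks_2c"})
df = blocks_to_frame(z2)
print(df)
```

With the real page, z2 from the question could be passed to blocks_to_frame in the same way, provided the markup matches the assumption above.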
smci

When reading tables, it is sometimes easier to just use the pandas read_html() function. If it doesn't capture everything you want, you can write code for the rest. It just depends on what you need from the page.

import pandas as pd

url = 'https://www.har.com/homedetail/6408-burgoyne-rd-157-houston-tx-77057/3380601'
# read_html() returns a list with one DataFrame per <table> found on the page
list_of_dataframes = pd.read_html(url)
for df in list_of_dataframes:
    print(df)

Or get a DataFrame by its position in the list. For example:

df = list_of_dataframes[2]

All dataframes captured:

                      0           1
0  Original List Price:    $249,890
1        Price Reduced:     -$1,000
2   Current List Price:    $248,890
3    Last Reduction on:  05/14/2021
                      0           1
0  Original List Price:    $249,890
1        Price Reduced:     -$1,000
2   Current List Price:    $248,890
3    Last Reduction on:  05/14/2021
   Tax Year Cost/sqft Market Value  Change Tax Assessment Change.1
0      2020   $114.36     $187,555  -4.88%       $187,555   -4.88%
1      2019   $120.22     $197,168  -9.04%       $197,168   -9.04%
2      2018   $132.18     $216,768   0.00%       $216,768    0.00%
3      2017   $132.18     $216,768   5.74%       $216,768    9.48%
4      2016   $125.00     $205,000   2.19%       $198,000    6.90%
5      2015   $122.32     $200,612  18.71%       $185,219   10.00%
6      2014   $103.05     $169,000  10.40%       $168,381   10.00%
7      2013    $93.34     $153,074   0.00%       $153,074    0.00%
8      2012    $93.34     $153,074     NaN       $153,074      NaN
                           0         1
0         Market Land Value:   $39,852
1  Market Improvement Value:  $147,703
2        Total Market Value:  $187,555
                             0         1
0                 HOUSTON ISD:  1.1367 %
1               HARRIS COUNTY:  0.4071 %
2       HC FLOOD CONTROL DIST:  0.0279 %
3   PORT OF HOUSTON AUTHORITY:  0.0107 %
4            HC HOSPITAL DIST:  0.1659 %
5  HC DEPARTMENT OF EDUCATION:  0.0050 %
6   HOUSTON COMMUNITY COLLEGE:  0.1003 %
7             HOUSTON CITY OF:  0.5679 %
8              Total Tax Rate:  2.4216 %
                                                                          0            1
0  Estimated Monthly Principal & Interest  (Based on the calculation below)        $ 951
1            Estimated Monthly Property Tax  (Based on Tax Assessment 2020)        $ 378
2                                                     Home Owners Insurance  Get a Quote
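
Several of the frames above are plain label/value tables with columns 0 and 1. If you want to look values up by label, one option is to turn such a frame into a Series indexed by the label. This is only a sketch: the frame below is typed in by hand from the printed output instead of being fetched, and label_value_series and parse_dollars are hypothetical helper names.

```python
import pandas as pd

# Hand-typed stand-in for one of the two-column frames printed above.
df = pd.DataFrame({
    0: ["Market Land Value:", "Market Improvement Value:", "Total Market Value:"],
    1: ["$39,852", "$147,703", "$187,555"],
})


def label_value_series(frame):
    """Index column 1 by the labels in column 0, dropping the trailing colon."""
    labels = frame[0].str.strip().str.rstrip(":")
    return pd.Series(frame[1].to_numpy(), index=pd.Index(labels, name="label"), name="value")


def parse_dollars(text):
    """Convert a string such as '$187,555' to an int (assumes whole dollars)."""
    return int(text.replace("$", "").replace(",", ""))


values = label_value_series(df)
total = parse_dollars(values["Total Market Value"])
print(values)
print(total)
```

The same reshaping should apply to any entry of list_of_dataframes that has this two-column layout; the tax-history table has real headers and does not need it.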
Jonathan Leon