Extract items out of each List element and populate Dataframe with it

Question

The following code generates a result set as a list:

#Dates
from bs4 import BeautifulSoup as bs
import pandas as pd
pd.set_option('display.max_colwidth', 500)
import requests
myURL =  "xxxxx"
page = requests.get(myURL)
#print (page)
soup = bs(page.content,"html.parser")
#print(soup.prettify)
rSet = soup.find_all("td", class_="first")
for el in rSet:
    print(el.find("first"))  # returns None
    print(el)  # returns e.g. <td class="first" rowspan="1">00:00 - 01:00</td>

With elements that look like this:

<td class="first" rowspan="1">00:00 - 01:00</td>
<td class="first" rowspan="1">01:00 - 02:00</td>

I want to extract "00:00" and "01:00" (the start and end times) from each element and populate a DataFrame with them in two columns. What would be the best way to achieve this?

score 1 · Accepted Answer · answered Aug 24 '21 at 15:07


Have you tried this?

print(el.text)
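
Building on `el.text`, one way to get the two columns the question asks for is to split each cell's text on the hyphen. This is a minimal sketch, assuming every cell follows the `HH:MM - HH:MM` pattern shown in the question; the inline `html` sample and the column names `Start` and `End` are stand-ins for illustration, not taken from the real page.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Sample markup standing in for the real page
# (assumption: same cell structure as in the question).
html = """
<table>
  <tr><td class="first" rowspan="1">00:00 - 01:00</td></tr>
  <tr><td class="first" rowspan="1">01:00 - 02:00</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
cells = soup.find_all("td", class_="first")

rows = []
for el in cells:
    # get_text(strip=True) returns the cell text without surrounding whitespace.
    text = el.get_text(strip=True)
    # Split once on the hyphen and trim the spaces around each half.
    start, end = (part.strip() for part in text.split("-", 1))
    rows.append({"Start": start, "End": end})

# "Start" and "End" are hypothetical column names.
df = pd.DataFrame(rows, columns=["Start", "End"])
print(df)
```

With the real page, `html` would be replaced by `page.content` from the `requests.get` call in the question. A cell without a hyphen would make the unpacking raise a `ValueError`, so irregular rows need a check before the split.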


dimelu


It worked. Is it an HTML parser? I now get `00:00 - 01:00` when I print it. Do you know what the reason is? – OldNick Aug 24 '21 at 15:13

`.text` is a BeautifulSoup property that collects all of the tag's child strings and returns them as a single Unicode string. For more, [look at this](https://stackoverflow.com/questions/25327693/difference-between-string-and-text-beautifulsoup). – dimelu Aug 24 '21 at 15:30
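
To illustrate the comment above, here is a small sketch of how `.text` and `.string` differ, and of why `el.find("first")` in the question printed `None`: the first argument of `find` is a tag name, so it searches for a `<first>` element rather than for the class `first`. The two HTML snippets are made-up samples, and the variable names `simple` and `nested` are hypothetical.

```python
from bs4 import BeautifulSoup

# A cell with a single text child, as in the question.
simple = BeautifulSoup(
    '<td class="first" rowspan="1">00:00 - 01:00</td>', "html.parser"
).find("td")

# A made-up cell with a nested tag, to show where .string and .text diverge.
nested = BeautifulSoup(
    '<td class="first">00:00 - <b>01:00</b></td>', "html.parser"
).find("td")

print(simple.text)           # all descendant strings joined together
print(simple.string)         # the single child string
print(nested.text)           # still the joined text of all descendants
print(nested.string)         # None, because the tag has more than one child
print(simple.find("first"))  # None, because there is no <first> tag inside the cell
```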
