I've been trying to extract the link text from the following HTML with BeautifulSoup in Python:

<a class="w-menu__link" href="https://www.universidadviu.es/grado-economia/">Grado en Economía</a>

I need to extract the text "Grado en Economía" from this line and from every similar line in the HTML. For example:

<a class="w-menu__link" href="https://www.universidadviu.es/grado-derecho/">Grado en Derecho</a>

In this line I need to extract "Grado en Derecho".

I can extract the class and the href, but I don't know how to extract the rest of the text. I'm using the following code:

from urllib.request import urlopen
from bs4 import BeautifulSoup

list_of_links_graus = []

html_graus = urlopen("https://www.universidadviu.es/grados-online-viu/")  # Insert the URL to extract from
bsObj_graus = BeautifulSoup(html_graus.read(), "html.parser")  # name the parser explicitly

for link in bsObj_graus.find_all('a'):
    list_of_links_graus.append(link.get('href'))

I would also appreciate it if someone could edit the title of this question to fit the real problem: I'm not an HTML expert, and I suppose I'm not extracting simple text (as the title says).

Thanks to all in advance.

  • Possible duplicate of [Python: BeautifulSoup extract text from anchor tag](https://stackoverflow.com/questions/11716380/python-beautifulsoup-extract-text-from-anchor-tag) – Tobey Jul 25 '18 at 09:20

1 Answer


Use the text attribute of each tag, which holds the text enclosed by the tag:

for link in bsObj_graus.find_all('a'):
    list_of_links_graus.append((link.get('href'), link.text))
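To try this without fetching the live page, here is a minimal sketch that parses the two anchors from the question as an inline string. The third "Contacto" anchor is a made-up extra, added only to show filtering by class, and the names html, soup and links are illustrative. It assumes you only want the w-menu__link anchors rather than every link on the page.

```python
from bs4 import BeautifulSoup

# Sample markup: the first two anchors come from the question,
# the third ("Contacto") is a hypothetical link without the menu class.
html = """
<a class="w-menu__link" href="https://www.universidadviu.es/grado-economia/">Grado en Economía</a>
<a class="w-menu__link" href="https://www.universidadviu.es/grado-derecho/">Grado en Derecho</a>
<a href="/contacto/">Contacto</a>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ restricts the match to the menu links;
# get_text(strip=True) returns the enclosed text with surrounding whitespace trimmed.
links = [
    (a.get("href"), a.get_text(strip=True))
    for a in soup.find_all("a", class_="w-menu__link")
]

for href, text in links:
    print(href, text)
```

Filtering on the class keeps navigation, footer and other unrelated anchors out of the result; drop the class_ argument if you do want every link.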
Tobey