I have an anchor tag as follows:
<a class="gsc_a_at" href="/citations?view_op=view_citation&hl=en&user=11JgipcAAAAJ&pagesize=100&citation_for_view=11JgipcAAAAJ:j3f4tGmQtD8C">
I want to extract the value that follows citation_for_view using BeautifulSoup. How can I do it without regular expressions?
Below is what I tried.
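#!/usr/bin/python
from bs4 import BeautifulSoup

# The anchor tag shown above, as a string
input_data = '''<a class="gsc_a_at" href="/citations?view_op=view_citation&hl=en&user=11JgipcAAAAJ&pagesize=100&citation_for_view=11JgipcAAAAJ:j3f4tGmQtD8C">'''
soup = BeautifulSoup(input_data, "html.parser")
# Print the href of every anchor tag that has one
for href_tags in soup.find_all('a', href=True):
    print(href_tags['href'])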
This outputs:
/citations?view_op=view_citation&hl=en&user=11JgipcAAAAJ&pagesize=100&citation_for_view=11JgipcAAAAJ:j3f4tGmQtD8C
How can I extract the value of citation_for_view from within the href and output just 11JgipcAAAAJ:j3f4tGmQtD8C?
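
One possible direction, as a minimal sketch rather than a definitive answer: the href that BeautifulSoup returns is a plain string, so its query string can be split with the standard library's URL parser instead of a regular expression. The helper name extract_query_param is hypothetical, and the sketch assumes Python 3 (in Python 2 the same two functions live in the urlparse module).

```python
from urllib.parse import urlparse, parse_qs


def extract_query_param(href, name):
    """Return the first value of query parameter `name` in `href`, or None."""
    # urlparse splits the URL into components; .query is the part after '?'
    query = urlparse(href).query
    # parse_qs maps each parameter name to a list of its values
    values = parse_qs(query).get(name)
    return values[0] if values else None


# The href string from the question; in the loop above this would be href_tags['href']
href = ("/citations?view_op=view_citation&hl=en&user=11JgipcAAAAJ"
        "&pagesize=100&citation_for_view=11JgipcAAAAJ:j3f4tGmQtD8C")

print(extract_query_param(href, "citation_for_view"))
# prints: 11JgipcAAAAJ:j3f4tGmQtD8C
```

Inside the find_all loop, the same call could be made on each href_tags['href'] value; a parameter that is absent yields None rather than raising an error.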