You cannot use regular expressions in XPath 1.0 (even if regexes would surely be useful there!). In XPath 2.0 (which lxml does not support), regexes can be used in some functions, for instance matches() or replace().
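As an aside, lxml does offer a workaround that sits outside XPath 1.0 proper: it exposes the EXSLT regular-expression extension functions when you bind their namespace. The following is only a minimal sketch under assumptions. The SAMPLE markup is made up for illustration and is not the real events page, and the pattern assumes that event links end in a numeric ID.

```python
from lxml import html

# Hypothetical sample markup, standing in for the real events page
SAMPLE = """
<html><body>
  <a href="/institute/event/11147">Papel Picado Workshop Series: Session 5</a>
  <a href="/institute/event/archive">Past events</a>
  <a href="/institute/about">About</a>
</body></html>
"""

# Namespace URI of the EXSLT regular-expressions module
EXSLT_RE = {"re": "http://exslt.org/regular-expressions"}

tree = html.fromstring(SAMPLE)

# re:test() is an EXSLT extension function, not part of XPath 1.0 itself.
# The pattern (an assumption) keeps only hrefs that end in a numeric ID.
numeric_event_links = tree.xpath(
    r"//a[re:test(@href, '^/institute/event/\d+$')]",
    namespaces=EXSLT_RE,
)

for link in numeric_event_links:
    print(link.get("href"), "->", link.text)
```

For the data in this question a regex is probably not needed, since a plain starts-with() test already narrows the links down enough.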
If I understood correctly, you are looking for this piece of data:
<a href='/institute/event/11147'>Papel Picado Workshop Series: Session 5</a>
You can find those <a> elements with
//a[starts-with(@href,'/institute/event/')]
But note that this returns a list of elements - whereas it seems that you expect one single item as the result. Please explain more clearly what exactly you need as the result.
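To illustrate the point about the return value, here is a small sketch that runs the same expression against a made-up snippet (the SAMPLE markup is an assumption, not the real page). It shows that the result is a Python list, and one way to pick a single element from it if that is what you need.

```python
from lxml import html

# Hypothetical sample markup with two event links and one unrelated link
SAMPLE = """
<html><body>
  <a href="/institute/event/11147">Papel Picado Workshop Series: Session 5</a>
  <a href="/institute/event/11148">Korean Culture Night</a>
  <a href="/institute/about">About</a>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# xpath() returns a list of matching elements, even if only one matches
links = tree.xpath("//a[starts-with(@href, '/institute/event/')]")
print(type(links).__name__, len(links))

# If a single item is wanted, take the first match (or None if there is none)
first = links[0] if links else None
if first is not None:
    print(first.get("href"), first.text_content())
```

Guarding against an empty list avoids an IndexError when the page contains no matching links.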
As a suggestion, how about this:
from lxml import html
import requests

# Fetch the events page and parse it into an element tree
page = requests.get('http://web.international.ucla.edu/institute/events')
tree = html.fromstring(page.text)

# Select the text of every link whose href starts with /institute/event/
event_titles = tree.xpath('//a[starts-with(@href,"/institute/event/")]/text()')
for event_title in event_titles:
    print("Event Title:", event_title)
And the result will be:
Event Title: Papel Picado Workshop Series: Session 5
Event Title: Cacahuatl: The Origins and Global Impact of Chocolate
Event Title: “Institutionalizing Numbers in Post-Colonial Africa”
Event Title: The Daniel Pearl Memorial Lecture presents A Conversation with Leon Panetta, part of the Luskin Lecture Series
Event Title: Persian Women and Other Lies: Story-telling as Historical Retrieval
Event Title: UCLA EVENT: Making Micronesia
Event Title: Teach-In: Out of Nowhere? Some Questions, Answers, and Discussion about ISIS
Event Title: Impossible Testimonies: Literature and Aesthetics in the Aftermath of the Armenian Genocide
Event Title: “Casa Grande” Film Screening
Event Title: The Headscarf Debates: Conflicts of National Belonging
Event Title: Rethinking History in Chinese Central Asia
Event Title: Screening: "REBEL: Loreta Velazquez, Civil War Soldier and Spy"
Event Title: "How Terrorism is Designed to Work"
Event Title: Matthäus Rest Talk - Dreaming of Pipes: The politics of in/visibility around Nepal’s spectral infrastructures
Event Title: The Barber of Damascus: Nouveau Literacy in the Eighteenth-Century Levant
Event Title: Representation of "Apology": a Comparative Study on Narratives by Korean and Japanese Media
Event Title: "They Can Live in the Desert but Nowhere Else": A History of the Armenian Genocide
Event Title: Colloquium: Towards a contents-platform conglomerate?
Event Title: Picturing Political Abstractions in Song/Jin Painting
Event Title: ISIS and the Enslavement and Trafficking of Women: An Evening with Dr. Khaled Abou El Fadi
Event Title: Korean Culture Night
Event Title: Genocide and Global History: A Conference on the 100th Anniversary of the Armenian Genocide
Event Title: U.S.-China: Economic Ties, Growth Strategies and Investment Opportunities
Event Title: Human Rights and the Armenian Genocide
Event Title: Gerschenkron Redux? New Evidence on Shanghai's Pre-War Stock Exchange and Its Implications for the Chinese Economy at Present