I'm building a library on top of requests, using something along the lines of the following:
import requests
from lxml.html import fromstring
URL = "https://test"
COOKIES = {"test": "AAAAAAAAAAAAA"}
HEADERS = {
    "Connection": "close",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.89 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "en-US,en;q=0.9",
}
response = requests.get(URL, headers=HEADERS, cookies=COOKIES)
source = fromstring(response.content)
table = source.xpath("")  # placeholder: this is the query I don't know how to write
The response contains a lot of content and I'm trying to isolate the items in a table. The relevant part of the response is:
<table border="0" cellpadding="0" cellspacing="0" width="100%" class="dialogHdrTbl" summary="Layout table"><thead><tr align="left"><th class="groupHdr"><div class="groupHdr">View Client List</div></th></tr></thead><tbody><tr><td height="1"></td></tr></tbody></table><table width="100%" cellpadding="0" cellspacing="0" border="0" summary="Data table" class="dialogTbl"><tbody><tr class="altRwFlse"><td height="25" headers="hdr1" class="c1">TEST CLIENT 0</td><td height="25" headers="hdr2"><a class="dialogLnk" href="javascript:opener.document.contactForm.company.value="TEST CLIENT 1";self.close();" target="">Select</a></td></tr><tr class="altRwTre"><td height="25" headers="hdr1" class="c1">TEST CLIENT 2</td>
I'm trying to output:
TEST CLIENT 0 TEST CLIENT 1 TEST CLIENT 2
I've looked at using XPath for this (based on this posting: How to parse text from a html table element), but I don't quite understand how to form my XPath query. What am I missing here?
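For illustration, here is a minimal sketch of one possible query. It assumes the client names always sit in td cells with class "c1" inside the table with class "dialogTbl", which is a guess based only on the fragment above. SAMPLE is a trimmed, hypothetical stand-in for the real response, and the names variable is likewise only illustrative.

```python
from lxml.html import fromstring

# Trimmed, hypothetical copy of the response fragment. Assumption: client
# names always live in <td class="c1"> cells of the "dialogTbl" table.
SAMPLE = """
<table class="dialogHdrTbl"><thead><tr><th class="groupHdr">View Client List</th></tr></thead></table>
<table class="dialogTbl"><tbody>
<tr class="altRwFlse"><td headers="hdr1" class="c1">TEST CLIENT 0</td><td headers="hdr2"><a class="dialogLnk" href="#">Select</a></td></tr>
<tr class="altRwTre"><td headers="hdr1" class="c1">TEST CLIENT 2</td></tr>
</tbody></table>
"""

source = fromstring(SAMPLE)

# Select the text nodes of every c1 cell inside the data table only,
# skipping the header table that precedes it.
names = source.xpath('//table[@class="dialogTbl"]//td[@class="c1"]/text()')
names = [name.strip() for name in names]

print(" ".join(names))
```

With the real response, fromstring(response.content) would take the place of fromstring(SAMPLE). In the fragment shown, "TEST CLIENT 1" appears only inside the href of the Select link, so a query on the td text alone would not return it.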