I'd like to use XQuery (I believe) to output the text from the title attribute of an HTML element.

Example:

<div class="rating" title="1.0 stars">...</div>

I can use XPath to select the element, but it returns the content between the div tags. I think I need XQuery to output the "1.0 stars" text from the title attribute.

There's gotta be a way to do this. My Google skills are proving ineffective in coming up with an answer.

Thanks.

Nick J.

2 Answers

2

XPath: //div[@class='rating']/@title

This will give you the title text for every div with a class of "rating".
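For anyone who wants to try the same selection outside a spreadsheet, here is a minimal Python sketch using only the standard library. It assumes well-formed markup; the sample string and the `rating_titles` helper are made up for illustration. `xml.etree.ElementTree` supports only a subset of XPath and cannot end a path in `/@title`, so the sketch selects the elements and then reads the attribute from each one.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modeled on the question's example. The outer <p> is only
# there so the fragment has a single root element.
SAMPLE = """<p>
  <div class="rating" title="2.0 stars">...</div>
  <div class="rating" title="1.0 stars">...</div>
  <div class="other" title="ignored">...</div>
</p>"""


def rating_titles(markup):
    """Return the title attribute of every div whose class is exactly 'rating'."""
    root = ET.fromstring(markup)
    # Rough equivalent of //div[@class='rating']/@title: select the elements
    # with the supported XPath subset, then read the attribute in Python.
    return [div.get("title") for div in root.findall(".//div[@class='rating']")]


print(rating_titles(SAMPLE))  # ['2.0 stars', '1.0 stars']
```

A full XPath engine would return the attribute nodes directly from the one-line expression above.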

Addendum (following from comments below):

If the class attribute contains other tokens besides "rating", you can match on the token with something like this (append /@title to select the attribute itself):

//div[contains(concat(' ', normalize-space(@class), ' '), ' rating ')]

(Hat tip to "How can I match on an attribute that contains a certain string?".)
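The concat/normalize-space idiom is really just a whitespace-delimited token test on the class attribute. A rough Python sketch of the same logic, assuming well-formed markup: the sample titles and the `titles_for_class_token` helper are hypothetical, and only the `ratings goog-inline-block` class value comes from the comments on this answer.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the title values are invented for illustration.
SAMPLE = """<p>
  <div class="ratings goog-inline-block" title="4.0 stars">...</div>
  <div class="ratings" title="1.0 stars">...</div>
  <div class="ratings-summary" title="not a match">...</div>
</p>"""


def titles_for_class_token(markup, token):
    """Return titles of divs whose class attribute contains `token` as a whole word."""
    root = ET.fromstring(markup)
    return [
        div.get("title")
        for div in root.iter("div")
        # str.split() with no arguments splits on any run of whitespace, so it
        # plays the role of normalize-space() plus the padded ' token ' test.
        if token in div.get("class", "").split() and div.get("title") is not None
    ]


print(titles_for_class_token(SAMPLE, "ratings"))  # ['4.0 stars', '1.0 stars']
```

Matching whole tokens is what keeps `ratings-summary` out of the result; a plain substring test such as `contains(@class, 'ratings')` would let it through.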

jwismar
  • This seems to work on certain sites. But not on the site I need to pull data from. For instance, if you go to this site: https://play.google.com/store/apps/details?id=com.mfoundry.mb.android.mb_642 and then try to pull the title text from the divs with class 'ratings', it doesn't work. Any ideas? – Nick J. Jun 10 '13 at 17:14
  • That doesn't appear to be an XML page, as far as I can tell. Its DOCTYPE is listed as HTML, and it doesn't appear to specify that it's XHTML. You may not be able to use an XML parser to query this. – jwismar Jun 10 '13 at 21:28
  • The strange thing is I can pull other data from this page no problem. For instance, the date for each review works with this xpath: – Nick J. Jun 10 '13 at 21:36
  • Are you trying to find, for example, this div? Note that the class in this case is `"ratings goog-inline-block"`, not `"ratings"`. See, e.g., http://stackoverflow.com/questions/1390568/xpath-how-to-match-attributes-that-contain-a-certain-string – jwismar Jun 10 '13 at 21:41
  • The strange thing is I can pull other data from this page no problem. For instance, the date for each review works with this xpath:`//span[@class='doc-review-date']` and the comments from each review: `//div[@class='doc-user-reviews-list']//p[@class='review-text']`. But when I try to pull the title attr with the rating for the review, nothing is pulled. If only Google built this page with 'content first' method that's all the rage these days. – Nick J. Jun 10 '13 at 21:42
  • This is the xpath I just tried to use: `//div[@class='ratings goog-inline-block']/@title` – Nick J. Jun 10 '13 at 21:45
  • If it helps any, I'm trying to do this in a Google Drive spreadsheet using the importxml function. All my other xpath imports work fine. – Nick J. Jun 10 '13 at 21:54
-1

You should use:

let $XML := <p><div class="rating" title="2.0 stars">sdfd</div><div class="rating" title="1.0 stars">sdfd</div></p>
for $title in $XML//@title
return
  <p>{data($title)}</p>

to get output:

<p>2.0 stars</p>
<p>1.0 stars</p>
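If no XQuery processor is at hand, the same loop can be approximated in Python with the standard library. This is only a sketch of the idea, not a replacement for the query above; the `wrap_titles` helper is a hypothetical name, and the input is the same two-div sample.

```python
import xml.etree.ElementTree as ET

# Same sample document as in the XQuery above.
XML = (
    '<p><div class="rating" title="2.0 stars">sdfd</div>'
    '<div class="rating" title="1.0 stars">sdfd</div></p>'
)


def wrap_titles(markup):
    """Wrap every title attribute value below the root in a <p> element."""
    root = ET.fromstring(markup)
    lines = []
    # .//*[@title] selects every descendant element that has a title attribute,
    # standing in for the XQuery path $XML//@title.
    for element in root.findall(".//*[@title]"):
        p = ET.Element("p")
        p.text = element.get("title")
        lines.append(ET.tostring(p, encoding="unicode"))
    return lines


print("\n".join(wrap_titles(XML)))
```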
Navin Rawat