I am in essentially the same situation as the question "Getting attribute using XPath". The solution from that question works for me in Chrome and Firefox, but I have to use Safari (for a couple of different reasons, mainly AppleScript), and using /@ in Safari returns an empty result.

Are there alternative ways to extract attribute values with XPath?

Any alternative must be compatible with document.evaluate. (I have a solution that works with $x, appending .map(link => link.href) to the $x(...) call, but it doesn't work with document.evaluate.)


Update: it seems I am partially wrong. /@ does indeed work in every case I have tried except my own.

More details to reproduce my problem: on https://www.facebook.com/zuck/followers I want to extract the links to all of the followers. The element I target looks like this:

<a class="x1i10hfl xjbqb8w x6umtig x1b1mbwd xaqea5y xav7gou x9f619 x1ypdohk xt0psk2 xe8uvvx xdj266r x11i5rnm xat24cr x1mh8g0r xexx8yu x4uap5 x18d9i69 xkhd6sd x16tdsg8 x1hl2dhg xggy1nq x1a2a7pz x1heor9g xt0b8zv" href="https://www.facebook.com/profile.php?id=100065103331942" role="link" tabindex="0">

A simple xpath (I will optimise it later) that points to this element:

//div/div[1]/div/div[3]/div/div/div/div[1]/div[1]/div/div/div[4]/div/div/div/div/div/div/div/div/div[3]/div/div[2]/div[1]/a

As you can see, the a element has four attributes: class, href, role and tabindex.

If I add /@class at the end of my xpath it evaluates to

class="x1i10hfl xjbqb8w x6umtig x1b1mbwd xaqea5y xav7gou x9f619 x1ypdohk xt0psk2 xe8uvvx xdj266r x11i5rnm xat24cr x1mh8g0r xexx8yu x4uap5 x18d9i69 xkhd6sd x16tdsg8 x1hl2dhg xggy1nq x1a2a7pz x1heor9g xt0b8zv"

just as expected. Adding role or tabindex at the end also gives the expected result, but if I try href instead of the other three attributes, I just get href (the literal string) as the result. Actually, it is even weirder than that and hard to describe in text alone:

[Screenshot: collapsed result]

[Screenshot: expanded result]

I interpret this as Safari treating href differently from the other attributes. Can this really be true? Are there any workarounds?
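One thing that may be worth checking here, as a sketch rather than a verified Safari fix: an XPath ending in /@href selects attribute nodes, not strings, and each attribute node exposes its text through the standard name and value properties. The helper name attributeValues below is hypothetical, and the document is passed in as doc so the function can also be tried outside a browser. The numeric constant 7 is the value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.

```javascript
// Hypothetical helper: evaluate an XPath that selects attribute nodes
// (for example one ending in /@href) and return their string values.
function attributeValues(doc, xpath) {
  // 7 is the numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
  const ORDERED_NODE_SNAPSHOT_TYPE = 7;
  const snapshot = doc.evaluate(xpath, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const values = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    // Each item is an attribute node; .value holds the attribute's text.
    values.push(snapshot.snapshotItem(i).value);
  }
  return values;
}
```

In the browser this would be called as attributeValues(document, '<the xpath above>/@href'). Whether it sidesteps the behaviour in the screenshots is not something I have confirmed.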

d-b

1 Answer

Select the link elements with document.evaluate, then extract the href DOM properties as shown below:

    var links = [];
    var xpathResult = document.evaluate('//div/div[1]/div/div[3]/div/div/div/div[1]/div[1]/div/div/div[4]/div/div/div/div/div/div/div/div/div[3]/div/div[2]/div[1]/a', document, null);
    var link = null;
    while ((link = xpathResult.iterateNext()) != null) {
      links.push(link.href);
    }
    links

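If the goal is to keep the .map(link => link.href) style that works with $x, one option is to copy the XPath result into a real array first. This is only a sketch: the helper name xpathToArray is hypothetical, the document is passed in as doc, and it requests a snapshot result (numeric type 7, the value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE) instead of the iterator used above.

```javascript
// Hypothetical helper: run an XPath and return the matched nodes as a
// plain array, so array methods such as .map can be used on the result.
function xpathToArray(doc, xpath) {
  // 7 is the numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
  const snapshot = doc.evaluate(xpath, doc, null, 7, null);
  return Array.from(
    { length: snapshot.snapshotLength },
    (_, i) => snapshot.snapshotItem(i)
  );
}

// Browser usage (not executed here):
//   xpathToArray(document, '//a').map(link => link.href)
```

A snapshot is also unaffected by later changes to the page, whereas an iterator result becomes invalid if the document is modified while it is being walked.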
Martin Honnen
  • Any suggestion as to why this workaround is necessary? – Michael Kay Apr 02 '23 at 14:13
  • @MichaelKay, I haven't been able to tell from d-b's questions why sometimes `$x` is used and later it has to be `document.evaluate`; I only know that `$x` returns an array so you can use e.g. `.map` on it while it doesn't work with `document.evaluate`. And I am afraid those screenshots of browser console output don't allow me to tell whether Safari has some quirk accessing the `href` or any attribute using `$x` and/or `document.evaluate`. It appears something is odd or quirky in that context but I have not tried to research or even verify what it is exactly. – Martin Honnen Apr 02 '23 at 14:39
  • @MartinHonnen The background is that I am controlling Safari using AppleScript, which requires document.evaluate. Yes, it does indeed seem to be a quirk with Safari and href. Another question: what does the single `links` at the end of your expression do? Just curious, I suck at JS... – d-b Apr 02 '23 at 15:40
  • @d-b, I assumed that the evaluation of JavaScript in Safari needs some result returned by the final statement. The trailing `links` is just my attempt to provide that: the array is declared and initialized at the beginning with `var links = [];`, filled by `links.push(link.href)` in the while loop, and then returned as the final result of the JavaScript evaluation. I would think that ending the sequence of statements with the while loop alone would not show you any result, unless that AppleScript evaluation exposes JavaScript variables like `links` automatically. – Martin Honnen Apr 02 '23 at 15:50
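
To illustrate the point about the final expression, here is a minimal sketch that wraps the loop in a function, so that a single call can serve as the last expression of the script. The function name hrefsAsText is hypothetical, the document is passed in as doc, and returning one newline-separated string instead of an array is purely an assumption about what is convenient on the AppleScript side. I have not verified how AppleScript converts JavaScript arrays.

```javascript
// Hypothetical helper: collect the href of every node matched by the
// XPath and return them as one newline-separated string.
function hrefsAsText(doc, xpath) {
  // 5 is the numeric value of XPathResult.ORDERED_NODE_ITERATOR_TYPE.
  const result = doc.evaluate(xpath, doc, null, 5, null);
  const hrefs = [];
  let node;
  // iterateNext() returns null once all matches have been visited.
  while ((node = result.iterateNext()) !== null) {
    hrefs.push(node.href);
  }
  return hrefs.join('\n');
}

// In the script handed to Safari, the last line would then be the call
// itself, e.g.: hrefsAsText(document, '//a')
```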