How can we extract a value using an XPath or CSS selector if the attribute changes dynamically? For example:

<p data-reactid=".2e46q6vkxnc.1.$0">
    <b data-reactid=".2e46q6vkxnc.1.$0.0">Mark Obtain</b>
    <i class="avu-full-width" data-reactid=".2e46q6vkxnc.1.$0.1">
        <span data-reactid=".2e46q6vkxnc.1.$0.1.0"> </span>
        <span data-reactid=".2e46q6vkxnc.1.$0.1.1">450 A+.</span>
    </i>
</p>

<p data-reactid=".2e46q6vkxnc.1.$1">
    <b data-reactid=".2e46q6vkxnc.1.$1.0">Student Name</b>
    <i class="avu-full-width" data-reactid=".2e46q6vkxnc.1.$1.1">
        <span data-reactid=".2e46q6vkxnc.1.$0.1.0"> </span>
        <span data-reactid=".2e46q6vkxnc.1.$0.1.1">First Name</span>
    </i>
</p>

In this case the element's attribute changes dynamically, but "Mark Obtain" and "Student Name" will always be the same. Is there a way to write a condition or a regex within the XPath expression to get the "450 A+." and "First Name" values? Please help.

Pradeep Mishra

1 Answer

To get the required values you can use the XPath expressions below:

//p[b="Mark Obtain"]//span[2]/text()

to get "450 A+."

and

//p[b="Student Name"]//span[2]/text()

to get "First Name"
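
To try the same idea outside a spider, here is a minimal sketch using only the standard library. It assumes the markup is well-formed enough to parse as XML and wraps it in a hypothetical `<div>` root; `value_for_label` is an illustrative helper, not part of any library. `xml.etree.ElementTree` supports only a subset of XPath (no `text()`), so the sketch selects the second `<span>` and reads its `.text` instead.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in a <div> root (an assumption
# made so that the snippet parses as a single XML document).
HTML = """<div>
<p data-reactid=".2e46q6vkxnc.1.$0">
    <b data-reactid=".2e46q6vkxnc.1.$0.0">Mark Obtain</b>
    <i class="avu-full-width" data-reactid=".2e46q6vkxnc.1.$0.1">
        <span data-reactid=".2e46q6vkxnc.1.$0.1.0"> </span>
        <span data-reactid=".2e46q6vkxnc.1.$0.1.1">450 A+.</span>
    </i>
</p>
<p data-reactid=".2e46q6vkxnc.1.$1">
    <b data-reactid=".2e46q6vkxnc.1.$1.0">Student Name</b>
    <i class="avu-full-width" data-reactid=".2e46q6vkxnc.1.$1.1">
        <span data-reactid=".2e46q6vkxnc.1.$0.1.0"> </span>
        <span data-reactid=".2e46q6vkxnc.1.$0.1.1">First Name</span>
    </i>
</p>
</div>"""


def value_for_label(root, label):
    """Return the text of the second <span> in the <p> whose <b> equals label."""
    # Select by the stable label text, not by the changing data-reactid.
    span = root.find(".//p[b='%s']//span[2]" % label)
    return span.text if span is not None else None


root = ET.fromstring(HTML)
mark = value_for_label(root, "Mark Obtain")
name = value_for_label(root, "Student Name")
print(mark)
print(name)
```

The selector keys on the `<b>` label, which the question says stays constant, so the changing `data-reactid` values never enter the expression.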

Andersson
  • Thanks a lot :) How can we store the value of this tag, which is "हिन्दी"? I tried to extract it but got the Unicode escapes \u0939\u093f\u0928\u094d\u0926\u0940. Please help – Pradeep Mishra Aug 16 '17 at 19:51
  • Did you try adding the `u` prefix, as in `print(u"\u0939\u093f\u0928\u094d\u0926\u0940")`? – Andersson Aug 16 '17 at 19:58
  • Great! Thanks a ton :-) – Pradeep Mishra Aug 16 '17 at 20:04
  • Actually, the result of the expression `item['Languages'] = response.xpath('//p[span[@data-automation-id="meta-info-languages"]]/b/text()').extract()` was \u0939\u093f\u0928\u094d\u0926\u0940, so where should the `u` prefix go in this expression? Please help – Pradeep Mishra Aug 16 '17 at 20:12
  • Before the string: u"string" – Andersson Aug 17 '17 at 04:36
  • I tried `string = response.xpath('//p[span[@data-automation-id="meta-info-languages"]]/b/text()').extract()` followed by `u"string"`, but then I get u'string' – Pradeep Mishra Aug 17 '17 at 06:53
  • I tried, but got \xe0\xa4\xb9\xe0\xa4\xbf\xe0\xa4\xa8\xe0\xa5\x8d\xe0\xa4\xa6\xe0\xa5\x80, this type of encoding. – Pradeep Mishra Aug 17 '17 at 07:20
  • Wait. When you just `print(string)`, do you get the desired output? It seems that you don't need decode/encode... – Andersson Aug 17 '17 at 07:27
  • No, when I just use `print(string)` I get \u0939\u093f\u0928\u094d\u0926\u0940, which is not what I need. The desired output is हिन्दी. In the IDE, when I use u"\u0939\u093f\u0928\u094d\u0926\u0940" directly I get the correct output, but I do not get the desired output when I store the result in a variable and then use u"variable" – Pradeep Mishra Aug 17 '17 at 08:01
  • Hm... weird. Check [this ticket](https://stackoverflow.com/questions/20203265/python-string-encoding-for-a-variable). I guess it should help you better – Andersson Aug 17 '17 at 08:22
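
One possible explanation for the output in these comments, offered as an assumption since the thread never confirms it: `.extract()` returns a list, and on Python 2 printing a list shows the escaped `repr` of each element, not the characters themselves. The sketch below uses Python 3, with `ascii()` standing in for that Python 2 list display; `extracted` is a hypothetical stand-in for the value returned by `.extract()`.

```python
# Hypothetical stand-in for the list that response.xpath(...).extract() returns.
extracted = [u"\u0939\u093f\u0928\u094d\u0926\u0940"]

# ascii() escapes non-ASCII characters, much as printing a list of unicode
# strings does on Python 2. The data is intact; only the display is escaped.
escaped_view = ascii(extracted)

# Taking the element out of the list yields the string itself.
first = extracted[0]

# Encoding the string to UTF-8 gives the \xe0\xa4... bytes seen in the comments.
utf8_bytes = first.encode("utf-8")

print(escaped_view)
print(utf8_bytes)
```

If that assumption holds, indexing the list (or using Scrapy's `extract_first()`) before printing or storing the value should yield the readable string, and no extra `u"..."` wrapping or manual encode/decode is needed.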