You can use Splash as a last resort, but it will make your spider more expensive to run and more complex.
Luckily, in your case you can get the required data from one of the <script> tags.
First, you need to select the correct <script> tag:
ans = response.xpath("//script[contains(text(),'telephone')]/text()").extract_first()
It gives you a JSON string like this:
{
  "@context": "http://schema.org",
  "@type": "Person",
  "name": "Cynthia Hóss Rocha",
  "description": "advogada há 15 anos.",
  "telephone": "(11) 985282712",
  "image": "imgs.jusbr.com/profiles/5368773/images/1419878998_standard.jpg",
  "jobTitle": "Advogado",
  "url": "https://cynthiahossrocha.jusbrasil.com.br",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "São Paulo (SP)",
    "streetAddress": "Rua Marconi, 131",
    "postalCode": "01047-000"
  }
}
To convert it into a Python object (a dict), import json and use json.loads:
import json
json_ans = json.loads(ans)
Finally, extract the required value:
phone = json_ans["telephone"]
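
The steps above assume the <script> tag is always present and always contains valid JSON with a "telephone" key. On pages where that is not true, extract_first() returns None and json.loads or the key lookup will raise. A minimal sketch of a more defensive version follows; the helper name extract_phone is hypothetical (not part of Scrapy), and the sample string is a shortened copy of the JSON shown above:

```python
import json


def extract_phone(script_text):
    """Return the "telephone" value from a JSON string, or None if unavailable.

    Hypothetical helper for illustration; not part of Scrapy.
    """
    if not script_text:
        # extract_first() returns None when no <script> tag matched
        return None
    try:
        data = json.loads(script_text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError
        return None
    if not isinstance(data, dict):
        # Valid JSON, but not the object we expected (e.g. a list)
        return None
    # .get() returns None instead of raising KeyError when the key is missing
    return data.get("telephone")


# Shortened version of the JSON shown above, used only as sample input
sample = '{"@type": "Person", "telephone": "(11) 985282712"}'
print(extract_phone(sample))
```

Inside a spider callback, this could be called as phone = extract_phone(ans), where ans is the result of the response.xpath(...).extract_first() call from the first step.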