I've written a script using scrapy's Selector to fetch the answers to different questions from a webpage. The problem is that the answers sit outside the elements I'm currently targeting. I know I could grab them with .next_sibling if I were using BeautifulSoup, but I can't find a way to do the same with scrapy.
The HTML elements look like this:
<p>
<b>
<span class="blue">
Q:1-The NIST Information Security and Privacy Advisory Board (ISPAB) paper "Perspectives on Cloud Computing and Standards" specifies potential advantages and disdvantages of virtualization. Which of the following disadvantages does it include?
</span>
<br/>
Mark one answer:
</b>
<br/>
<input name="quest1" type="checkbox" value="1"/>
It initiates the risk that malicious software is targeting the VM environment.
<br/>
<input name="quest1" type="checkbox" value="2"/>
It increases overall security risk shared resources.
<br/>
<input name="quest1" type="checkbox" value="3"/>
It creates the possibility that remote attestation may not work.
<br/>
<input name="quest1" type="checkbox" value="4"/>
All of the above
</p>
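For comparison, this is roughly the BeautifulSoup approach I have in mind. It is only a minimal sketch: it assumes bs4 is installed, and the `html` string is a trimmed copy of the markup above (two of the four options) rather than the live page.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup above; an assumption for illustration only.
html = """
<p>
<input name="quest1" type="checkbox" value="1"/>
It initiates the risk that malicious software is targeting the VM environment.
<br/>
<input name="quest1" type="checkbox" value="4"/>
All of the above
</p>
"""

soup = BeautifulSoup(html, "html.parser")

# The answer is not inside <input>; it is the text node that follows it.
answers = [
    str(item.next_sibling).strip()
    for item in soup.select("input[name^='quest']")
]

for answer in answers:
    print(answer)
```

This is the behaviour I'd like to reproduce with scrapy's selectors.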
This is what I've tried so far:
import requests
from scrapy import Selector

url = "https://www.test-questions.com/csslp-exam-questions-01.php"
res = requests.get(url, headers={"User-Agent": "Mozilla/5.0"})
sel = Selector(res)

# Targets the <input> elements, but the answer text is not inside them
for item in sel.css("[name^='quest']::text").getall():
    print(item)
The above script prints nothing when executed, and it throws no error either.
One of the expected outputs from the HTML pasted above is:
It initiates the risk that malicious software is targeting the VM environment.
I'm only after a CSS selector solution.
How can I grab the answers to the different questions on that site?