I am trying to download the dot-bracket notations of many RNA sequences from URLs with Python.

This is one of the links I am using: https://rnacentral.org/rna/URS00003F07BD/9606. To reach what I want, you have to click the '2D structure' button, and only then does the element I am looking for (right below this tag)

<h4>Dot-bracket notation</h4> 

appear in the Inspect Element tab.

When I use the get function from the requests package, neither the text nor the content field of the response contains that tag. Does anyone know how I can retrieve the dot-bracket notation?

Here is my current code:

import requests
url = 'http://rnacentral.org/rna/URS00003F07BD/9606'
response = requests.get(url)
print(response.text)
Alan

1 Answer

The requests library does not execute JavaScript, so content that the page loads only after a click never appears in response.text. You need a browser-based solution such as Selenium. The steps are outlined below.

  1. Use Selenium to load the page.
  2. Click the '2D structure' button with Selenium.
  3. Wait for the content to load, for example with time.sleep().
  4. Read the page source with Selenium.

The page source should then contain the dot-bracket notation.
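
The steps above can be sketched roughly as follows. This is a minimal sketch, not a tested scraper for this site: the XPath used to find the '2D structure' button, the helper names extract_dot_bracket and fetch_dot_bracket, and the assumption that the notation appears as a whitespace-delimited run of dots and brackets after the heading are all guesses that should be checked against the real page. It also swaps the fixed time.sleep() for Selenium's explicit waits, which return as soon as the element shows up.

```python
import re

HEADING = "Dot-bracket notation"  # heading text quoted in the question


def extract_dot_bracket(page_source):
    """Return the first dot-bracket string found after the heading, or None."""
    # Keep only the markup that follows the <h4> heading.
    match = re.search(
        r"<h4[^>]*>\s*" + re.escape(HEADING) + r"\s*</h4>(.*)",
        page_source,
        flags=re.DOTALL,
    )
    if match is None:
        return None
    # Replace tags with spaces so only the visible text remains.
    text = re.sub(r"<[^>]+>", " ", match.group(1))
    # Assumption: the notation is a standalone run of . ( ) [ ] { } characters.
    found = re.search(r"(?<!\S)[.()\[\]{}]{2,}(?!\S)", text)
    return found.group(0) if found else None


def fetch_dot_bracket(url, timeout=20):
    """Load the page in Chrome, open the 2D structure view, return the notation."""
    # Imported here so the parsing helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # requires a local Chrome installation
    try:
        driver.get(url)  # step 1: load the page
        wait = WebDriverWait(driver, timeout)
        # Step 2. Assumed locator: a link or button whose text contains "2D structure".
        button = wait.until(EC.element_to_be_clickable((
            By.XPATH,
            "//*[self::a or self::button]"
            "[contains(normalize-space(.), '2D structure')]",
        )))
        button.click()
        # Step 3: wait until the heading is present instead of sleeping.
        wait.until(EC.presence_of_element_located((
            By.XPATH,
            "//h4[contains(normalize-space(.), 'Dot-bracket notation')]",
        )))
        # Step 4: read the rendered page source and parse it.
        return extract_dot_bracket(driver.page_source)
    finally:
        driver.quit()


# Example call (opens a real browser window):
# print(fetch_dot_bracket("https://rnacentral.org/rna/URS00003F07BD/9606"))
```

Keeping the parsing in its own function means it can be checked against a saved copy of the page source without launching a browser. If the notation is filled in some time after the heading appears, an additional wait on the element that holds it may be needed.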

Saurav Panda