
How do I get the following session key with BeautifulSoup?

<a href=" https://website.com/login/logout.php?sesskey=Q3bAQgiGA2" class="dropdown-item menu-action" role="menuitem" data-title="logout" aria-labelledby="actionmenuaction-6">

The output should be: Q3bAQgiGA2

I tried the following: `sesskey = soup.find('a', attrs={'href':'sesskey'}).get('sesskey=')`

randomguy

1 Answer


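Use a regex, or simply split the href on "sesskey=":

from bs4 import BeautifulSoup

html = '<a href="https://website.com/login/logout.php?sesskey=Q3bAQgiGA2">Log out</a>'
soup = BeautifulSoup(html, 'html.parser')
session_key = soup.find('a').get('href')  # in your case, find() would do: it returns the first <a> tag
print(session_key.split("sesskey=")[1])  # prints: Q3bAQgiGA2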
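The regex route mentioned above might look like the following minimal sketch. The HTML string is taken from the question; the helper name `extract_sesskey` and the pattern `sesskey=([^&]+)` (everything up to the next `&`) are assumptions for illustration, not something the question confirms about the real page.

```python
import re
from bs4 import BeautifulSoup

# HTML from the question (the leading space inside href is in the original).
html = ('<a href=" https://website.com/login/logout.php?sesskey=Q3bAQgiGA2" '
        'class="dropdown-item menu-action" role="menuitem">Log out</a>')

def extract_sesskey(markup):
    """Return the sesskey value from the first matching link, or None."""
    soup = BeautifulSoup(markup, "html.parser")
    # Only consider anchors whose href contains "sesskey=".
    link = soup.find("a", href=re.compile(r"sesskey="))
    if link is None:
        return None
    # Capture everything after "sesskey=" up to the next "&" or the end.
    match = re.search(r"sesskey=([^&]+)", link["href"])
    return match.group(1) if match else None

print(extract_sesskey(html))  # Q3bAQgiGA2
```

Filtering on the href inside `find()` means the first unrelated `<a>` on the page is skipped, which plain `soup.find('a')` would not do.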

Edit:

find() returns the first matching element in the document, or None if nothing matches. The return type is <class 'bs4.element.Tag'>.

find_all() scans the entire document and returns every match. The return type is <class 'bs4.element.ResultSet'>, which behaves like a list.

So if you use find_all(), you have to treat the result as a list: index into it or loop over it, rather than calling .get() on it directly.
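A small sketch of the difference between the two calls. The two-link HTML snippet is made up for illustration (the profile URL is hypothetical); only the logout link comes from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment with two links.
html = """
<a href="https://website.com/login/logout.php?sesskey=Q3bAQgiGA2">Log out</a>
<a href="https://website.com/user/profile.php">Profile</a>
"""
soup = BeautifulSoup(html, "html.parser")

first = soup.find("a")          # a single Tag (or None if nothing matches)
all_links = soup.find_all("a")  # a ResultSet, which is a list subclass

print(type(first).__name__)      # Tag
print(type(all_links).__name__)  # ResultSet

# A Tag has .get(); a ResultSet does not, so loop over it instead.
print(first.get("href"))
hrefs = [a.get("href") for a in all_links]
print(hrefs)
```

Calling `all_links.get("href")` here would raise the same AttributeError quoted in the comments below.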

  • Hmm, I am getting an error: `AttributeError: ResultSet object has no attribute 'get'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?` – randomguy Mar 08 '21 at 16:18
  • Did I do something wrong? (URL edited) `rs = session.get('https://website.com/user/edit.php?id=4250&course=1')` `soup = BeautifulSoup(rs.text, 'lxml')` `session_key = soup.find_all('a').get('href')` `print(session_key.split("sesskey=")[1])` – randomguy Mar 08 '21 at 16:20
  • If you have only a single href in your parsed file, find will do. find_all is needed when you are parsing multiple URLs and want the session IDs of all of them. You will have to run a loop for that. – FugitiveMemories Mar 08 '21 at 16:39
  • Read more [here](https://linuxhint.com/python-beautifulsoup-tutorial-for-beginners/#:~:text=The%20find%20method%20searches%20for,a%20list%20of%20type%20bs4.) – FugitiveMemories Mar 08 '21 at 16:43
  • For multiple urls, it would go like this: `for sess in soup.find_all('a'): print(sess.get('href').split("sesskey=")[1])` – FugitiveMemories Mar 08 '21 at 16:53
  • Thank you, but it still doesn't work. Now I get this error: `IndexError: list index out of range` – randomguy Mar 08 '21 at 17:04
  • Can you share what you are trying to do? – FugitiveMemories Mar 08 '21 at 17:14
  • Please note that your question had a `sesskey` variable and your comment has an `id` variable, so you will have to parse accordingly. You are probably using sesskey instead of id, which is causing the list index error. – FugitiveMemories Mar 08 '21 at 17:21
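One likely cause of the `IndexError` in the loop above: when an href does not contain "sesskey=", `split("sesskey=")` returns a one-element list, so `[1]` is out of range. The sketch below skips such links by parsing the query string with the standard library's `urllib.parse` instead of splitting. The sample HTML and the helper name `find_sesskeys` are assumptions for illustration; the real page may differ.

```python
from urllib.parse import urlparse, parse_qs
from bs4 import BeautifulSoup

def find_sesskeys(markup):
    """Collect the sesskey value from every link that carries one."""
    soup = BeautifulSoup(markup, "html.parser")
    keys = []
    # href=True skips anchors that have no href attribute at all.
    for a in soup.find_all("a", href=True):
        query = parse_qs(urlparse(a["href"].strip()).query)
        if "sesskey" in query:  # ignore links without the parameter
            keys.append(query["sesskey"][0])
    return keys

# Hypothetical page: one link without sesskey, one with, one without href.
sample = """
<a href="https://website.com/user/edit.php?id=4250&amp;course=1">Edit profile</a>
<a href=" https://website.com/login/logout.php?sesskey=Q3bAQgiGA2">Log out</a>
<a name="top">No href here</a>
"""

print(find_sesskeys(sample))  # ['Q3bAQgiGA2']
```

If only one key is expected, taking the first element of the returned list (after checking that it is non-empty) gives the same result as the single-link case.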