
With Python I'm having issues turning web-scraped links into strings so I can save them to either a txt or csv file. I would prefer a txt file. This is what I have at the moment:

import requests
from bs4 import BeautifulSoup

url = "https://www.google.com/"
reqs = requests.get(url)
soup = BeautifulSoup(reqs.text, 'html.parser')

urls = []
for link in soup.find_all('a'):
    print(link.get('href'))
type(link)

print(link, file=open('example.txt','w'))

I've tried all sorts of things with no luck. I'm pretty much at a loss.

D B
  • What doesn't work? Where are you stuck? – 0stone0 Nov 21 '22 at 15:41
  • In Jupyter Notebook it outputs, but I cannot seem to get it to save as a text file (or csv) showing each output line. – D B Nov 21 '22 at 15:42
  • I can only get the last "link" to print to a text file. I'm stuck trying to get all the "links" to be listed in the text document. – D B Nov 21 '22 at 15:55
  • Show us your code regarding writing to the file, we can't see what you're doing wrong without the actual code – 0stone0 Nov 21 '22 at 15:57
  • Everything is at the top of the page. The last line printing to the example.txt only shows the last output line found when scraping the URL google.com. So the output is a text doc with Terms – D B Nov 21 '22 at 16:10

1 Answer

print(link, file=open('example.txt','w'))

This will write the link variable, but only the last one: the file is reopened in `'w'` (truncate) mode on each call, and the statement runs after the loop has finished, so only the final value of `link` remains.


To write them all, use:

import requests
from bs4 import BeautifulSoup

url = "https://www.google.com/"
reqs = requests.get(url)
soup = BeautifulSoup(reqs.text, 'html.parser')

with open("example.txt", "w") as file:
    for link in soup.find_all('a'):
        href = link.get('href')
        if href is not None:  # some <a> tags have no href attribute
            file.write(href + '\n')

This uses a context manager to open the file once, then writes each href on its own line. The `None` check matters because `link.get('href')` returns `None` for anchors without an `href` attribute, and `None + '\n'` would raise a `TypeError`.
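Since the question mentions CSV as an option too, here is a minimal sketch using the standard-library `csv` module. It assumes the same BeautifulSoup setup as above; a small inline HTML snippet stands in for the fetched page so the example is self-contained.

```python
# Sketch: writing scraped hrefs to a CSV file with a header row.
# The inline HTML below is a stand-in for reqs.text from requests.get(url).
import csv
from bs4 import BeautifulSoup

html = '<a href="https://a.example">A</a><a>no href</a><a href="/b">B</a>'
soup = BeautifulSoup(html, 'html.parser')

with open('example.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['href'])            # header row
    for link in soup.find_all('a'):
        href = link.get('href')
        if href is not None:             # skip anchors without an href
            writer.writerow([href])
```

Note `newline=''` when opening the file: the `csv` module handles its own line endings, and omitting it can produce blank rows on Windows.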

0stone0