1

I am trying to save the results of a BeautifulSoup loop that extracts text from a Wikipedia URL to a text file. I have not been able to create the text file and add text to it while iterating over the paragraphs. Printing to the screen works fine; I would like to send the same output to a text file instead. I hope you can guide me here.

import requests
import string
from bs4 import BeautifulSoup

url_to_text = "https://en.wikipedia.org/wiki/Santiago"

url_open = requests.get(url_to_text)
soup = BeautifulSoup(url_open.content,'html.parser')

for i in range(1,50):
    doc_text = print((soup('p')[i].text))
Adarsh Wase
  • Does this answer your question? [Python Save to file](https://stackoverflow.com/questions/9536714/python-save-to-file) – drum Aug 09 '21 at 22:29

2 Answers

0

How to write a file:

with open('text.txt', 'w') as file:
    file.write('text')

You can read this question for more information on how to save a file in Python.
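
As a minimal sketch of the same pattern used inside a loop, the following writes several strings to a file one at a time and then reads them back. The `save_lines` helper, the sample sentences, and the temporary path are illustrative assumptions, not part of the original answer.

```python
import os
import tempfile


def save_lines(path, lines):
    """Write each string in `lines` to `path`, one per line."""
    # Open the file once, before the loop. Opening it with "w" inside
    # the loop would truncate it on every iteration.
    with open(path, "w", encoding="utf-8") as out_file:
        for line in lines:
            out_file.write(line + "\n")


# Hypothetical stand-ins for the parsed paragraphs.
sample = ["First sentence.", "Second sentence.", "Third sentence."]
target = os.path.join(tempfile.mkdtemp(), "text.txt")
save_lines(target, sample)

# Read the file back to confirm what was written.
with open(target, encoding="utf-8") as in_file:
    saved = in_file.read()
print(saved)
```

Opening the file outside the loop is the key design choice here: every `write` call inside the `with` block adds to the same open file.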

Implementation:

from requests import get
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    get("https://en.wikipedia.org/wiki/Santiago").content, "html.parser"
)

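# Mode "w" creates the file, or truncates it if it already exists.
with open(file="text.txt", mode="w", encoding="utf-8") as file:
    # Slice once instead of calling soup("p") on every iteration; slicing
    # also avoids an IndexError on pages with fewer than 50 paragraphs.
    for paragraph in soup("p")[1:50]:
        file.write(paragraph.text)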

I would like to add that the file does not need to exist before the script runs: in "w" mode Python creates it if it is missing and overwrites its contents if it is already there.
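
To illustrate the point about file creation, here is a small sketch that compares the "w" and "a" modes. The temporary directory and the strings written are assumptions chosen for the demonstration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "text.txt")
existed_before = os.path.exists(path)  # the file has not been created yet

# "w" creates the missing file.
with open(path, "w", encoding="utf-8") as f:
    f.write("first run\n")

# Reopening with "w" discards the previous contents.
with open(path, "w", encoding="utf-8") as f:
    f.write("second run\n")

# "a" keeps the existing contents and adds to the end.
with open(path, "a", encoding="utf-8") as f:
    f.write("appended\n")

with open(path, encoding="utf-8") as f:
    contents = f.read()
print(contents)
```

If the script is meant to accumulate text across several runs, "a" is probably the mode to use; for a fresh file on every run, "w" is enough.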

Eliaz Bobadilla
0

Please try this,

# Assumes `soup` is the BeautifulSoup object created in the question.
with open(file="my_text.txt", mode="w", encoding="UTF-8") as dest_file:
    for paragraph in soup('p')[1:50]:
        dest_file.write(paragraph.text)

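The problem is most likely due to encoding. Python strings are Unicode, but when `open()` is called without an `encoding` argument it uses the platform's locale encoding (for example cp1252 on many Windows systems). That encoding cannot represent every character on the page, so the write can fail with a `UnicodeEncodeError`. Passing `encoding="UTF-8"` explicitly should do the trick. Please feel free to reach out if the issue persists.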
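
A minimal sketch of how the `encoding` argument affects writing non-ASCII text. It passes `cp1252` explicitly (a common Windows locale encoding, assumed here) so the failure is reproducible on any platform; the `try_write` helper and the sample text are hypothetical.

```python
import os
import tempfile


def try_write(text, encoding):
    """Return True if `text` can be written to a file using `encoding`."""
    path = os.path.join(tempfile.mkdtemp(), "sample.txt")
    try:
        with open(path, "w", encoding=encoding) as out_file:
            out_file.write(text)
    except UnicodeEncodeError:
        # The codec has no byte representation for some character.
        return False
    return True


# Sample text containing characters outside cp1252 (assumed for illustration).
text = "Santiago (聖地牙哥)"
cp1252_ok = try_write(text, "cp1252")
utf8_ok = try_write(text, "utf-8")
print(cp1252_ok, utf8_ok)
```

UTF-8 can encode every Unicode character, which is why naming it explicitly makes the script behave the same way regardless of the machine's locale settings.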

Thanks.