I'm hitting a strange error, so I'll try to reduce the problem to a minimal case. I have a simple function that scrapes a URL with Beautiful Soup and returns a list. I then pickle that list to a file, calling sys.setrecursionlimit(10000) first to avoid a RecursionError. Up to that point, everything works.
But when I try to unpickle the list, I get this error:
Traceback (most recent call last):
  File ".\scrap_index.py", line 86, in <module>
    data_file = pickle.load(data)
TypeError: __new__() missing 1 required positional argument: 'name'
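For what it's worth, the same kind of message can be produced without bs4 at all. The sketch below is only a guess at the mechanism: `Tagged` is a hypothetical class I made up, not something taken from bs4, and I have not confirmed which object in the soup is actually responsible. It shows that a `str` subclass whose `__new__` requires an extra argument pickles fine with protocol 2 but fails on load, because unpickling calls `__new__` with only the string value.

```python
import pickle


class Tagged(str):
    """Hypothetical str subclass whose constructor needs two arguments."""

    def __new__(cls, prefix, name):
        # Build the underlying string from both parts, e.g. "xlink:href".
        return str.__new__(cls, prefix + ":" + name)


value = Tagged("xlink", "href")

# Dumping works: str supplies __getnewargs__, which returns only the text.
payload = pickle.dumps(value, protocol=2)

# Loading calls Tagged.__new__(Tagged, "xlink:href"), so 'name' is missing.
try:
    pickle.loads(payload)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If something similar lives inside the parsed tree for the second URL but not the first, that would explain why only one of them fails.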
Here is my function:
import urllib.request
from bs4 import BeautifulSoup
def scrap_function(url):
    page = urllib.request.urlopen(url)
    soup = BeautifulSoup(page, "html5lib")
    return [soup]
For testing, I tried different URLs. With this one, everything works:
url_ok = 'https://www.boursorama.com/bourse/'
But with this one, I get the TypeError:
url_not_ok = 'https://www.boursorama.com/bourse/actions'
And here is the test code:
import pickle
import sys
sys.setrecursionlimit(10000)
scrap_list = scrap_function(url_not_ok)
with open('test_saving.pkl', 'wb') as data:
    pickle.dump(scrap_list, data, protocol=2)
with open('test_saving.pkl', 'rb') as data:
    data_file = pickle.load(data)
print(data_file)
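A workaround I could fall back on, as a sketch only: pickle the markup as plain strings instead of the soup objects, and re-parse after loading. The helper names `save_markup` and `load_markup` are hypothetical, and the sample HTML stands in for a real page so the snippet runs without network access or bs4. With real data I would pass the list returned by `scrap_function`, since `str(soup)` gives the serialized document.

```python
import os
import pickle
import tempfile


def save_markup(documents, path):
    """Pickle each document as a plain str (str(soup) yields its markup)."""
    markup = [str(doc) for doc in documents]
    with open(path, "wb") as handle:
        pickle.dump(markup, handle, protocol=2)


def load_markup(path):
    """Return the list of markup strings stored by save_markup."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Stand-in for a scraped page; plain strings pass through str() unchanged.
html = "<html><body><p>hello</p></body></html>"
path = os.path.join(tempfile.mkdtemp(), "test_saving.pkl")

save_markup([html], path)
restored = load_markup(path)
print(restored)

# After loading, each string could be parsed again, for example with
# BeautifulSoup(restored[0], "html5lib").
```

Plain strings are not deeply nested, so this route should not need a raised recursion limit either. It does not tell me why unpickling the soup itself fails, though.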