2

I want to download the gensim glove-wiki-gigaword-100 dataset. Here's my code

import gensim.downloader as api
model = api.load("glove-wiki-gigaword-100")

But I'm receiving this error

ValueError: unable to read local cache '/Users/xxx/gensim-data/information.json' during fallback, connect to the Internet and retry

I checked the gensim version in my terminal and got the following, so I think it's installed:

pip3 show gensim
Name: gensim
Version: 3.8.3
Summary: Python framework for fast Vector Space Modelling
Home-page: http://radimrehurek.com/gensim
Author: Radim Rehurek
Author-email: me@radimrehurek.com
License: LGPLv2.1
Location: /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages
Requires: smart-open, scipy, numpy, six
Required-by: 

I can't figure this out. I restarted my laptop and reset the router. I don't think this issue is related to my internet connection, despite what the error says.

llamaro25

3 Answers

1

I might be wrong, but I think it's because gensim does not support Python 3.8. I downgraded to 3.6 and the problem went away.

llamaro25
  • 2
    Gensim aims to support Python 3.8, so if you've found an error unique to Python 3.8, you could file a bug report. (It'd be especially important to include the full error stack around the `ValueError`, to show which lines of code are involved in the error.) More generally, I recommend **against** using `gensim.api` functionality, as it downloads not just data but also extra Python code that's outside of normal version-control & PyPI-packaging, & so involves extra risks of getting opaque/malicious code. Any dataset available via that `api` should also be findable for plain web download elsewhere. – gojomo Oct 19 '20 at 16:53
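
As a rough sketch of the plain-download route the comment above mentions: assuming the vectors were fetched separately as a GloVe-style text file (one token per line, followed by its space-separated float components), they can be parsed with the standard library alone. The helper name `load_glove_text` and the file name in the usage note are assumptions for illustration, not part of gensim.

```python
def load_glove_text(path, limit=None):
    """Parse a GloVe-style text file into a {token: [floats]} dict.

    Assumes each line looks like: "<token> <float> <float> ...".
    `limit` optionally stops after the first `limit` lines.
    """
    vectors = {}
    with open(path, "r", encoding="utf-8") as fin:
        for i, line in enumerate(fin):
            if limit is not None and i >= limit:
                break
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                # Skip blank or malformed lines rather than failing.
                continue
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors
```

Usage would be something like `vectors = load_glove_text("glove.6B.100d.txt", limit=50000)`, where the file name is assumed to be whatever you downloaded. This gives plain Python lists rather than a gensim model, so similarity queries would need to be built on top.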
0

Note that you can download this model from within PyCharm:

import gensim.downloader as api
model = api.load("glove-wiki-gigaword-100")

but gensim did not work for me on Python 3.8, so you can downgrade to another version of Python, such as 3.4, 3.5, or 3.6.
When I checked, the model had been downloaded, but gensim was still not working.

Sven Eberth
Alex
0

I hit the same issue. It worked on 3.6 but not on 3.8 after an OS upgrade. One problem is that this snippet, as is, gives no helpful debug information beyond the ValueError. Let's fix that.

Python 3.8.10 (v3.8.10:3d8993a744, May  3 2021, 08:55:58) 
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import logging
>>> logging.basicConfig()
>>> import gensim.downloader as api
>>> model = api.load("glove-wiki-gigaword-100")

Adding the above reveals that it is actually an SSL error.

ERROR:gensim.downloader:caught non-fatal exception while trying to update gensim-data cache from 'https://raw.githubusercontent.com/RaRe-Technologies/gensim-data/master/list.json'; using local cache at '/Users/marbron/gensim-data/information.json' instead
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1354, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1007, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 947, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1421, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 199, in _load_info
    info_bytes = urlopen(url).read()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1397, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/urllib/request.py", line 1357, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)>
Traceback (most recent call last):
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 219, in _load_info
    with io.open(cache_path, 'r', encoding=encoding) as fin:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/marbron/gensim-data/information.json'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 490, in load
    file_name = _get_filename(name)
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 426, in _get_filename
    information = info()
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 268, in info
    information = _load_info()
  File "/Users/marbron/Projects/hcm-matchfox-testing/.py38/lib/python3.8/site-packages/gensim/downloader.py", line 222, in _load_info
    raise ValueError(
ValueError: unable to read local cache '/Users/marbron/gensim-data/information.json' during fallback, connect to the Internet and retry

There are several good answers on SO on how to solve the above, but in short:

  • pip install -U certifi
  • /Applications/Python 3.X/Install Certificates.command

Then execute the snippet above again and the model will be downloaded.
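
To check whether missing CA certificates are the likely cause before or after running those steps, a minimal stdlib-only sketch is shown below. It assumes a python.org-style build whose OpenSSL expects a CA bundle at a fixed path; the helper name `describe_ca_paths` is hypothetical.

```python
import os
import ssl


def describe_ca_paths():
    """Report where this Python's OpenSSL looks for CA certificates."""
    paths = ssl.get_default_verify_paths()
    report = {
        # Resolved locations; None when the file or directory is missing.
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Compiled-in defaults, reported whether or not they exist.
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
    }
    report["openssl_cafile_exists"] = os.path.exists(paths.openssl_cafile)
    return report


if __name__ == "__main__":
    for key, value in describe_ca_paths().items():
        print(f"{key}: {value}")
```

If `cafile` is `None` and `openssl_cafile_exists` is `False`, Python may have no CA bundle to verify against, which would be consistent with the `CERTIFICATE_VERIFY_FAILED` error in the log above.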

m. bron