I need to install pyspark, which has a dependency on pypandoc. If I first run pip install pypandoc and then pip install pyspark, everything looks fine. However, due to some requirements I need to install my dependencies from a requirements.txt file. So I put both pypandoc and pyspark in requirements.txt (pypandoc comes first in the file, followed by pyspark) and run pip install -r requirements.txt. This time the installation fails with the following error:
Complete output from command python setup.py egg_info:
Could not import pypandoc - required to package PySpark
Download error on https://pypi.org/simple/pypandoc/: [Errno 97] Address family not supported by protocol -- Some packages may not be found!
Couldn't find index page for 'pypandoc' (maybe misspelled?)
Download error on https://pypi.org/simple/: [Errno 97] Address family not supported by protocol -- Some packages may not be found!
No local packages or working download links found for pypandoc
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-install-6vmbjchu/pyspark/setup.py", line 224, in <module>
    'Programming Language :: Python :: Implementation :: PyPy']
  File "/usr/local/lib/python3.6/site-packages/setuptools/__init__.py", line 144, in setup
    _install_setup_requires(attrs)
  File "/usr/local/lib/python3.6/site-packages/setuptools/__init__.py", line 139, in _install_setup_requires
    dist.fetch_build_eggs(dist.setup_requires)
  File "/usr/local/lib/python3.6/site-packages/setuptools/dist.py", line 724, in fetch_build_eggs
    replace_conflicting=True,
  File "/usr/local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 782, in resolve
    replace_conflicting=replace_conflicting
  File "/usr/local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1065, in best_match
    return self.obtain(req, installer)
  File "/usr/local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1077, in obtain
    return installer(requirement)
  File "/usr/local/lib/python3.6/site-packages/setuptools/dist.py", line 791, in fetch_build_egg
    return cmd.easy_install(req)
  File "/usr/local/lib/python3.6/site-packages/setuptools/command/easy_install.py", line 673, in easy_install
    raise DistutilsError(msg)
distutils.errors.DistutilsError: Could not find suitable distribution for Requirement.parse('pypandoc')
So it looks like, when I do it this way, pypandoc is not yet installed at the point where pip tries to install pyspark. How can I fix this issue?
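For reference, the manual sequence that works for me (one pip install per package, in order) could probably be reproduced from the same requirements.txt by installing one line at a time. This is only a minimal sketch of that workaround, not a confirmed fix. The helper names read_requirements and install_sequentially are hypothetical, and it assumes the failure happens because pip prepares pyspark before pypandoc has actually been installed.

```python
import subprocess
import sys
from pathlib import Path


def read_requirements(path):
    """Return requirement specifiers in file order.

    Blank lines, comment lines, and option lines (such as "-r other.txt")
    are skipped; this sketch does not try to handle them.
    """
    requirements = []
    for raw_line in Path(path).read_text().splitlines():
        # Drop a trailing inline comment, then surrounding whitespace.
        line = raw_line.split(" #", 1)[0].strip()
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements


def install_sequentially(path, dry_run=False):
    """Run one "pip install <requirement>" per line, in file order.

    Each package is fully installed before the next one is prepared,
    which mirrors running the pip commands by hand. With dry_run=True
    the commands are only built and returned, not executed.
    """
    commands = [
        [sys.executable, "-m", "pip", "install", requirement]
        for requirement in read_requirements(path)
    ]
    if not dry_run:
        for command in commands:
            # check_call raises CalledProcessError if pip exits non-zero.
            subprocess.check_call(command)
    return commands
```

Calling install_sequentially("requirements.txt") would then install pypandoc first and pyspark second, as long as they appear in that order in the file. It gives up pip's single-pass dependency resolution, so I would only treat it as a stopgap.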