I am configuring a cluster and running a task written in Python on it.
The AMI version I am using is emr-4.7.0, with Spark as the only application selected for installation.
Before the task was executed, the following bootstrap action was run:
sudo pip install pymongo
sudo pip install py2neo
sudo pip install pymysql
sudo pip install pandas
(plus a step to download the code from S3)
After that, I connected to the master node over SSH and executed my Spark application (a Python script). I then got the following error:
ImportError: C extension: hashtable not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace' to build the C extensions first.
I have tried several things, with no luck. Any hint would be appreciated.
Additional info (edited): The Python version is 2.7, and it is the only version installed on the node. pip is already upgraded to the newest version:
sudo pip install --upgrade pip
Requirement already up-to-date: pip in /usr/local/lib/python2.7/site-packages
Interestingly, the problem also occurs when I import pandas with no Spark involved at all. In ipython I executed:
import pandas as pd
and the first time I got:
/usr/lib64/python2.7/locale.pyc in _parse_localename(localename)
473 elif code == 'C':
474 return None, None
--> 475 raise ValueError, 'unknown locale: %s' % localename
476
477 def _build_localename(localetuple):
ValueError: unknown locale: UTF-8
but when I executed the same statement a second time I got:
ImportError Traceback (most recent call last)
<ipython-input-2-af55e7023913> in <module>()
----> 1 import pandas as pd
/usr/local/lib64/python2.7/site-packages/pandas/__init__.py in <module>()
29 "pandas from the source directory, you may need to run "
30 "'python setup.py build_ext --inplace' to build the C "
---> 31 "extensions first.".format(module))
32
33 from datetime import datetime
ImportError: C extension: hashtable not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace' to build the C extensions first.
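Since the first traceback comes from Python's locale parsing rather than from pandas itself, it can help to look at the locale-related environment variables of the session. The following is only a minimal sketch: `locale_snapshot` and `effective_locale` are hypothetical helper names, and the lookup order is the one CPython's `locale.getdefaultlocale()` uses. A bare value such as `UTF-8` (no language part) in the first variable that is set would match the `unknown locale: UTF-8` message.

```python
import os

# Order in which CPython's locale.getdefaultlocale() consults the environment.
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG", "LANGUAGE")


def locale_snapshot(environ=os.environ):
    """Return the locale-related variables that are set (hypothetical helper)."""
    return {name: environ[name] for name in LOCALE_VARS if environ.get(name)}


def effective_locale(environ=os.environ):
    """Return (variable, value) for the first non-empty locale variable, or None."""
    for name in LOCALE_VARS:
        value = environ.get(name)
        if value:
            return name, value
    return None


if __name__ == "__main__":
    # Show what the current session carries, e.g. after logging in over SSH.
    print(locale_snapshot())
    print(effective_locale())
```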
I finally found the solution, and it was simpler than I expected. After the last update to the question I realized that the source of the error is the locale, which can be easily fixed with:
export LANG=es_ES.UTF-8
export LC_CTYPE=es_ES.UTF-8
export LC_ALL=es_ES.UTF-8
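If exporting the variables in the shell is not convenient (for example, when the script is launched by something other than an interactive login), the same values can probably be set from inside the Python script before pandas is imported. This is a sketch under assumptions, not a confirmed fix: `apply_locale` is a hypothetical helper, `es_ES.UTF-8` is simply the locale used above, and it assumes the locale lookup happens at import time and reads `os.environ`.

```python
import os

# Locale name taken from the exports above; any locale installed on the node
# should work the same way (assumption).
LOCALE_NAME = "es_ES.UTF-8"


def apply_locale(environ, locale_name=LOCALE_NAME):
    """Set LANG, LC_CTYPE and LC_ALL in the given mapping (hypothetical helper)."""
    for var in ("LANG", "LC_CTYPE", "LC_ALL"):
        environ[var] = locale_name
    return environ


def import_pandas_with_locale():
    """Fix the environment first, then import pandas."""
    apply_locale(os.environ)
    import pandas as pd  # imported only after the locale variables are set
    return pd
```

Calling `import_pandas_with_locale()` at the top of the script, before any other module imports pandas, mirrors the three `export` lines. Setting `os.environ` only affects the current process and the children it starts, so Spark executors on other nodes may still need the variables set in their own environment.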