I am using Jupyter Notebook in Anaconda and trying to use the R package pvclust to perform hierarchical clustering on my data. My code:
from rpy2.robjects import r, pandas2ri
from rpy2.robjects.packages import importr
pandas2ri.activate()
base = importr("base")
pvclust = importr("pvclust")
But I got the error:
RRuntimeError Traceback (most recent call last)
<ipython-input-51-291b18105962> in <module>()
3 pandas2ri.activate()
4 base = importr("base")
----> 5 pvclust = importr("pvclust")
6 # data = robjects.DataFrame.from_csvfile(filepath + folders[0] + '\\vcfA_filled.csv')
7 # data
~\Anaconda3\lib\site-packages\rpy2-2.9.1-py3.6-win-amd64.egg\rpy2\robjects\packages.py in importr(name, lib_loc, robject_translations, signature_translation, suppress_messages, on_conflict, symbol_r2python, symbol_check_after, data)
451 if _package_has_namespace(rname,
452 _system_file(package = rname)):
--> 453 env = _get_namespace(rname)
454 version = _get_namespace_version(rname)[0]
455 exported_names = set(_get_namespace_exports(rname))
RRuntimeError: Error in loadNamespace(name) : there is no package called 'pvclust'
It seems I need to install pvclust first. But I am using a Jupyter notebook (Python 3.6) launched from Anaconda, and I am not sure how to get an R package like this installed so that I can then import it through rpy2. How do I do that?
P.S. Is there a Python package that can perform hierarchical clustering with p-values? All I need is a function that bootstraps my data and clusters it with p-values.
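To make the P.S. concrete, here is a rough sketch of the kind of thing I am after, assuming NumPy and SciPy are available. It computes only the plain bootstrap proportion for each cluster (the fraction of resamples in which the same set of members reappears), which is not the same as the approximately unbiased p-value that pvclust reports. The function names clusters_from_linkage and bootstrap_support are made up for illustration, and the Euclidean/average-linkage defaults are my own assumption.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist


def clusters_from_linkage(Z, n_items):
    """Return the member set (a frozenset of item indices) created by each merge in Z."""
    members = {i: frozenset([i]) for i in range(n_items)}
    merged = []
    for step, (a, b, _dist, _size) in enumerate(Z):
        new = members[int(a)] | members[int(b)]
        members[n_items + step] = new  # SciPy numbers new clusters n_items, n_items + 1, ...
        merged.append(new)
    return merged


def bootstrap_support(data, n_boot=200, method="average", metric="euclidean", seed=0):
    """Cluster the columns of `data` and estimate bootstrap support for each cluster.

    data: 2-D array, rows = observations (resampled), columns = items (clustered).
    Returns the linkage matrix, the clusters of the original tree, and for each
    cluster the fraction of bootstrap trees that contain the same member set.
    """
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    n_obs, n_items = data.shape

    # Tree on the original data.
    Z = linkage(pdist(data.T, metric=metric), method=method)
    reference = clusters_from_linkage(Z, n_items)

    counts = np.zeros(len(reference))
    for _ in range(n_boot):
        rows = rng.integers(0, n_obs, size=n_obs)  # resample rows with replacement
        Zb = linkage(pdist(data[rows].T, metric=metric), method=method)
        found = set(clusters_from_linkage(Zb, n_items))
        counts += [cluster in found for cluster in reference]

    return Z, reference, counts / n_boot
```

Calling Z, clusters, support = bootstrap_support(my_array) would give one support value per merge in the dendrogram, in the same order as the rows of Z. A proper replacement for pvclust would still need the multiscale bootstrap on top of this.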
Thanks a lot.