My question is based upon this.
- Would it be possible more detailed comments/explain code starting
line
tf = HashingTF().transform( training_raw.map(lambda doc: doc["text"], preservesPartitioning=True))
- How could I print the confusion matrix?
What does below error mean? How can I fix it? The model still gets built and I get predictions
>>> # Train and check ... model = NaiveBayes.train(training) [Stage 2:=============================> (2 + 2) / 4]16/04/05 18:18:28 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS 16/04/05 18:18:28 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
How could I print results for the new observation. I tried and failed
>>> model.predict("love") Traceback (most recent call last): File "<stdin>", line 1, in <module> File "c:\spark-1.6.1-bin-hadoop2.6\spark-1.6.1-bin-hadoop2.6\python\pyspark\mllib\classification.py", line 594, in predict x = _convert_to_vector(x) File "c:\spark-1.6.1-bin-hadoop2.6\spark-1.6.1-bin-hadoop2.6\python\pyspark\mllib\linalg\__init__.py", line 77, in _convert_to_vector raise TypeError("Cannot convert type %s into Vector" % type(l)) TypeError: Cannot convert type <class 'str'> into Vector