I am writing a set of unit tests for a library that depends on and uses PySpark. For that, I am using something like this:
import unittest

import pyspark
from pyspark.sql.types import Row


class PySparkTestCase(unittest.TestCase):

    def setUp(self):
        conf = pyspark.SparkConf().setMaster('local[1]').setAppName("testing")
        self.spark = pyspark.SQLContext(pyspark.SparkContext(conf=conf))

    def tearDown(self):
        self.spark._sc.stop()

    def test_basic(self):
        instance = self.spark.createDataFrame(data=[Row(a=1.0, b='s'), Row(a=1.0, b='s')])
        self.assertEqual(instance.count(), 2)
and executing it (Python 3.7.0, pyspark==2.3.1) as
python -m unittest example
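(In case it matters for the answer: I'd expect the same behaviour with a class-level fixture, i.e. one shared context for the whole TestCase rather than one per test; a sketch using the standard setUpClass/tearDownClass hooks:)

class PySparkTestCase(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        # one SparkContext shared by every test in the class
        conf = pyspark.SparkConf().setMaster('local[1]').setAppName("testing")
        cls.spark = pyspark.SQLContext(pyspark.SparkContext(conf=conf))

    @classmethod
    def tearDownClass(cls):
        cls.spark._sc.stop()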
The test passes, but with one log message and two warnings:
2018-09-08 12:14:15 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/subprocess.py:839: ResourceWarning: subprocess 13349 is still running
  ResourceWarning, source=self)
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/importlib/_bootstrap.py:219: ImportWarning: can't resolve package from __spec__ or __package__, falling back on __name__ and __path__
  return f(*args, **kwds)
Ran 1 test in 16.361s
OK
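The log message itself points at sc.setLogLevel. As far as I can tell, the fixture could call it right after the context comes up, roughly like this, but I'd guess it only affects Spark's log4j output, not the Python warnings (and perhaps not the NativeCodeLoader line either, since that is printed while the context is still starting):

    def setUp(self):
        conf = pyspark.SparkConf().setMaster('local[1]').setAppName("testing")
        sc = pyspark.SparkContext(conf=conf)
        sc.setLogLevel("ERROR")  # raise the log4j threshold above the default "WARN"
        self.spark = pyspark.SQLContext(sc)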
- Should I be worried about any of this?
- If not, how do I suppress this text from showing up?
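For the two Python-level warnings, I assume the standard warnings machinery would hide them, roughly as below, but I'd rather first know whether they indicate a real problem. (Also, python -m unittest manages warning filters itself, so a filter set in setUp might need to be combined with -W or unittest.main(warnings=...).)

import warnings

class PySparkTestCase(unittest.TestCase):

    def setUp(self):
        # mute the interpreter-level warnings; Spark's log4j output is unaffected
        warnings.simplefilter("ignore", ResourceWarning)
        warnings.simplefilter("ignore", ImportWarning)
        conf = pyspark.SparkConf().setMaster('local[1]').setAppName("testing")
        self.spark = pyspark.SQLContext(pyspark.SparkContext(conf=conf))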