I'm developing some code that runs on Databricks. Since Databricks can't be run locally, I need to run unit tests on a Databricks cluster. The problem is that when I install the wheel that contains my files, the test files are never installed. How do I install the test files?
Ideally I would like to keep src and tests in separate folders.
Here is my project's (pyproject.toml only) folder structure:
project
├── src
│   └── mylib
│       ├── functions.py
│       └── __init__.py
├── pyproject.toml
├── poetry.lock
└── tests
    ├── conftest.py
    └── test_functions.py
My pyproject.toml:
[tool.poetry]
name = "mylib"
version = "0.1.0"
packages = [
{include = "mylib", from = "src"},
{include = "tests"}
]
[tool.poetry.dependencies]
python = "^3.8"
pytest = "^7.1.2"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
Without {include = "tests"} in pyproject.toml, poetry build doesn't include tests.
After poetry build I can see that the tests are included in the wheel produced (python3 -m wheel unpack <mywheel.whl>). But after I deploy it as a library on a Databricks cluster, I do not see any tests folder (ls -r .../site-packages/mylib* in a Databricks notebook shell cell), though functions.py is installed.
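For reference, the wheel check above can also be done without unpacking. This is only a minimal sketch using the standard library zipfile module; the helper name top_level_entries and the wheel file name in the usage comment are my own assumptions, not something Poetry or Databricks provides.

```python
import zipfile


def top_level_entries(wheel_path):
    """Return the sorted top-level names found inside a wheel archive.

    A wheel is a zip file, so the first path component of every member
    shows which directories an installer would place in site-packages.
    """
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted({name.split("/", 1)[0] for name in wheel.namelist()})


# Hypothetical usage (the file name is assumed, adjust to the real wheel):
# print(top_level_entries("dist/mylib-0.1.0-py3-none-any.whl"))
```

Each name it returns is a separate top-level entry in the archive, so it is worth comparing that list against the names actually searched for under site-packages.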
I also tried moving tests under src and updating the toml to {include = "tests", from = "src"}, but then the wheel file produced contains mylib & tests with the appropriate files, yet only mylib gets installed on Databricks.
project
├── src
│   ├── mylib
│   │   ├── functions.py
│   │   └── __init__.py
│   └── tests
│       ├── conftest.py
│       └── test_functions.py
├── pyproject.toml
└── poetry.lock
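To check from a notebook cell what actually got installed, rather than relying on ls, something like the following could be run on the cluster. It is a rough sketch built on importlib.util.find_spec from the standard library; the helper name find_installed is hypothetical, and the result for "tests" may be misleading if another directory or package with that name happens to be on sys.path.

```python
import importlib.util


def find_installed(name):
    """Return the paths a top-level module or package resolves to.

    Returns an empty list when the name cannot be imported at all.
    Packages (including namespace packages without __init__.py) report
    their search locations; plain modules report their origin file.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return []
    if spec.submodule_search_locations is not None:
        return list(spec.submodule_search_locations)
    return [spec.origin]


# Hypothetical usage in a notebook cell:
# for name in ("mylib", "tests"):
#     print(name, find_installed(name))
```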
Since someone is pointing to dbx as the solution: I've tried it, and it doesn't work. It has a bunch of basic restrictions (e.g. it requires the ML runtime), which render it useless, not to mention that it expects you to use whatever toolset it recommends. Perhaps in a few years it will do what this post needs.