Koalas has been officially included in PySpark as **pandas API on Spark** since Apache Spark 3.2. In Spark 3.2+, you no longer need to import koalas separately, because it ships with pyspark. The only required action is to add pandas and pyarrow, which are required dependencies that Code Repositories don't include by default. You can do so via the Libraries tab.
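
If you want to double-check that both dependencies actually resolved after adding them, a small sketch like the one below may help. The helper name `missing_dependencies` is hypothetical (it is not part of PySpark or Foundry); it only uses the standard library's `importlib.util.find_spec` to see whether each package can be found.

```python
from importlib.util import find_spec


def missing_dependencies(names):
    """Return the top-level package names that cannot be found.

    Hypothetical helper for illustration only.
    """
    return [name for name in names if find_spec(name) is None]


# pandas and pyarrow are the two packages the pandas API on Spark relies on
missing = missing_dependencies(["pandas", "pyarrow"])
if missing:
    print("Add these via the Libraries tab:", ", ".join(missing))
else:
    print("pandas and pyarrow are both available")
```

An empty list means both packages are importable in the environment where the code runs.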

You can confirm that the pandas API on Spark works using this test transform:

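```python
from transforms.api import transform_df, Output


@transform_df(
    Output("OUTPUT_DATASET_PATH"),
)
def compute():
    # pandas API on Spark ships with PySpark 3.2+
    import pyspark.pandas as ps

    psdf = ps.DataFrame(
        {'a': [1, 2, 3, 4, 5, 6],
         'b': [100, 200, 300, 400, 500, 600],
         'c': ["one", "two", "three", "four", "five", "six"]},
        index=[10, 20, 30, 40, 50, 60])
    # Convert back to a Spark DataFrame so the transform can write it out
    return psdf.to_spark()
```
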
To make sure you are using Spark 3.2+ in your Code Repository, merge any pending upgrade PRs. Prior to Spark 3.2, it was possible to import koalas through the Libraries tab.
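
If you would rather check the Spark version from code, the following is a minimal sketch. The function name `has_pandas_api_on_spark` is hypothetical; it simply compares the major and minor parts of a version string against 3.2, and `pyspark.__version__` is read only if pyspark is installed.

```python
def has_pandas_api_on_spark(version: str) -> bool:
    """Return True if the given Spark version string is 3.2 or newer.

    Hypothetical helper for illustration only.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (3, 2)


try:
    import pyspark

    print(pyspark.__version__, has_pandas_api_on_spark(pyspark.__version__))
except ImportError:
    print("pyspark is not installed in this environment")
```

If this reports a version below 3.2, the repository most likely still has an upgrade PR waiting to be merged.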