
We want to implement Great Expectations in Databricks with conditional expectations. According to GE's documentation (https://docs.greatexpectations.io/docs/reference/expectations/conditional_expectations), conditional expectations are currently only available for pandas: the condition_parser argument must be set to "pandas", which in turn requires the pandas-style row_condition syntax. Other engines may be supported in the future.

Does anyone know whether this can be implemented in Spark 3.2.1, which integrates with the pandas API? If not, is there any suggestion for handling conditional expectations in Databricks with Spark 3.2.1?
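
For reference, the pandas-only syntax described in that page looks roughly like this (a minimal sketch; the expectation and the column names are just examples, and `validator` stands for a GE Validator bound to a pandas DataFrame):

```python
# Minimal sketch of a conditional expectation with the pandas engine.
# Column names ("email", "customer_status") are placeholders.
validator.expect_column_values_to_not_be_null(
    column="email",
    row_condition='customer_status == "active"',  # pandas .query()-style filter
    condition_parser="pandas",                     # only pandas is supported per the docs
)
```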

fullysane

1 Answer


This functionality is now experimentally supported with Spark. The documentation is still being updated, but you should now be able to set a row_condition on an Expectation against a Spark datasource by passing great_expectations__experimental__ as the condition_parser.
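
A minimal sketch of what that looks like, assuming `validator` is a Great Expectations Validator already bound to a Spark DataFrame batch (the column names here are placeholders, not part of the original answer):

```python
# Hedged sketch: conditional expectation against a Spark datasource.
# "email" and "customer_status" are example column names.
result = validator.expect_column_values_to_not_be_null(
    column="email",
    row_condition='col("customer_status") == "active"',      # GE experimental condition syntax
    condition_parser="great_expectations__experimental__",   # enables row_condition on Spark
)
print(result.success)
```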

GX_austin
  • How about SqlAlchemy engine? – Jahjajaka Jun 22 '22 at 14:49
  • Exactly the same, though you may experience unexpected behavior depending on the dialect, and you will need to explicitly declare columns. For example, a valid PostgreSQL row_condition would look like `'col("Age")>35'` – GX_austin Jul 01 '22 at 17:23
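
Expanding on that last comment, a hedged sketch of how such a row_condition might be passed for a SQLAlchemy-backed (e.g. PostgreSQL) validator; the expectation and the "Income" column are only illustrative, while "Age" comes from the comment above:

```python
# Illustrative only: assumes `validator` wraps a SQLAlchemy (e.g. PostgreSQL) batch.
result = validator.expect_column_values_to_be_between(
    column="Income",                                          # made-up column for the example
    min_value=0,
    row_condition='col("Age")>35',                            # only rows where Age > 35 are checked
    condition_parser="great_expectations__experimental__",    # same parser as for Spark
)
print(result.success)
```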