
This question is about Great Expectations, a Python module used primarily for data quality checks (I found its documentation inadequate). I've been trying to set up a data context in my notebook, using a local datasource, as described in:

https://docs.greatexpectations.io/en/latest/guides/how_to_guides/configuring_data_contexts/how_to_instantiate_a_data_context_without_a_yml_file.html#how-to-guides-configuring-data-contexts-how-to-instantiate-a-data-context-without-a-yml-file

Here is my code:

from great_expectations.data_context.types.base import DataContextConfig
from great_expectations.data_context.types.base import DatasourceConfig
from great_expectations.data_context.types.base import FilesystemStoreBackendDefaults
from great_expectations.data_context import BaseDataContext

data_context_config = DataContextConfig(
    datasources={
        "debaprc_test": DatasourceConfig(
            class_name="PandasDatasource",
            batch_kwargs_generators={
                "subdir_reader": {
                    "class_name": "SubdirReaderBatchKwargsGenerator",
                    "base_directory": "/Users/debaprc/Downloads"
                }
            },
        )
    },
    store_backend_defaults=FilesystemStoreBackendDefaults(root_directory="/Users/debaprc/GE_Test/New/")
)

context = BaseDataContext(project_config=data_context_config)

And this is the error I get:

base_directory must be an absolute path if root_directory is not provided

What am I doing wrong?

Miguel Trejo

1 Answer


Thank you so much for using Great Expectations. This is a known issue introduced by our latest upgrade of the Checkpoints feature, and it has been fixed on our develop branch. Please install from the develop branch or wait for our next release, 0.13.9, coming this week.
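To see whether an installed release already includes the fix, a small version check along these lines may help. This is only a sketch: the helper names `parse_release` and `has_fix` are made up for illustration, the threshold 0.13.9 is taken from the answer above, and an install from the develop branch may report a version string that this simple comparison does not recognize as fixed.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Release that the answer above says will contain the fix.
FIXED_IN = (0, 13, 9)


def parse_release(version_string):
    """Turn a string such as '0.13.9' into a tuple of three ints."""
    parts = []
    for piece in version_string.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad short versions such as '0.14'
        parts.append(0)
    return tuple(parts)


def has_fix(version_string, fixed_in=FIXED_IN):
    """Return True if the tagged release is at or above the fixed release."""
    return parse_release(version_string) >= fixed_in


if __name__ == "__main__":
    try:
        installed = version("great_expectations")
    except PackageNotFoundError:
        print("great_expectations is not installed")
    else:
        status = "includes" if has_fix(installed) else "predates"
        print(f"Installed version {installed} {status} release 0.13.9")
```

If the check reports an older release, upgrading once 0.13.9 is published (or installing from the develop branch) is the route suggested above.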

aburdi
  • Thanks a lot. I have another question about running GE purely from a notebook. I noticed there is no documentation on how to set up an S3 datasource (in the link mentioned above). Is that something that will be added in a later release? Or, if it is already a feature, could you please share a link? – Debapratim Chakraborty Feb 09 '21 at 04:27
  • Hi @DebapratimChakraborty! Check out [How to configure a Pandas/S3 Datasource](https://docs.greatexpectations.io/en/latest/guides/how_to_guides/configuring_datasources/how_to_configure_a_pandas_s3_datasource.html) and the `Configuring Datasources` section of our How-to Guides for more info on creating Datasources, including on S3. If you have a filesystem, you can use the yml configuration; if not, you may have to translate the yml in our docs to Python. There are a few examples in the link you mentioned. – aburdi Feb 09 '21 at 22:39
  • You may also want to check out our website, which links to our Slack community; it has a channel devoted to support: https://greatexpectations.io/ – aburdi Feb 09 '21 at 22:41
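As a rough illustration of translating the yml from the docs into Python, the sketch below builds the keyword arguments for a Pandas/S3 datasource as a plain dictionary. It rests on assumptions: the generator class name `S3GlobReaderBatchKwargsGenerator` and the `bucket`, `assets`, `prefix` and `regex_filter` keys should be checked against the how-to guide linked above for your installed version, and the helper name, bucket, asset name and prefix are placeholders invented for this example.

```python
def s3_datasource_kwargs(bucket, asset_name, prefix, regex_filter=".*"):
    """Build keyword arguments mirroring a yml Pandas/S3 datasource entry.

    The class and key names are assumptions to verify against the docs.
    """
    return {
        "class_name": "PandasDatasource",
        "batch_kwargs_generators": {
            "pandas_s3_generator": {  # arbitrary generator name
                "class_name": "S3GlobReaderBatchKwargsGenerator",
                "bucket": bucket,
                "assets": {
                    asset_name: {
                        "prefix": prefix,              # key prefix ("folder") in the bucket
                        "regex_filter": regex_filter,  # which keys under the prefix to include
                    },
                },
            }
        },
    }


# Placeholder values, not real resources.
kwargs = s3_datasource_kwargs(
    bucket="my-example-bucket",
    asset_name="my_asset",
    prefix="data/",
)

# With great_expectations installed, the dictionary would be passed the same
# way as the local datasource in the question, for example:
#   DatasourceConfig(**kwargs)
print(kwargs["batch_kwargs_generators"]["pandas_s3_generator"]["bucket"])
```

Keeping the configuration as a plain dictionary makes it easy to compare side by side with the yml in the docs before handing it to `DatasourceConfig`.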