Apache Atlas is a data governance and metadata framework for Hadoop. Use for questions about setting up Atlas, the REST APIs, bridges, or problems encountered using Atlas.
Data Governance and Metadata framework for Hadoop
Features
- Data Classification
Import or define business-oriented taxonomy annotations for data. Define, annotate, and automate the capture of relationships between data sets and underlying elements, including source, target, and derivation processes. Export metadata to third-party systems.
- Centralized Auditing
Capture security access information for every application, process, and interaction with data. Capture the operational information for execution, steps, and activities.
- Search & Lineage (Browse)
Pre-defined navigation paths to explore the data classification and audit information. Text-based search locates relevant data and audit events across the data lake quickly and accurately. Browse visualizations of data set lineage, allowing users to drill down into operational, security, and provenance-related information.
- Security & Policy Engine
Rationalize compliance policy at runtime based on data classification schemes, attributes, and roles. Advanced definition of prohibition policies that prevent data derivation based on classification (i.e. re-identification). Column- and row-level masking based on cell values and attributes.
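As a rough illustration of how the classification and search features are typically reached through the REST API, the sketch below builds a request body for defining a classification and a URL for searching by it. It assumes the v2 endpoints `api/atlas/v2/types/typedefs` and `api/atlas/v2/search/basic` and a local server on port 21000. The helper names and the `PII` classification are hypothetical, and the sketch only constructs the request data without sending anything, so check the paths and parameters against the REST documentation for your Atlas version.

```python
import json
from urllib.parse import urlencode, urljoin

# Assumption: a local Atlas server on port 21000; adjust for your deployment.
ATLAS_BASE_URL = "http://localhost:21000/"


def classification_typedef_payload(name, description=""):
    """Build a JSON body intended for POST api/atlas/v2/types/typedefs.

    Hypothetical helper: it defines one classification with no
    super types and no attributes.
    """
    return {
        "classificationDefs": [
            {
                "name": name,
                "description": description,
                "superTypes": [],
                "attributeDefs": [],
            }
        ]
    }


def basic_search_url(classification, type_name=None, limit=10):
    """Build a GET URL for a basic search filtered by classification.

    Hypothetical helper: the query parameter names are assumed from
    the v2 basic search endpoint.
    """
    params = {"classification": classification, "limit": limit}
    if type_name:
        params["typeName"] = type_name
    endpoint = urljoin(ATLAS_BASE_URL, "api/atlas/v2/search/basic")
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    payload = classification_typedef_payload(
        "PII", "Personally identifiable information"
    )
    print(json.dumps(payload, indent=2))
    print(basic_search_url("PII", type_name="hive_table"))
```

The payload and URL can then be passed to any HTTP client along with the credentials your Atlas instance requires.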