Questions tagged [apache-atlas]

Apache Atlas is a data governance and metadata framework for Hadoop. Use for questions about setting up Atlas, the REST APIs, bridges, or problems encountered using Atlas.

Data Governance and Metadata framework for Hadoop

Features

  • Data Classification

Import or define taxonomy business-oriented annotations for data. Define, annotate, and automate capture of relationships between data sets and underlying elements, including source, target, and derivation processes. Export metadata to third-party systems.

  • Centralized Auditing

Capture security access information for every application, process, and interaction with data. Capture the operational information for execution, steps, and activities.

  • Search & Lineage (Browse)

Pre-defined navigation paths to explore the data classification and audit information. Text-based search features locate relevant data and audit events across the Data Lake quickly and accurately. Browse visualization of data set lineage, allowing users to drill down into operational, security, and provenance-related information.

  • Security & Policy Engine

Rationalize compliance policy at runtime based on data classification schemes, attributes, and roles. Advanced definition of policies (prohibitions) for preventing data derivation based on classification (i.e. re-identification). Column- and row-level masking based on cell values and attributes.

107 questions
6
votes
2 answers

CuratorFrameworkImpl - Background exception was not retry-able or retry gave up

Curator framework version - 4.3.0, Zookeeper version - 5.5.0. We have deployed Apache Atlas on Kubernetes, and it uses ZooKeeper to elect one of the two Atlas pods as leader. We are running three ZooKeeper pods (3-node cluster) and one pod going…
5
votes
3 answers

import metadata from RDBMS into Apache Atlas

I am learning Atlas and trying to find a way to import metadata from an RDBMS (such as SQL Server or PostgreSQL). Could somebody provide references or steps to do it? I am using Atlas in Docker with built-in HBase and Solr. Intention is to import…
Irshad Ali
  • 1,153
  • 1
  • 13
  • 39
4
votes
2 answers

Simple example for adding relationships between Atlas entities?

What is the correct way to use the REST API to add a relationship between entities in Apache Atlas? Looking at the docs for the REST API, I find it difficult to tell what some of the fields mean, which are required or not (and what happens if they…
lampShadesDrifter
  • 3,925
  • 8
  • 40
  • 102
4
votes
2 answers

Running Apache Atlas standalone

I am trying to run Apache Atlas in a standalone fashion on Ubuntu, meaning without having to set up Solr and/or HBase. What I did (according to the documentation: http://atlas.apache.org/0.8.1/InstallationSteps.html) was cloning the Git repository,…
Daniel
  • 2,409
  • 2
  • 26
  • 42
3
votes
1 answer

Name relationship link between two different types in Apache Atlas

I am trying to name the relationship link (by using attributeDefs) between two different types. The relationship is now registered in Atlas, and fetching the definition returns the following: { "category": "RELATIONSHIP", "guid":…
s_mj
  • 530
  • 11
  • 28
3
votes
0 answers

Apache Atlas Production setup using Managed Cassandra and Apache Solr

I am looking for documentation or any other resource to set up Apache Atlas for production using independent Solr and Cassandra instances. The goal is to have persistent storage for Apache Atlas.
3
votes
0 answers

How to manually / programmatically add arbitrary data files to Apache Atlas?

Is there any way to add arbitrary data in HDFS to Apache Atlas? Having installed HDP 3.1 for evaluation, this appears to not be possible (e.g. only data that is sqooped in, placed in a Hive table, or some other narrow set of Atlas-visible…
lampShadesDrifter
  • 3,925
  • 8
  • 40
  • 102
3
votes
0 answers

Apache Atlas won't start with separate ensemble of Zookeeper

I am setting up Apache Atlas, with an existing ensemble of Zookeeper, but using local HBase / Solr. However, HBase still tries to kick off its own ZK ensemble rather than use the existing one. I am trying to get Apache Atlas running with a separate…
Rafeski
  • 31
  • 4
3
votes
1 answer

How to install Apache Atlas 1.1.0 on MacOS Mojave?

I am trying to set up Apache Atlas on my system. I am encountering the following error. I read the article as suggested (http://cwiki.apache.org/confluence/display/MAVEN/PluginConfigurationException) but I was unable to understand what it was trying…
3
votes
0 answers

apache-atlas installation on MS Azure HDInsight spark-cluster

I have installed the Apache Atlas (embedded/non-prod) version on an MS Azure HDInsight Spark cluster and it is functional. However, I am not able to start Apache Atlas on a production-ready/multinode cluster. Has anybody done that? Is it a good idea to…
2
votes
0 answers

Create data lineage on yugabyte db thru apache atlas

Not many resources are available online, but I wanted to create a data lineage system for data sourced from Yugabyte DB through Apache Atlas. Any pointers are appreciated. For example, below is the process that I have: [TABLE A] --python function--> [TABLE…
2
votes
0 answers

How to import MySQL metadata to Apache Atlas Without Storing Hive

How to import MySQL metadata to Apache Atlas using Sqoop or Spark? I need to show the actual RDBMS data types and columns in Apache Atlas. I tried MySQL to Hive using Sqoop and MySQL to Hive using Spark. I don't want to show MySQL to Hive; it's only showing…
2
votes
1 answer

Apache Airflow and Apache Atlas Timeout

I am running Apache Airflow in AWS ECS and I am running Apache Atlas on EC2. I have been able to connect a local instance of Apache Airflow to Apache Atlas on EC2; however, I am not able to connect my AWS ECS instance and EC2 instance. I get the…
r123
  • 33
  • 1
  • 5
2
votes
0 answers

Apache Atlas: Http 503 Service Unavailable Error when connecting from Java Client

I am running the Atlas Docker image on macOS. The admin dashboard works fine and I can create/manage entities from the dashboard. But when I try to run the sample Java app provided with the Atlas source code, I get "Http 503 Service…
mono1234
  • 21
  • 2
2
votes
1 answer

spark-atlas-connector: "SparkCatalogEventProcessor-thread" class not found exception

After following the instructions for spark-atlas-connector, I am getting the error below while running simple code to create a table in Spark. Versions: Spark2 2.3.1, Atlas 1.0.0. Batch cmd is: spark-submit --jars…