
I'm trying to set up Apache Superset for ClickHouse. My understanding so far is that I need to install SQLAlchemy for ClickHouse: https://github.com/xzkostyan/clickhouse-sqlalchemy

I'm on Ubuntu 16.04 LTS and using the vanilla Docker images of ClickHouse and Superset:

without special settings

Any idea how I can bridge the two Docker containers with clickhouse-sqlalchemy? Where and how should I install it in that case? (A sample command line that I can reuse would be great.)

TylerH
Stephane

2 Answers


You don't need to bridge them: what you want is a superset server (that you happen to be running via docker) to connect to a clickhouse database (that you also happen to be running via docker).

You also shouldn't need to install SQLAlchemy for ClickHouse: looking at the Dockerfile at https://hub.docker.com/r/amancevice/superset/~/dockerfile/, that image already has sqlalchemy-clickhouse installed for you.

Your steps should be as follows:

  • When you docker run --detach --name superset [options] amancevice/superset you should have your superset instance running at http://localhost:8088/

  • Similarly, when you run $ docker run -d --name some-clickhouse-server --ulimit nofile=262144:262144 -v /path/to/your/config.xml:/etc/clickhouse-server/config.xml yandex/clickhouse-server you should end up with a ClickHouse instance that you can access via SQLAlchemy at something like clickhouse://default:@some-clickhouse-server/test. You'd need to modify that connection URI based on your config.xml, and you should be able to double-check that it works by connecting to it from your Python console.

  • You should then be able to connect superset to your clickhouse db in the same way you'd connect to any other DB: by navigating into Superset's menu > Sources > Databases > [new]
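The double-check from the Python console mentioned in the steps above could look roughly like this. It is a minimal sketch, not part of the original answer: `describe_uri` and `check_connection` are hypothetical helper names, the URI is the example one from the list, and `check_connection` assumes SQLAlchemy plus a ClickHouse dialect are installed where it runs (for example inside the Superset container).

```python
from urllib.parse import urlsplit


def describe_uri(uri):
    """Split a SQLAlchemy-style URI into its parts (hypothetical helper)."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


def check_connection(uri):
    """Run SELECT 1 against the database behind `uri`.

    Assumes SQLAlchemy and a ClickHouse dialect are installed; the import is
    kept local so describe_uri() works without them.
    """
    from sqlalchemy import create_engine, text

    engine = create_engine(uri)
    with engine.connect() as conn:
        return conn.execute(text("SELECT 1")).scalar()


if __name__ == "__main__":
    uri = "clickhouse://default:@some-clickhouse-server/test"
    print(describe_uri(uri))
    # Only meaningful where the host name resolves, e.g. from a container
    # that can reach the ClickHouse one:
    # print(check_connection(uri))
```

If the host name does not resolve from where you run this, substituting the container's IP address in the URI is one option to try.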

David Tobiano
  • Hi, it works as you described (for Superset I just had to check the IP with "docker inspect superset" and use that as the host name for web access). Thank you! – Stephane Dec 01 '17 at 09:31

Consider using the ready-made, preconfigured docker-compose.yml that is included in Apache Superset (see https://github.com/apache/superset/blob/master/docker-compose.yml).

To work with ClickHouse, a SQLAlchemy driver must be installed. There are two of them:

I recommend clickhouse-sqlalchemy because it is actively maintained and developed, and it supports both protocols available for interacting with ClickHouse: HTTP and TCP (the native protocol).
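To see whether the driver is actually available in a given Python environment (for instance inside the Superset container), something like the following could be used. This is only a sketch: `driver_installed` and `scheme_for` are hypothetical helper names, and the mapping assumes the `clickhouse+http` and `clickhouse+native` URI schemes that clickhouse-sqlalchemy uses for the two protocols.

```python
from importlib.util import find_spec

# URI schemes for the two protocols (assumed from clickhouse-sqlalchemy).
SCHEMES = {"http": "clickhouse+http", "native": "clickhouse+native"}


def driver_installed(module_name="clickhouse_sqlalchemy"):
    """Return True if the given top-level module can be imported here."""
    return find_spec(module_name) is not None


def scheme_for(protocol):
    """Map a protocol name ('http' or 'native') to its URI scheme."""
    try:
        return SCHEMES[protocol]
    except KeyError:
        raise ValueError(f"unknown protocol: {protocol!r}") from None


if __name__ == "__main__":
    print("clickhouse-sqlalchemy installed:", driver_installed())
    print("native scheme:", scheme_for("native"))
```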


Let's connect to one of the public ClickHouse instances:

  • either Demo Yandex CH

docker run -it --rm yandex/clickhouse-client:latest \
    --host gh-api.clickhouse.tech --user explorer -s

  • or Demo Altinity.Cloud CH

docker run -it --rm yandex/clickhouse-client:latest \
    --host github.demo.trial.altinity.cloud -s --user demo --password demo

  1. download source code from repo https://github.com/apache/superset

  2. execute the commands

cd superset-master

docker-compose up

# open the new terminal

docker-compose exec superset bash /app/docker/docker-init.sh
docker-compose exec superset pip install clickhouse-sqlalchemy
docker-compose restart
  3. wait for the containers to start and the web app to be built (see the console output; webpack should finish its work)

  4. browse URL http://localhost:8088 (use credentials admin / admin)

  5. add the database using one of the connection strings:

# connection string for Demo Yandex ClickHouse
clickhouse+native://explorer@gh-api.clickhouse.tech/default?secure=true

# connection string for Demo Altinity.Cloud CH
clickhouse+native://demo:demo@github.demo.trial.altinity.cloud/default?secure=true
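If you need the same kind of connection string for your own server, it can be assembled programmatically. The helper below is a sketch only: `clickhouse_native_uri` is a hypothetical name, and it simply reproduces the shape of the two demo strings above (user, optional password, host, database, `secure=true`).

```python
from urllib.parse import quote


def clickhouse_native_uri(host, user, password=None, database="default",
                          secure=True):
    """Build a clickhouse+native:// URI shaped like the demo strings.

    User and password are percent-encoded so that special characters
    do not break the URI.
    """
    auth = quote(user, safe="")
    if password:
        auth += ":" + quote(password, safe="")
    uri = f"clickhouse+native://{auth}@{host}/{database}"
    if secure:
        uri += "?secure=true"
    return uri


if __name__ == "__main__":
    print(clickhouse_native_uri("gh-api.clickhouse.tech", "explorer"))
    print(clickhouse_native_uri("github.demo.trial.altinity.cloud",
                                "demo", "demo"))
```

The resulting string is what goes into the SQLAlchemy URI field when adding the database in Superset.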

See also https://stackoverflow.com/a/66006784/303298.

vladimir
  • In the connection string, it is possible to pass alt_hosts as query params, like this: clickhouse+native://username:password@host:port/default?alt_hosts=host1:port1,host2:port2. This works fine for clickhouse://username:password@host:port/default, but I am getting an error while using clickhouse+native in Superset version 1.0.1 and am unable to pass params. – ANIL PATEL Feb 17 '21 at 10:04
  • @ANILPATEL you can't do that: [clickhouse-sqlalchemy](https://github.com/xzkostyan/clickhouse-sqlalchemy) doesn't support this feature. You can ask for it to be implemented via a [feature request](https://github.com/xzkostyan/clickhouse-sqlalchemy/issues) or use a [CH load balancer](https://clickhouse.tech/docs/en/interfaces/third-party/proxy/#proxy-servers-from-third-party-developers). – vladimir Feb 18 '21 at 06:17