2

I want to execute multiple cql statements (2 or much more) using python driver.

I try such simple code but it lead to error if query contains more then one statement. I do not want to split statements or format it (to single statements). I just want to execute whole CQL.

How to do it with python driver for Cassandra - is it possible?

I use ; as statement split.

from cassandra.cluster import Cluster


def main():
    cluster = Cluster(contact_points=['cassandra-1.', 'cassandra-2.', 'cassandra-3.', 'cassandra-4.'])
    session = cluster.connect()
    session.execute('drop keyspace if exists test')
    session.execute('''
        create keyspace test
        with durable_writes = true
        and replication = {
            'class' : 'SimpleStrategy',
            'replication_factor' : 3
        };    
    ''')
    session.set_keyspace('test')

    # two statements or more and there is error
    # how to execute all in one call?
    query = '''\
    create table x1 (
        name text,
        
        primary key ((name))
    );

    create table x2 (
        name text,
        
        primary key ((name))
    );
    '''

    result_set = session.execute(query)
    print(result_set)


if __name__ == '__main__':
    main()

It generates such error:

Traceback (most recent call last):
  File "C:\Users\Cezary Wagner\PycharmProjects\medptr-v2\sandbox\cassandra_scheme\04_execute_multiple_statements.py", line 39, in <module>
    main()
  File "C:\Users\Cezary Wagner\PycharmProjects\medptr-v2\sandbox\cassandra_scheme\04_execute_multiple_statements.py", line 34, in main
    result_set = session.execute(query)
  File "cassandra\cluster.py", line 2618, in cassandra.cluster.Session.execute
  File "cassandra\cluster.py", line 4894, in cassandra.cluster.ResponseFuture.result
cassandra.protocol.SyntaxException: <Error from server: code=2000 [Syntax error in CQL query] message="line 7:4 mismatched input 'create' expecting EOF (... ((name))    );    [create]...)">
Chameleon
  • 9,722
  • 16
  • 65
  • 127

1 Answers1

1

Each statement should be separate. Just split text on ; using your Python code, and execute them separately. Also, please take into account that programmatic schema modifications like yours are potentially dangerous - you may get so-called schema disagreement that will require to fix it by rolling restart of the cluster.

Alex Ott
  • 80,552
  • 8
  • 87
  • 132
  • I think about this solution with split. Why programmatic schema modification can lead to errors (is other way to do it if not by queries)? I assume that I will stop traffic for such changes for some minutes to stabilize schema. Is it not enough? – Chameleon May 17 '21 at 18:02
  • Thank you for "schema disagreement" I will study what is it. – Chameleon May 17 '21 at 18:36
  • 1
    By default, driver uses round robin load balancing policy. This means that different queries may land on different nodes, and there could be a chance that the second query may land on the node that isn’t received the notification about first query, and it will generate a new schema version that will conflict with version from another node. cqlsh solves that problem by using whitelist policy - it will send all queries to the same node. You also need to wait for schema agreement. – Alex Ott May 17 '21 at 18:36
  • 1
    Here is code example in Java: https://github.com/alexott/cassandra-dse-playground/blob/master/driver-1.x/src/main/java/com/datastax/alexott/demos/WhiteListPolicyExample.java#L37 – Alex Ott May 17 '21 at 18:37
  • That is greate example. So I should disable round robin for change of schema (if possible) and confirm change before and production traffic. Thank you for this good example. I often do not write in Java but code is very clear for me. I will check what Python driver do maybe the same and some policy should be modified to make clear schema change without nodes fixes. – Chameleon May 17 '21 at 18:55
  • Yes, it should be similar for Python – Alex Ott May 17 '21 at 19:08
  • I found this for Python. It looks that driver is simpler and they do it little differently ... consistency_level=ConsistencyLevel.ALL. It is not same but can do same because it force all nodes to be updated - no experience (maybe not work with DC). https://github.com/datastax/python-driver/blob/master/tests/integration/long/test_schema.py – Chameleon May 17 '21 at 19:50
  • They check rs.response_future.is_schema_agreed == True. – Chameleon May 17 '21 at 19:53
  • 1
    Consistency levels don’t work for schema modifications - it’s a different mechanism. Main thing - check for schema agreement – Alex Ott May 17 '21 at 19:59