I have various CSV files incoming several times per day, storing timeseries data from sensors, which are parts of sensors stations. Each CSV is named after the sensor station and sensor id from which it is coming from, for instance "station1_sensor2.csv". At the moment, data is stored like this :
> cat station1_sensor2.csv
2016-05-04 03:02:01.001000+0000;0;
2016-05-04 03:02:01.002000+0000;0.1234;
2016-05-04 03:02:01.003000+0000;0.2345;
I have created a Cassandra table to store them and to be able to query them for various identified tasks. The Cassandra table looks like this :
cqlsh > CREATE KEYSPACE data with replication = {'class' : 'SimpleStrategy', 'replication_factor' : 3};
CREATE TABLE sensor_data (
station_id text, // id of the station
sensor_id text, // id of the sensor
tps timestamp, // timestamp of the measure
val float, // measured value
PRIMARY KEY ((station_id, sensor_id), tps)
);
I would like to use Apache Nifi to automatically store the data from the CSV into this Cassandra Table, but I can't find example or scheme to do it right. I have tried to use the "PutCassandraQL" processor, but I am struggling without any clear example. So, any help on how to execute a Cassandra put query with Apache Nifi to insert the data into the table would be appreciated !