
I have a system that stores and displays the MQTT messages coming from IoT devices to a server. Data arrives every 5 seconds, and a single message carries multiple data types. Because of that, I decided to create a new table per device, based on the IoT device's serial number, when its first message arrives. My problem is that I receive around a million rows per device per day, and I have hundreds of devices. So what is the best way to store this data and build graphs when it is queried?

To give you a clearer picture, here is my table structure:

CREATE TABLE IF NOT EXISTS `mqttpacket_<device_serial_number>` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `data_type_id` INT(11) DEFAULT NULL,
    `data_value` INT(11) DEFAULT NULL,
    `inserted_time` TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
    PRIMARY KEY (`id`),
    FOREIGN KEY (`data_type_id`) REFERENCES `datatypes` (`id`)
) COLLATE='latin1_swedish_ci' ENGINE=InnoDB;

Here is my stored procedure:

CREATE DEFINER=`root`@`localhost` PROCEDURE `SP_NewTest`(
    IN `incoming_data` TEXT,
    IN `value_array` TEXT,
    IN `device_serial_number` VARCHAR(50),
    IN `data_timestamp` VARCHAR(50),
    OUT `result_id` INT,
    OUT `result_message` VARCHAR(500)
)
BEGIN

DECLARE value_iteration VARCHAR(50);
DECLARE lcl_data_type_id INT;
DECLARE arr_data_type_name VARCHAR(50);
DECLARE affected_data_row_count INT;
DECLARE lcl_insert_mqtt_query VARCHAR(500);
DECLARE create_table_query VARCHAR(500);

DECLARE EXIT HANDLER FOR SQLSTATE '40001'
BEGIN
    GET DIAGNOSTICS CONDITION 1
    @p1 = MYSQL_ERRNO, @p2 = MESSAGE_TEXT;
    SET result_id = -2;
    SET result_message = CONCAT('RETURNED ERROR NO : ', @p1 , '\r\nMESSAGE TEXT : ',@p2);
ROLLBACK;
END;

DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN
    GET DIAGNOSTICS CONDITION 1
    @p1 = MYSQL_ERRNO, @p2 = MESSAGE_TEXT;
    SET result_id = -999;
    SET result_message = CONCAT('RETURNED ERROR NO : ', @p1 , '\r\nMESSAGE TEXT : ',@p2);
ROLLBACK;
END;

    START TRANSACTION;

    SET affected_data_row_count = 0;
    SET result_id = 0;
    SET result_message = 'Success';
    

    IF(SELECT count(id) FROM devices WHERE devices.serial_number = device_serial_number) > 0 THEN

        SET create_table_query = CONCAT('CREATE TABLE IF NOT EXISTS `mqttpacket_',device_serial_number,'`(`id` int(11) NOT NULL AUTO_INCREMENT, `data_type_id` int(11) DEFAULT NULL, `data_value` int(11) DEFAULT NULL,`inserted_time` TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6), PRIMARY KEY(`id`), FOREIGN KEY(data_type_id) REFERENCES datatypes(id)) COLLATE=\'latin1_swedish_ci\' ENGINE = InnoDB;');
        PREPARE stmt FROM create_table_query;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
    
        WHILE (LOCATE(',', value_array) > 0)
        DO
    
            SET arr_data_type_name = SUBSTRING_INDEX(value_array,',',1);
            SET value_array = SUBSTRING(value_array, LOCATE(',',value_array) + 1);
    
            SELECT JSON_EXTRACT(incoming_data, arr_data_type_name) INTO value_iteration;
            
            SET arr_data_type_name := SUBSTRING_INDEX(arr_data_type_name, ".", -1);
            
                IF (SELECT COUNT(id) FROM datatypes WHERE datatypes.data_name = arr_data_type_name) > 0 THEN 
                    SELECT id INTO lcl_data_type_id FROM datatypes WHERE datatypes.data_name = arr_data_type_name LIMIT 1;
                ELSE
                    INSERT INTO datatypes (datatypes.data_name,datatypes.description ,datatypes.inserted_time) VALUES(arr_data_type_name,arr_data_type_name,NOW());
                    SELECT id INTO lcl_data_type_id FROM datatypes WHERE datatypes.data_name = arr_data_type_name LIMIT 1;
                END IF;
                
                SET lcl_insert_mqtt_query = CONCAT('INSERT INTO mqttpacket_',device_serial_number,'(data_type_id,data_value,inserted_time) VALUES(',lcl_data_type_id,',',value_iteration,',\'',data_timestamp,'\');');
                PREPARE stmt FROM lcl_insert_mqtt_query;
                EXECUTE stmt;
                
        
                SET affected_data_row_count = affected_data_row_count +1;
                
        END WHILE;
        SET result_id = 0;
    SET result_message = CONCAT('Query performed successfully. Affected rows: ',CAST(affected_data_row_count AS CHAR));

    ELSE

        SET result_id = -1;
        SET result_message = 'Customer Serial Number not found.';
    END IF;
        

COMMIT;
END

I basically take the device's serial number; if its table doesn't exist in the database, I create it, then parse the JSON inside the stored procedure and insert the values into that table.
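For example, a call looks like this (the JSON payload, path list, serial number, and timestamp below are illustrative values, not real data). Note that the `WHILE` loop consumes one path per comma, so the path list needs a trailing comma for the last entry to be processed:

```sql
-- Illustrative invocation of SP_NewTest; all literal values are made up.
CALL SP_NewTest(
    '{"data": {"temperature": 23, "humidity": 61}}',  -- incoming_data (JSON)
    '$.data.temperature,$.data.humidity,',            -- value_array: trailing comma required
    'SN-000123',                                      -- device_serial_number
    '2021-03-01 12:00:00',                            -- data_timestamp
    @result_id,
    @result_message
);
SELECT @result_id, @result_message;
```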

CREATE TABLE `devices` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `serial_number` VARCHAR(50) NOT NULL COLLATE 'latin1_swedish_ci',
    `device_type_id` INT(11) NULL DEFAULT NULL,
    `customer_id` INT(11) NULL DEFAULT NULL,
    `inserted_time` TIMESTAMP(6) NOT NULL DEFAULT current_timestamp(6),
    PRIMARY KEY (`id`) USING BTREE,
    INDEX `device_type_id` (`device_type_id`) USING BTREE,
    INDEX `customer_id` (`customer_id`) USING BTREE,
    CONSTRAINT `devices_ibfk_1` FOREIGN KEY (`device_type_id`) REFERENCES `templaritest`.`devicetypes` (`id`) ON UPDATE RESTRICT ON DELETE RESTRICT,
    CONSTRAINT `devices_ibfk_2` FOREIGN KEY (`customer_id`) REFERENCES `templaritest`.`customers` (`id`) ON UPDATE RESTRICT ON DELETE RESTRICT
);
CREATE TABLE `devicetypes` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `device_type_name` VARCHAR(50) NOT NULL COLLATE 'utf8mb4_general_ci',
    PRIMARY KEY (`id`) USING BTREE
);
CREATE TABLE `datatypes` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `data_name` VARCHAR(50) NOT NULL COLLATE 'latin1_general_ci',
    `inserted_time` TIMESTAMP(6) NOT NULL DEFAULT NOW(),
    `description` VARCHAR(50) NULL DEFAULT NULL COLLATE 'latin1_general_ci',
    `device_type_id` INT(11) NULL DEFAULT NULL,
    PRIMARY KEY (`id`) USING BTREE,
    INDEX `device_type_id` (`device_type_id`) USING BTREE,
    CONSTRAINT `datatypes_ibfk_1` FOREIGN KEY (`device_type_id`) REFERENCES `test_database`.`devicetypes` (`id`) ON UPDATE RESTRICT ON DELETE RESTRICT
);
CREATE TABLE `customers` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `customer_name` VARCHAR(50) NOT NULL COLLATE 'latin1_swedish_ci',
    `inserted_time` TIMESTAMP(6) NOT NULL DEFAULT current_timestamp(6),
    PRIMARY KEY (`id`) USING BTREE
);

I would be grateful if you could suggest how I can optimize the database for storage size and query efficiency.

Taylan Yuksel
  • We cannot tell you what's the best way to design your database for your particular needs as it depends on a lot of factors. What the answers to the duplicate question describe is the different ways to design an extensible database structure within mysql. It talks about products and not devices, but the design concepts are the same. The other option you should think about is using a NoSQL database to store the data as opposed to MySQL. – Shadow Mar 01 '21 at 12:02
  • With respect, **do not create a new table for each device.** If you do you **will be sorry**. Put a `device_id` column in your messages table that's a reference to `devices.id` and store all messages in one table. Why? What happens when you have ten thousand devices and you want to search them all for a particular type of message? – O. Jones Mar 01 '21 at 12:03
  • @Shadow, you're right we can't tell people how to design their databases. But we can make this true statement: a design with thousands of identically-laid-out tables is not a good design. – O. Jones Mar 01 '21 at 12:05
  • @O.Jones Thanks for your kind suggestion. I first designed it as you said and had a single table for all the data. But the thing is, it will hold more than 15 million rows per day, so when I want to draw a chart, querying that single table will take a huge amount of time. **Your point is correct about querying a particular message type.** I am trying to find the best solution, though; that's why I'm trying every approach. Thanks. – Taylan Yuksel Mar 01 '21 at 12:09
  • @O.Jones I would leave this to the OP to evaluate the 5 design options laid out by the excellent answer by Bill Karwin and choose the best one for this particular project! We do not have enough information to make objective suggestions. A lot depends on knowing the data coming into the database. An sql-based solution may not even be the best one. – Shadow Mar 01 '21 at 12:11
  • @Shadow I respect your answer but here I have a problem with designing and kindly asking for some suggestions which shouldn't be a big problem. Thanks for your suggestion about using a NoSQL. I will consider it. – Taylan Yuksel Mar 01 '21 at 12:11
  • @Shadow Thanks again. I will check the mentioned question. – Taylan Yuksel Mar 01 '21 at 12:12
  • 15 million messages per day is large, but not overly so. It's well within the capabilities of MySQL. You will need to index your large message table to support the queries you need, or your queries will indeed be slow. Read Markus Winand's excellent e-book: https://use-the-index-luke.com/ – O. Jones Mar 01 '21 at 12:12
  • @TaylanYuksel this is why I closed the question as a duplicate as you need suggestions on designing an extensible data model within an sql database. Bill Karwin's answer lists the available design options out for you. As you can see, Bill kept his answer updated over the years. You need to consider the options (suggestions) and choose the one most appropriate for your project. Your current design is a special case of EAV pattern and Bill's answer deals with it quite extensively. – Shadow Mar 01 '21 at 12:17
  • 17K rows per day and a couple of dozen columns == nice design. – Rick James Mar 01 '21 at 21:28
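To make the single-table suggestion from the comments concrete, a sketch of what it could look like follows. Table, column, and key names here are assumptions, not the asker's actual schema; the composite primary key clusters rows by device and time, which matches the per-device time-range scans a graphing query performs:

```sql
-- Sketch of one shared messages table replacing the per-device tables.
-- All names and index choices are illustrative assumptions.
CREATE TABLE IF NOT EXISTS `mqttpacket` (
    `device_id`     INT(11) NOT NULL,
    `data_type_id`  INT(11) NOT NULL,
    `data_value`    INT(11) DEFAULT NULL,
    `inserted_time` TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
    -- Clustered by (device, time): a per-device time-range query becomes
    -- an index range scan instead of a full-table scan.
    PRIMARY KEY (`device_id`, `inserted_time`, `data_type_id`),
    FOREIGN KEY (`device_id`) REFERENCES `devices` (`id`),
    FOREIGN KEY (`data_type_id`) REFERENCES `datatypes` (`id`)
) ENGINE=InnoDB;

-- A typical graph query under this layout (device 42 and type 7 are
-- placeholder ids):
SELECT `inserted_time`, `data_value`
FROM `mqttpacket`
WHERE `device_id` = 42
  AND `data_type_id` = 7
  AND `inserted_time` >= NOW() - INTERVAL 1 DAY
ORDER BY `inserted_time`;
```

If most charts filter on a single data type, a secondary index on (`device_id`, `data_type_id`, `inserted_time`) would serve that shape directly; at 15 million rows per day, time-based partitioning is also worth evaluating for pruning and retention.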

0 Answers