18

Background

Im working on some kind of data logger.

I want to try how much storage space I need for 1000000 (1M) rows and how Raspberry Pi deals with such big table. I want to run some queries with grouping, calculating averages and other performance experiments.

My table looks like this:

CREATE TABLE `data` 
(
  `id`         bigint(20) NOT NULL      AUTO_INCREMENT,
  `datetime`   timestamp  NULL          DEFAULT CURRENT_TIMESTAMP,
  `channel`    int(11)                  DEFAULT NULL,
  `value`      float                    DEFAULT NULL,

  PRIMARY KEY (`id`)
)

Question

How can I fill it with 1000000 million rows in MySQL?

Requirements:

  • data.datetime field: random timestamps but only from one year
  • data.value field: random float numbers from given range (0.00-100.00 for example)
  • data.id is autoincrement, no need to care about that
  • data.channel is always 1, no need to care about that too

I know SQL a bit, but I'm not good in PL/SQL, loops in SQL etc.

EDIT:

To make it clear - im using MySQL 5.5.

Mentioned PL/SQL was my mistake, I thought PL/ stands for procedural features in SQL in general, not just Oracle.

Kamil
  • 13,363
  • 24
  • 88
  • 183
  • 6
    Why solve this problem with only sql? Use a simple scripting language like Perl or Python and insert data as you wish. – jmmeier Aug 02 '14 at 19:53
  • Do you need PL/SQL or MySQL? – Ashalynd Aug 02 '14 at 19:55
  • 1
    http://stackoverflow.com/questions/20775769/sql-update-random-between-two-dates – Ashalynd Aug 02 '14 at 19:57
  • 2
    PL/SQL is for Oracle, but your tag says MySQL. They're nowhere near the same thing, and they are in no way compatible. What specifically are you asking about? – Ken White Aug 02 '14 at 20:03
  • @jmmeier I thought it can be done in SQL with small effort. Im not good in perl or python. However im not bad in Excel... Maybe I will do that. Im not deleting question, looks like 2 people are interested. – Kamil Aug 02 '14 at 20:19
  • @RyanVincent I don't understand what is purpose of that comment... Im building kind of slow data logger. I don't need to report interesting stuff as soon as possible. I want to compare weeks, months, years etc. I want to see how fast Raspberry can calculate week, month, year averages from 1, 3, 5 or 10 minute samples and adjust my measurement frequency to Raspberry performance. – Kamil Aug 02 '14 at 20:32

2 Answers2

38

Try it with a stored procedure (replace 1000 with desired amount of rows, and 2014 with test year, also see generate random timestamps in mysql)

CREATE TABLE `data` 
(
  `id`         bigint(20) NOT NULL      AUTO_INCREMENT,
  `datetime`   timestamp  NULL          DEFAULT CURRENT_TIMESTAMP,
  `channel`    int(11)                  DEFAULT NULL,
  `value`      float                    DEFAULT NULL,

  PRIMARY KEY (`id`)
);


DELIMITER $$
CREATE PROCEDURE generate_data()
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < 1000 DO
    INSERT INTO `data` (`datetime`,`value`,`channel`) VALUES (
      FROM_UNIXTIME(UNIX_TIMESTAMP('2014-01-01 01:00:00')+FLOOR(RAND()*31536000)),
      ROUND(RAND()*100,2),
      1
    );
    SET i = i + 1;
  END WHILE;
END$$
DELIMITER ;

CALL generate_data();

Modify to your needs. To delete the procedure:

DROP PROCEDURE generate_data;

Maybe this can give you a start!

Jonny 5
  • 12,171
  • 2
  • 25
  • 42
  • What is purpose of delimiter? – Kamil Aug 02 '14 at 21:03
  • 2
    @Kamil Read more about [why changing the default delimiter](http://alextsilverstein.com/programming-and-development/mysql/mysql-the-reason-for-using-the-delimiter-statement-in-mysql-routines-stored-procedures-functions-triggers/) or here a tutorial [for getting started](http://www.mysqltutorial.org/getting-started-with-mysql-stored-procedures.aspx). – Jonny 5 Aug 02 '14 at 21:18
5

We have MySQL Random Data Generator - easy to use procedure for generating the random data internally from mysql itself.

mysql_user
  • 382
  • 3
  • 11