0

The question was asked earlier but it appears from the discussion that the question had insufficient input to determine output. I have a similar problem. I will try to come up with some spec/logic.

I have a table with timestamp data that I have converted to unix_timestamp.

id p_value ceil(unix_timestamp(updated_at))
3 300 1653549602
7 300 1653549902
11 300 1653550202
15 300 1653550502
19 300 1653550802
23 1200 1653551102
27 1300 1653551402
31 1300 1653551402
35 1300 1653551702
39 1300 1653551702

These are 10 rows with roughly equidistant times. And suppose I want N roughly equidistant rows. So I follow these steps for N = 3,

  1. divide the set by N - 1 i.e. (max - min)/(N - 1). I get 2100/2 = 1050
  2. pick first row (with timestamp 1653549602) save as last
  3. then pick (the first row with updated_at > (last + 1050)) i.e. with timestamp 1653550802 and save as last.
  4. repeat step 3 until it crosses max; use max as last sample. i.e. with timestamp 1653551702.

I have this rough algorithm but how to write this in SQL.

Sample output:

id p_value ceil(unix_timestamp(updated_at))
3 300 1653549602
19 300 1653550802
39 1300 1653551702
lemon
  • 14,875
  • 6
  • 18
  • 38
Pankaj Jangid
  • 524
  • 3
  • 18
  • what do you mean by "N (= 3) roughly equidistant". In that mean you need a set of records which is equidistant to that Id = 3 record or what? I may be little confused with your requirement. How did you identify the "equidistant" data set among your Data set (what is the logic)? – isatsara Jun 05 '22 at 07:21
  • Sorry about unclear explanation. I have updated the sentence. I want N rows. – Pankaj Jangid Jun 05 '22 at 07:32
  • "then pick (row with updated_at ≈ min + 1050) as last" <-- is this "as last" or "as second" ? can you show a sample of your output in a sample table or in a image – isatsara Jun 05 '22 at 08:14
  • `last` is a variable saving last output as-in loop variable in procedural programming. The output of the example should pick row with id 3, 19, 39. – Pankaj Jangid Jun 05 '22 at 08:42
  • What is your version of MySql? – forpas Jun 05 '22 at 10:57
  • @forpas, I am using mariadb 10.6 – Pankaj Jangid Jun 05 '22 at 11:23

1 Answers1

1

I just given a try. Check this can help you. Just try the function that i given.

  1. Your '1653549602' is not the last. It is the first record that saved to table.

1653549602 = 2022-05-26 07:20:02 <-- first record 7:20

and 1653551702 = 2022-05-26 07:55:02. <-- last record at 7:55

  1. Also i feel there is a logic issue in your described scenario while selecting the last record. Because 1653550802 + 1050 mean real time is --> "2022-05-26 07:57:32". So you cannot select "1653551702" as the record through this condition updated_at > (last + 1050)). 1653551702 = "2022-05-26 07:55:02". So your condition not valid with it.

1653550802 + 1050 = 1653551852 which is "2022-05-26 07:57:32" So this condition is not working [ "2022-05-26 07:55:02" > "2022-05-26 07:57:32" ]

[Start from here]

Anyway i did a procedure for you. It give you a some idea to your requirement and also it will help you to go forward.

I used the same table structure as

create table `equidistants` (
    `pid` int (11),
    `id` int (11),
    `p_value` int (11),
    `unix_time` bigint (20)
); 

pid is a column that i created as PK for me

Table name i used : equidistants

  1. Created Below function

      DROP PROCEDURE IF EXISTS my_proc_equidistant;
    
      DELIMITER $$
    
      CREATE PROCEDURE my_proc_equidistant(IN n_value INT)
      BEGIN
    
    
         DECLARE i_val INT; -- Variable for (max - min)/(N - 1)
         DECLARE i_loop INT DEFAULT 0;
         DECLARE i_Selected_unixTime INT;
    
         SET n_value = n_value -1;
    
         -- Handle the devided by 0 error
         IF n_value = 0 THEN 
             SET n_value = 1 ;
         END IF;
    
         -- (max - min)/(N - 1) calculate here
         SELECT  (MAX(unix_time) - MIN(unix_time))/(n_value) 
         INTO i_val FROM `equidistants` ; 
    
         -- Get the first updated value. Not the last one  
         SELECT unix_time INTO i_Selected_unixTime 
         FROM `equidistants`  ORDER BY unix_time ASC LIMIT 1; 
    
         -- Temporary table to keep your Data
         DROP TABLE IF EXISTS temp_equidistants; 
    
         -- Inser the latest record from the data set.
         CREATE TEMPORARY TABLE temp_equidistants  
         SELECT * FROM equidistants  ORDER BY unix_time ASC LIMIT 1;
    
          -- Start the loop based on the given N value
          WHILE i_loop < n_value DO
    
             -- Insert the next selected record into the temp table base on the [last selected unix time + i_val]   
             INSERT INTO temp_equidistants 
             SELECT  *  FROM equidistants WHERE unix_time > i_Selected_unixTime + i_val  ORDER BY  unix_time ASC LIMIT 1;
    
             -- identify the next unix time 
             SELECT  unix_time INTO i_Selected_unixTime FROM equidistants WHERE unix_time > i_Selected_unixTime + i_val  ORDER BY  unix_time ASC LIMIT 1;
    
             SET i_loop=i_loop+1;
           END WHILE;
    
           -- Execute the result you need
           SELECT * FROM temp_equidistants; 
    
           -- Drop the Temp table
           DROP TABLE IF EXISTS temp_equidistants; 
    
      END$$
    
      DELIMITER ;
    

Hope you can do something with this function by modifying some areas.

  1. Result that i got enter image description here

Note: 3rd record missing due to the condition miss match that i explain at the top

Here i used "ASC" for ther order by clause. You can change it to descending and you can run it other way-around.

isatsara
  • 2,065
  • 1
  • 11
  • 11
  • I had mentioned in the 4th point that if the max value exceed the last row then select last row. So I added this check before the final select statement in your procedure, IF (SELECT COUNT(*) FROM temp_equidistants) <= n_value THEN INSERT INTO temp_equidistants SELECT * FROM `equidistants` ORDER BY unix_time DESC LIMIT 1; END IF; – Pankaj Jangid Jun 06 '22 at 04:39