1

I'm using a mysql database to store huge amount of satellite data, and these datasets has many data-gaps. I would like to replace the NULL values with an 1 hour(or less) average around that point. So far I've found how to replace the NULL value with the previous known value:

UPDATE mytable
SET number = (@n := COALESCE(number, @n))
ORDER BY date;

from this post: SQL QUERY replace NULL value in a row with a value from the previous known value

My table looks like

+---------------------+--------+
| date                | P_f    |
+---------------------+--------+
| 2001-01-01 20:20:00 |   1.88 | 
| 2001-01-01 20:25:00 |   NULL | 
| 2001-01-01 20:30:00 |   NULL | 
| 2001-01-01 20:35:00 |   1.71 | 
| 2001-01-01 20:40:00 |   NULL | 
| 2001-01-01 20:45:00 |   NULL | 
| 2001-01-01 20:50:00 |   NULL | 
| 2001-01-01 20:55:00 |  1.835 | 
| 2001-01-01 21:00:00 |  1.918 | 
| 2001-01-01 21:05:00 |  1.968 | 
| 2001-01-01 21:10:00 |  2.004 | 
| 2001-01-01 21:15:00 |  1.924 | 
| 2001-01-01 21:20:00 | 1.8625 | 
| 2001-01-01 21:25:00 |   1.94 | 
| 2001-01-01 21:30:00 | 2.0375 | 
| 2001-01-01 21:35:00 |  1.912 | 

I'd like to replace the NULL values with average values around that datetime. For instance I'd like to replace ,

| 2001-01-01 20:50:00 |   NULL |

with an average around

select AVG(P_f) from table where date between '2001-01-01 20:30' and '2001-01-01 21:10';

Paul

Community
  • 1
  • 1
Paulten
  • 59
  • 1
  • 8

1 Answers1

0

Not the most elegant I admit but it should get you what you want.

I'm not sure how you want to handle those NULL values that have an hour average of NULL. In the example below these will get updated to -1.

create table myTable
(myDate datetime not null,
P_f decimal(10,5) default null
);

insert into myTable(myDate,P_f) values ('2001-01-01 20:20:00',1.88);
insert into myTable(myDate,P_f) values ('2001-01-01 20:25:00',NULL);
insert into myTable(myDate,P_f) values ('2001-01-01 20:30:00',NULL);
insert into myTable(myDate,P_f) values ('2001-01-01 20:35:00',1.71);
insert into myTable(myDate,P_f) values ('2001-01-01 20:40:00',NULL);
insert into myTable(myDate,P_f) values ('2001-01-01 20:45:00',NULL);
insert into myTable(myDate,P_f) values ('2001-01-01 20:50:00',NULL);
insert into myTable(myDate,P_f) values ('2001-01-01 20:55:00',1.835);
insert into myTable(myDate,P_f) values ('2001-01-01 21:00:00',1.918);
insert into myTable(myDate,P_f) values ('2001-01-01 21:05:00',1.968);
insert into myTable(myDate,P_f) values ('2001-01-01 21:10:00',2.004);
insert into myTable(myDate,P_f) values ('2001-01-01 21:15:00',1.924);
insert into myTable(myDate,P_f) values ('2001-01-01 21:20:00',1.8625);
insert into myTable(myDate,P_f) values ('2001-01-01 21:25:00',1.94);
insert into myTable(myDate,P_f) values ('2001-01-01 21:30:00',2.0375);
insert into myTable(myDate,P_f) values ('2001-01-01 21:35:00',1.912);
insert into myTable(myDate,P_f) values ('2001-01-02 20:40:00',NULL);

-- Insert copy of null value P_f rows into myTable with 1 hour average about myDate 
insert into myTable
(myDate,P_f)
select t.myDate,ifnull((select avg(P_f) from myTable t1 where t1.myDate between t.myDate - interval 1 hour and t.myDate +interval 1 hour),-1) as hourAvg
from myTable t
where t.P_f is null;

-- delete rows where P_f is null
delete from myTable
where P_f is null;
Tom Mac
  • 9,693
  • 3
  • 25
  • 35