If there are multiple records on a specific date I want to remove all but latest record on that day. For example in below table records with id 9, 10, 12 have same date. So 9 and 10 should be removed as record with id 12 is having latest date.
id date
1 2012-04-25 00:00:00.000000
2 2012-04-26 00:00:00.000000
3 2012-04-23 00:00:00.000000
4 2012-04-24 00:00:00.000000
5 2012-05-01 00:00:00.000000
6 2012-05-02 00:00:00.000000
7 2012-05-03 00:00:00.000000
8 2012-05-04 00:00:00.000000
9 2012-04-30 00:30:00.000000
10 2012-04-30 18:00:00.000000
11 2012-04-29 00:00:00.000000
12 2012-04-30 18:40:00.000000
13 2012-05-05 00:00:00.000000
14 2012-05-05 09:31:31.000000
Here is (dirty)rake task to remove duplicates
task :remove_duplicate do
Rake::Task["remove_duplicate"].invoke
end
task :remove_duplicate => :environment do
weights = Weight.count(:group => "DATE(date)", :having => "COUNT(id) > 1")
weights_to_delete = []
weights.each do |weight|
start_date = weight[0].to_date.beginning_of_day
end_date = weight[0].to_date.end_of_day
day_weights = Weight.where("date >= ? and date <= ?", start_date, end_date).order(:date)
day_weights[0..-2].each do |weight|
weights_to_delete.push weight.id
end
end
Weight.delete(weights_to_delete)
end
Though I am able to remove records as I explained, I am not satisfied with the approach I take. Please guide me to remove duplicate records on specific date keeping latest one only utilizing ActiveRecord API better way.
Thanks, Amit Patel