0

I have a table with example data as shown below.

word       | last_seen  | first_seen | count
-----------|------------|------------|------
definition | 2014-09-08 | 2012-01-02 | 15
definition | 2014-10-11 | 2013-05-12 | 35
attribute  | 2013-07-23 | 2010-06-29 | 22

I'm wanting to to an in-place aggregation of the data, hopefully just using SQL, where the data for repeated words is such that I end up with MAX(last_seen), MIN(first_seen), and SUM(count).

word       | last_seen  | first_seen | count
-----------|------------|------------|------
definition | 2014-10-11 | 2012-01-02 | 50
attribute  | 2013-07-23 | 2010-06-29 | 22

I know I can see the results of the aggregation with the following:

SELECT 
  word, 
  MAX(last_seen) AS last_seen, 
  MIN(first_seen) AS first_seen, 
  SUM(count) AS count 
FROM 
  words 
GROUP BY word;

However, I don't just want to see the resulting aggregation... I want to actually update the words table, replacing the rows that have duplicate word column entries with the aggregated data.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Bryan
  • 2,205
  • 1
  • 23
  • 43
  • 1
    I dont see why your code isnt giving you what you want. can you elaborate on `I just don't know how to update the words table in-place with the results` – Matt Dec 10 '14 at 14:37
  • You have at least two statements, one to update, the other to delete the now irrelevant record. – crthompson Dec 10 '14 at 14:43
  • Are you sure you want to update the existing table, or do you just want a view with the aggregated data? What if it changes again? – sgeddes Dec 10 '14 at 14:44
  • You can do a combined select, update and delete in one query using the with clause: http://www.postgresql.org/docs/9.3/static/sql-select.html – Wolph Dec 10 '14 at 15:04

2 Answers2

1

As far as I'm aware there is no "Edit in place" in Postgresql (or any other traditional RDBMS that I can think of). Instead:

  1. Take the results of your query and dump them into a temp table: CREATE TEMP TABLE <temptable> AS <Your Query> WITH DATA
  2. Delete out everything in your word table: TRUNCATE word; <--This is the scary part so make sure you are cool with your query before truncating.
  3. Insert the records in your temp table into the now empty word table: INSERT INTO word SELECT * FROM <temptable>;
  4. Optionally: Drop your temp table DROP TABLE <temptable>; (being a temp table it will drop automagically when you end your session, but I'm a fan of being explicit)
JNevill
  • 46,980
  • 4
  • 38
  • 63
0

Actually you can do that in a single statement using data-modifying CTEs:

WITH del AS (
   DELETE FROM words w
   WHERE EXISTS (
      SELECT 1 
      FROM   words w1
      WHERE  w1.word = w.word
      AND    w1.ctid <> w.ctid
     )
   RETURNING *
   )
INSERT INTO words(word, last_seen, first_seen, count)
SELECT word, MAX(last_seen), MIN(first_seen), SUM(count)
FROM   del
GROUP  BY word;

Should be rather efficient, too.

SQL Fiddle.

About ctid:

About CTEs:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228