I have a table like this:
CREATE TABLE cache (
id BIGSERIAL PRIMARY KEY,
source char(2) NOT NULL,
target char(2) NOT NULL,
q TEXT NOT NULL,
result TEXT,
profile TEXT NOT NULL DEFAULT '',
created TIMESTAMP NOT NULL DEFAULT now(),
api_engine text NOT NULL,
encoded TEXT NOT NULL
);
I want to walk over the values of the encoded field (maybe with a window, OVER ... ?) using something like:
SELECT array_agg(id), string_agg(encoded, '&q=') FROM cache
so that I get the list of corresponding ids and a string of the concatenated encoded fields: '&q=encoded1&q=encoded2&q=encoded3'
... with a total length not exceeding some limit (say, no more than 2000 characters).
The second condition: I want to move on to the next window whenever one of the fields source, target or profile changes.
Is that possible with a plain SQL SELECT inside a FOR loop?
I know how to do this with plpgsql/plpython/plperl, but I want to optimize this query:
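FOR rec IN
SELECT array_agg(id) AS ids, string_agg(encoded, '&q=') AS url FROM cache
WHERE result IS NULL
-- one row per (source, target, profile); the length limit is not applied yet
GROUP BY source, target, profile
ORDER BY source, target, profile
LOOP
-- here I call curl with rec.url
END LOOP;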
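To make the OVER idea mentioned above concrete, here is a minimal sketch of a running length per (source, target, profile) group. It assumes rows should be taken in id order and that every value costs length('&q=') extra characters; the column alias running_len is only illustrative. Note that a plain running sum never resets when a chunk fills up, so on its own it shows where the 2000-character limit is crossed but does not produce exact chunks.

```sql
-- Running length of the would-be URL inside each (source, target, profile) group.
-- Assumption: rows are consumed in id order within a group.
SELECT id,
       source,
       target,
       profile,
       sum(length('&q=') + length(encoded)) OVER (
           PARTITION BY source, target, profile
           ORDER BY id
       ) AS running_len
FROM cache
WHERE result IS NULL
ORDER BY source, target, profile, id;
```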
Example data:
INSERT INTO cache (id, source, target, q, result, profile, api_engine, encoded) VALUES
(1, 'ru', 'en', 'Длинная фраза по-русски' , NULL, '', 'google', '%D0%94%D0%BB%D0%B8%D0%BD%D0%BD%D0%B0%D1%8F+%D1%84%D1%80%D0%B0%D0%B7%D0%B0+%D0%BF%D0%BE-%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8')
, (2, 'ru', 'es', 'Ещё одна непонятная фраза по-русски', NULL, '', 'google', '%D0%95%D1%89%D1%91+%D0%BE%D0%B4%D0%BD%D0%B0+%D0%BD%D0%B5%D0%BF%D0%BE%D0%BD%D1%8F%D1%82%D0%BD%D0%B0%D1%8F+%D1%84%D1%80%D0%B0%D0%B7%D0%B0+%D0%BF%D0%BE-%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8')
-- etc...
and so on, 100500 rows like that. The source and target fields hold various language codes that repeat across rows, so maybe I need to GROUP BY source, target, profile.
I want to SELECT the first N rows whose encoded fields, concatenated with some delimiter, look like
&q=%D0%94%D0%BB%D0%B8%D0%BD%D0%BD%D0%B0%D1%8F+%D1%84%D1%80%D0%B0%D0%B7%D0%B0+%D0%BF%D0%BE-%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8&q=%D0%95%D1%89%D1%91+%D0%BE%D0%B4%D0%BD%D0%B0+%D0%BD%D0%B5%D0%BF%D0%BE%D0%BD%D1%8F%D1%82%D0%BD%D0%B0%D1%8F+%D1%84%D1%80%D0%B0%D0%B7%D0%B0+%D0%BF%D0%BE-%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8
so that the length of this concatenated string does not exceed the limit (2000 characters). That gives me the string and also the ids of all the rows included in the url (in the same order, of course).
Then I want to select the next N rows by the same criteria, and so on.
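For illustration, the chunking described above could be sketched in one statement with a recursive CTE. This is only a sketch under assumptions: rows are taken in id order within each (source, target, profile) group, the limit is hard-coded as 2000, each value costs length('&q=') extra characters, and the names todo, walk, batch and batch_len are made up for the example. A single value longer than the limit still ends up in a chunk of its own, and I have not measured whether this is actually faster than a plpgsql loop on a large table.

```sql
WITH RECURSIVE todo AS (
    -- Number the pending rows inside each (source, target, profile) group.
    SELECT id, source, target, profile, encoded,
           length('&q=') + length(encoded) AS len,
           row_number() OVER (PARTITION BY source, target, profile
                              ORDER BY id) AS rn
    FROM cache
    WHERE result IS NULL
), walk AS (
    -- The first row of every group opens batch 1.
    SELECT id, source, target, profile, encoded, len, rn,
           1 AS batch, len AS batch_len
    FROM todo
    WHERE rn = 1
  UNION ALL
    -- Each following row joins the current batch, or opens a new one
    -- when adding it would exceed the 2000-character limit.
    SELECT t.id, t.source, t.target, t.profile, t.encoded, t.len, t.rn,
           CASE WHEN w.batch_len + t.len > 2000
                THEN w.batch + 1 ELSE w.batch END,
           CASE WHEN w.batch_len + t.len > 2000
                THEN t.len ELSE w.batch_len + t.len END
    FROM walk w
    JOIN todo t
      ON t.source  = w.source
     AND t.target  = w.target
     AND t.profile = w.profile
     AND t.rn      = w.rn + 1
)
SELECT source, target, profile, batch,
       array_agg(id ORDER BY rn) AS ids,
       '&q=' || string_agg(encoded, '&q=' ORDER BY rn) AS url
FROM walk
GROUP BY source, target, profile, batch
ORDER BY source, target, profile, batch;
```

Each output row would then be one curl call: ids holds the row ids in the same order as the values in url.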