Hi I have a postgresql "TABLE1" with 2.7 million records. Each record has a field "FIELD1" that may be empty or may have data. I want a SELECT statement or method that a) returns the first 1000 results from TABLE1 with FIELD1 empty, and b) randomly pick one of the records to return to a python variable. Help???
Asked
Active
Viewed 798 times
0
-
2This question might be related: http://stackoverflow.com/questions/8674718/best-way-to-select-random-rows-postgresql – svk Jul 02 '13 at 14:40
-
You need to define: "empty", "first", your requirements (speed? repeated use?). Please supply a table definition. – Erwin Brandstetter Jul 02 '13 at 16:32
2 Answers
2
For selecting first 1000 result you can use limit in your query
SELECT field1 FROM table1 ORDER BY id Limit 1000;
The result will be a list in python. So you can use python random module to operate on the result list.

Ansuman Bebarta
- 7,011
- 6
- 27
- 44
-
1How does postgresql choose the 1000? Is it in order of primary key number? If yes, how can I get 1000 random records? – user2104778 Jul 02 '13 at 14:22
-
You can add ORDER BY clause to query also. I have edited my answer. Default order by has to be primary id but not always guaranted to be ordered. – Ansuman Bebarta Jul 02 '13 at 15:01
1
If performance is not a concern:
SELECT *
FROM (
SELECT *
FROM tbl
WHERE field1 IS NULL
ORDER BY id --?? unclear from question
LIMIT 1000
) sub
ORDER BY random()
LIMIT 1;
This returns 1 perfectly random row from the "first" 1000 empty rows.
"Empty" meaning NULL
, and "first" meaning smallest id
.
If performance is a concern, you need to be a lot more specific.
If your circumstances match, this related answer might of help:
Best way to select random rows PostgreSQL

Community
- 1
- 1

Erwin Brandstetter
- 605,456
- 145
- 1,078
- 1,228