2

I am trying to build a function that gives back a percentage of rows that have a particular condition.

Here's my attempt:

CREATE OR REPLACE FUNCTION osm_prozent ()
RETURNS integer AS $$
DECLARE 
      a integer;
      b integer;
      ergebnis integer;
BEGIN
  a = select into a count(*)
    from osm_street;
  b = select into b count(*)
    from osm_street
    where hausnummer like '%-%'
    or hausnummer like '% %' 
    or hausnummer like '%/%' 
    or hausnummer like '%;%'
    or hausnummer like '%,%'
    or hausnummer ~ '([a-z,A-Z])';  

  ergebnis = (b/a)*100;
  return ergebnis;
END;
$$ LANGUAGE plpgsql;

Variable a is 207000 and variable b is 11000. At the end I should have something like 5.31%.

Can someone help to write this function correctly?

Bohemian
  • 412,405
  • 93
  • 575
  • 722
Andrew Ramnikov
  • 783
  • 2
  • 9
  • 30
  • if you use (b/a) * 100 u will get 5.31. In your calculation b is higher than a how you will be get 6 % – Sathish Mar 31 '16 at 09:10

2 Answers2

3

You can make two clear improvements, in order of importance:

  1. Make 1 pass over the table instead of 2 passes
  2. Simplify where condition of your second query

Dealing with the 2nd item first:

The where condition can be changed to have just one comparison using just one regex test. This expression:

where hausnummer like '%-%'
or hausnummer like '% %' 
or hausnummer like '%/%'
or hausnummer like '%;%'
or hausnummer like '%,%'
or hausnummer ~ '([a-z,A-Z])'

can be expressed as simply:

where hausnummer ~ '[a-z,A-Z /;-]'

Next, to make just one pass over the table, use sum() on the condition to capture the number of hits at the same time as you capture count(*). Using the above improvement as the condition, your query can become:

select 100.0 * sum(case when hausnummer ~ '[a-z,A-Z /;-]' then 1 end) / count(*)
from osm_street

Unless you need a store procedure, you can now dispense with it entirely, as you have the result in one simple query.

If you desperately require a stored procedure (not recommended), it would be just a thin wrapper (adding no value) around the above query:

CREATE OR REPLACE FUNCTION osm_prozent () RETURNS double precision AS $$
BEGIN
  RETURN select 100.0 * sum(case when hausnummer ~ '[a-z,A-Z /;-]' then 1 end) / count(*) from osm_street;
END;
$$ LANGUAGE plpgsql;

Note that I suspect that your last condition hausnummer ~ '([a-z,A-Z])' is not what you intend. Firstly, the brackets are redundant and may be removed without changing the meaning, ie it's the same as hausnummer ~ '[a-z,A-Z]' and is true if hausnummber contains a letter or a comma.

If that is not what you intend, leave a comment to this answer explaining what you actually want.

Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • What the OP actually wants should never just go into a comment. It should be in the question. And I assume by "store procedure" you mean stored procedure? Postgres doesn't have stored procedures, just functions which are almost, but not quite the same. – Erwin Brandstetter Apr 01 '16 at 08:13
  • @erwin I respect your database knowledge very much, but I think you're splitting hairs saying the PG "doesn't have stored procedures"; the [doc](http://www.postgresql.org/docs/9.0/static/plpgsql-overview.html) says *PL/pgSQL ... can be used to create functions and trigger procedures*, and they are stored, so they are stored procedures. Also, "stored procedures" is the industry's generic term for such things. Further, there is no chance IMHO that calling these things stored procedures in the context of Postgres would be misleading or lead to confusion. – Bohemian Apr 01 '16 at 08:32
  • The important distinction is that Postgres functions (as opposed to real stored procedures) are always run inside a single transaction so cannot begin or commit transactions and cannot run commands like `VACUUM` or `CREATE INDEX CONCURRENTLY` (which cannot run within a transaction block). More: http://stackoverflow.com/a/28251922/939860 – Erwin Brandstetter Apr 01 '16 at 08:58
0

Your filter by regular expression mixed with several likes can be simplified to a single regular expression (also note that your regular expression adds unnecessary load by adding brackets, and the comma between a-z and A-Z is not a separator but matches a comma as well, which you also included TWICE! in your list of like comparisons). A much simpler function code would be:

CREATE OR REPLACE FUNCTION osm_prozent () RETURNS double precision AS $$
  SELECT 100. * (SELECT COUNT(*) FROM osm_street WHERE hausnummer ~* '[a-z /;,-]') / (SELECT COUNT(*) FROM osm_street);
$$ LANGUAGE sql;
Ezequiel Tolnay
  • 4,302
  • 1
  • 19
  • 28
  • Hi, i am getting this error:ERROR: operator does not exist: text *~ unknown LINE 2: ...(SELECT COUNT(*) FROM osm_street WHERE hausnummer *~ '[a-z /... ^ HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts. – Andrew Ramnikov Mar 31 '16 at 10:26
  • Sorry, typo (I had put `*~` instead of `~*`). I've corrected it now in the code. `~` uses a case-sensitive regular expression match, whilst `~*` uses a case-insensitive regular expression match :) – Ezequiel Tolnay Apr 01 '16 at 01:41