2

I have a table in postgres containing parsed address information. When I run the following query, I get back correct results:

select count(*) from address_table
  where (mod(to_number(to_address_left, '99999999'), 2) = 0 
         and to_address_left <> ' ')

But when I add an OR to it...

select count(*) from address_table
  where (mod(to_number(to_address_left, '99999999'), 2) = 0 
         and to_address_left <> ' ')
     or (from_address_left <> ' ')

I get the following error...

ERROR:  invalid input syntax for type numeric: " "
SQL state: 22P02

I believe this is an error associated with the TO_NUMBER function. I know there are blank spaces (intentionally) in some of these records, which is why the "and to_address_left <> ' '" is included. But I don't understand why the addition of the OR statement that references a completely different field is causing the otherwise self-contained TO_NUMBER function to generate the error.

This code works fine in Oracle SQL. Why doesn't it work here?

geospatial
  • 39
  • 6

1 Answers1

0

SQL is a declarative language. The query planner is free to evaluate expressions in any order it likes as long as it produces the right result. So, if you add an OR clause, it may generate a query plan that evaluates to_number first, and to_address_left <> ' ' second. This would raise an exception for rows with a space in to_address_left.

The consistent way around this is making sure the to_number evaluation never raises an exception. It looks like to_number succeeds for any string that contains a number, even 'b 1 d', so you could test that in a case:

to_number(case when str_col ~ '[0-9]' then str_col end, '99999999')

This way, Postgres will evaluate to_number without raising an exception, even if the row is later thrown out by other parts of the where clause.

An alternative way that relies on Postgres materializing a common-table expression (which all Postgres versions so far do, but it's not in the SQL standard):

with    CTE as
        (
        select  *
        from    YourTable
        where   str_col not like '% %'
        )
select  *
from    CTE
-- Line below below guaranteed not to run for rows with a space in str_col
where   to_number(str_col) > 42
Andomar
  • 232,371
  • 49
  • 380
  • 404
  • First, I'm new to pgSQL. I'm currently in the process of taking existing Oracle SQL that works fine and migrating the data and all legacy code to a postgres environment. So I still don't understand this answer as an explanation for generic SQL when it works in one flavor but not another. – geospatial May 01 '18 at 18:28
  • Second, if I change the OR statement to something else trivial like: OR (1 = 1) ... it never fails. I was under the impression that evaluation within parentheses is separate from any other evaluations. Also, please note that the OR clause references a different field entirely... from_address_left is not to_address_left. Does that affect the response any? – geospatial May 01 '18 at 18:33
  • @geospatial: Every database works this way. See [this Oracle question](https://stackoverflow.com/questions/2785033/getting-weird-issue-with-to-number-function-in-oracle) and note the `/*+NO_MERGE*/` to force the query planner to be consistent. According to the SQL spec, the query planner is free to evaluate the `2=2` part of `1=1 OR (1=0 AND 2=2)`, and it sometimes does so – Andomar May 01 '18 at 18:59
  • After reading that other question, I think it makes a little more sense, though intuitively it's a bit wonky. Am I correct in assuming that the reason it may work in Oracle and not in postgres is a pretty arbitrary decision by the optimizer in each of those environments in the order of evaluating all the WHERE statements? Could it conceivably fail in Oracle and work in postgres in slightly different circumstances? And thank you for your assistance so far. The code you provided seems to be working. – geospatial May 01 '18 at 20:37
  • Yeah, it could fail in Oracle and work in Postgres. It can even depend on minor database version, the speed of the disks, or the amount of RAM in the server. Anything can change the query planner's behavior – Andomar May 01 '18 at 20:43