0

I want this postgres function to work :

CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results() 
  returns table(match_id BIGINT) 
as 
$$ 
BEGIN   
  return QUERY 
  SELECT * 
  FROM sports.match_history     
  WHERE match_id NOT IN (SELECT match_id 
                         FROM sports.match_results); 
END $$   
LANGUAGE 'plpgsql';

This stand alone query works just fine:

SELECT * 
FROM sports.match_history   
WHERE match_id NOT IN (SELECT match_id FROM sports.match_results);

But when I put it into this function and try to run it like this:

select * 
from difference_of_match_ids_in_match_history_and_match_results();

I get this:

SQL Error [42702]: ERROR: column reference "match_id" is ambiguous
Detail: It could refer to either a PL/pgSQL variable or a table column. Where: PL/pgSQL function difference_of_match_ids_in_match_history_and_match_results() line 3 at RETURN QUERY

I've seen other questions with this same error, and they suggest naming the sub queries to specify which instance of a column you're referring to, however, those examples use joins and my query works fine outside of the function.

If I do need to name the column, how would I do so with only one sub-query?

If that isn't the issue, then I'm assuming that there's something wrong with the way I'm defining a function.

Narwhal
  • 325
  • 3
  • 15

4 Answers4

1

You query is fine. The ambiguity is on the match_id in returns table(match_id BIGINT) rename it or prefix the columns with the table name in your query

CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results() 
  returns table(new_name BIGINT) 
as 
$$ 
BEGIN   
  return QUERY 
  SELECT * 
  FROM sports.match_history     
  WHERE match_id NOT IN (SELECT match_id 
                         FROM sports.match_results); 
END $$   
LANGUAGE 'plpgsql';

or

CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results() 
  returns table(match_id BIGINT) 
as 
$$ 
BEGIN   
  return QUERY 
  SELECT sports.match_history.match_id
  FROM sports.match_history     
  WHERE sports.match_history.match_id NOT IN (SELECT sports.match_results.match_id 
                         FROM sports.match_results); 
END $$   
LANGUAGE 'plpgsql';

Didn't test the code.

Akli REGUIG
  • 552
  • 4
  • 13
1

The structure of the result set must match the function result type. If you want to get only match_ids:

CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results() 
  RETURNS TABLE(m_id BIGINT)    -- !!
AS 
$$ 
BEGIN   
  RETURN QUERY 
  SELECT match_id               -- !!
  FROM sports.match_history     
  WHERE match_id NOT IN (SELECT match_id 
                         FROM sports.match_results); 
END $$   
LANGUAGE 'plpgsql';

If you want to get whole rows as a result:

DROP FUNCTION difference_of_match_ids_in_match_history_and_match_results();
CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results() 
  RETURNS SETOF sports.match_history    -- !!
AS
$$ 
BEGIN   
  RETURN QUERY 
  SELECT *                              -- !!
  FROM sports.match_history     
  WHERE match_id NOT IN (SELECT match_id 
                         FROM sports.match_results); 
END $$   
LANGUAGE 'plpgsql';
klin
  • 112,967
  • 15
  • 204
  • 232
1

As others have answerd, it's an ambiguity between the result definition and PL/pgSQL variables. The column name in a set returning function is in fact also a variable inside the function.

But you don't need PL/pgSQL for this in the first place. If you use a plain SQL function it will be more efficient and the problem will go away as well:

CREATE OR REPLACE FUNCTION difference_of_match_ids_in_match_history_and_match_results() 
  returns table(match_id BIGINT) 
as 
$$ 
  SELECT match_id --<< do not return * - only return one column
  FROM sports.match_history     
  WHERE match_id NOT IN (SELECT match_id 
                         FROM sports.match_results); 
$$   
LANGUAGE sql;

Note that the language name is an identifier and should not be quoted at all.

0

The naming conflict between column names and plpgsql OUT parameters has been addressed. More details here:

I would also use a different query style. NOT IN (SELECT ...) is typically slowest and carries traps with NULL values. Use NOT EXISTS instead:

SELECT match_id
FROM   sports.match_history h 
WHERE  NOT EXISTS (
   SELECT match_id 
   FROM   sports.match_results
   WHERE  match_id = h.match_id
   );

More:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228