2

With PostgreSQL call,

CREATE TABLE condor_xrootd AS
       SELECT * FROM condor INNER JOIN xrootd_ext
       ON (
xrootd_ext.CRAB_Id = condor.CRAB_Id AND
REPLACE(condor.CRAB_ReqName, '_', ':') = xrootd_ext.CRAB_ReqName
);

I get the error,

$ psql condor -f ./sql/inner_join.sql 
psql:./sql/inner_join.sql:6: ERROR:  column "crab_id" specified more than once

Which is understandable because each table has a Crab_Id column. I would like to be able to do my inner join without having to specify the columns because I have around 400 columns in the two tables combined.

Please let me know if I can somehow get rid of this error without listing columns individually.

EDIT:

I forgot to mention that speed and stability is crucial here as my join might take several days.

  • 1
    For the join, you can try the USING clause instead of ON when joining on column(s) with the same name, e.g. condor INNER JOIN xrootd_ext USING(CRAB_Id). When using this, the resulting join should only have a single instance of the column(s) being joined on. May or may not work in PostreSQL, but you can give it a try. – DBug Nov 12 '15 at 16:52
  • Take a look at this question: http://stackoverflow.com/questions/15061733/excluding-duplicate-fields-in-a-join I believe it answers your problem. – dmvrtx Nov 12 '15 at 16:54

2 Answers2

2
create table condor_xrootd as
select *
from
    condor
    inner join
    xrootd_ext using (crab_id)
where replace(condor.crab_reqname, '_', ':') = xrootd_ext.crab_reqname
Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260
2

You create table from a result set which is build of two tables. Your error says that column crab_id exists in both condor and xrootd_ext tables and Postgresql doesn't know which should be selected.

You should specify fields of which you want to create table. Like this:

CREATE TABLE condor_xrootd AS
       SELECT
          condor.*, -- here we take all fields from condor table
          xrootd_ext.crab_reqName -- here we take crab_reqName field from xrootd_ext
       FROM condor INNER JOIN xrootd_ext
       ON (
xrootd_ext.CRAB_Id = condor.CRAB_Id AND
REPLACE(condor.CRAB_ReqName, '_', ':') = xrootd_ext.CRAB_ReqName
);
stas.yaranov
  • 1,797
  • 10
  • 17
  • Clodoaldo Neto's answer seems to work and is simpler because I want all the XRootD columns. I am concerned by the WHERE clause. Do you know which is faster and/or can you prove it? –  Nov 12 '15 at 17:01
  • There shouldn't be any performance difference. I think EXPLAIN will prove that. So use what is better for you. – stas.yaranov Nov 12 '15 at 17:08