3

I work for a company that has a DW - ETL setup. I need to write a query that looks for over 2500+ values in an WHEN - IN clause and also over 1000+ values in a WHERE - IN clause. Basically it would look like the following:

SELECT
    ,user_id
    ,CASE WHEN user_id IN ('user_n', +2500 user_[n+1] ) THEN 1
    ELSE 0
    ,item_id
FROM user_table
    WHERE item_id IN ('item_n', +1000 item_[n+1] );

As you probably already know PL/SQL allows a maximum of 1000 values in an IN clause, so I tried adding OR - IN clauses (as suggested in other stackoverflow threads):

SELECT
    ,user_id
    ,CASE WHEN user_id IN ('user_n', +999 user_[n+1] )
     OR user_id IN ('user_n', +999 user_[n+1] )
     OR user_id IN ('user_n', +999 user_[n+1] ) THEN 1
     ELSE 0 END AS user_group
    ,item_id
FROM user_table
    WHERE item_id IN ('item_n', +999 item_[n+1] )
    OR item_id IN ('item_n', +999 item_[n+1] );

NOTE: i know the math is erroneous in the examples above, but you get the point

The problem is that queries have a maximum executing time of 120 minutes and the job is being automatically killed. So I googled what solutions I could find and it seems Temporary Tables could be the solution I'm looking for, but with all honesty none of the examples I found is clear enough on how to include the values I want in the table and also how to use this table in my original query. Not even the ORACLE documentation was of much help.

Another potential problem is that I have limited rights and I've seen other people mention that in their companies they don't have the rights to create temporary tables.

Some of the info I found in my research:

ORACLE documentation

StackOverflow thread

[StackOverflow thread 2]

Another solution I found was using tuples instead, as mentioned in THIS thread (which I haven't tried) because as another user mentions performance seems greatly affected.

Any guidance on how to use a Temporary Table or if anyone has another way of dealing with this limitation would be greatly appreciated.

Community
  • 1
  • 1
nachomasterCR
  • 75
  • 1
  • 10
  • 3
    Where are the values coming from for your `IN` clause? – Tom H Mar 14 '16 at 13:50
  • You should store all this values in a temp table (user_n,item_n) and use IN to there – sagi Mar 14 '16 at 13:51
  • @TomH the values are 'randomly' chosen by another department. i say randomly because only the know why those values are the ones they need, but I can assure you there is no pattern, a BETWEEN for example would not help – nachomasterCR Mar 14 '16 at 13:53
  • Possible duplicate of [How to load a large number of strings to match with oracle database?](http://stackoverflow.com/questions/34699223/how-to-load-a-large-number-of-strings-to-match-with-oracle-database) – MT0 Mar 14 '16 at 14:06
  • How do you receive them from the other department though? Are they the same every time? Are they slowly changing or completely different every time that the process is run? Is someone entering them into a web page or do you receive a file with them? – Tom H Mar 14 '16 at 14:24
  • @TomH yes, they are the same every time. i will run the query once a week but the values will always be the same (unless they change them in the future, which is likely, but not happening soon) and I receive them in an Excel file – nachomasterCR Mar 14 '16 at 14:33
  • @a_horse_with_no_name why do you always edit my title and question and remove the PL/SQL and leave the ORACLE only? (i'm asking just to correct the way i ask questions so there is no need for someone else to edit them) – nachomasterCR Mar 14 '16 at 14:37
  • PL/SQL is only used for stored procedures, functions and triggers. Your questions are all plain SQL no PL/SQL involved. –  Mar 14 '16 at 14:39
  • I won't recreate what others have done below in the answers (especially given my lack of Oracle skills), but if your data very rarely changes then you should just put it into a permanent table and have a process in place to update it (ETL with the spreadsheet perhaps). Then just use any of the queries below, but with this permanent table instead of a temporary table or collection. – Tom H Mar 14 '16 at 14:44
  • This sounds like an XY problem. There is a performance problem but how do we know it's related to the large IN clauses? There is nothing inherently slow about using many thousands of hard-coded values in a SQL statement. And in some cases it outperforms a separate join approach since it's easier to analyze a list of values than a join with another table. It may help to add the explain plan and explain the performance problem more; for example, it's slow for query A but fast for query B, they return this many rows, etc. – Jon Heller Mar 14 '16 at 15:34
  • Yes, @JonHeller, I ran an exaplain plan and I'm getting 2 Cartesian products, that might be the problem!! I'm currently trying to fix this and will update my question as soon as I have a definitive answer! – nachomasterCR Mar 14 '16 at 16:02

5 Answers5

4

Create a global temporary table so no undo logs are created

CREATE GLOBAL TEMPORARY TABLE <table_name> (
<column_name>  <column_data_type>,
<column_name>  <column_data_type>,
<column_name>  <column_data_type>)
ON COMMIT DELETE ROWS;

then depending on how the user list arrives import the data into a holding table and then run

select 'INSERT INTO global_temporary_table <column> values '
|| holding_table.column
||';' 
FROM holding_table.column; 

This gives you insert statements as output which you run to insert the data.

then

SELECT  <some_column>
FROM <some_table>
WHERE <some_value> IN
(SELECT <some_column> from <global_temporary_table>
kevinskio
  • 4,431
  • 1
  • 22
  • 36
  • 'but how do I insert all the 2500+ specific user_id's into the table? wouldnt I have to use again an WHERE - IN clause?' – nachomasterCR Mar 14 '16 at 14:00
  • From wherever you get your list of user_ids. The middle section of this answer is a little confusing. You'll need to insert from a values list or a loop based on that list. – Sigfried Mar 14 '16 at 14:10
  • thanks a lot, will having READ rights only be a limitation to create Temporary Tables? I've read in other threads that it is.. – nachomasterCR Mar 14 '16 at 14:43
2

Use a collection:

CREATE TYPE Ints_Table AS TABLE OF INT;
CREATE TYPE IDs_Table AS TABLE OF CHAR(5);

Something like this:

SELECT user_id,
       CASE WHEN user_id MEMBER OF Ints_Table( 1, 2, 3, /* ... */ 2500 )
            THEN 1
            ELSE 0
            END
      ,item_id
FROM  user_table
WHERE item_id MEMBER OF IDs_table( 'ABSC2', 'DITO9', 'KMKM9', /* ... */  'QD3R5' );

Or you can use PL/SQL to populate a collection:

VARIABLE cur REFCURSOR;

DECLARE
  t_users Ints_Table;
  t_items IDs_Table;

  f       UTL_FILE.FILE_TYPE;
  line    VARCHAR2(4000);
BEGIN
  t_users.EXTEND( 2500 );
  FOR i = 1 .. 2500 LOOP
    t_users( t_users.COUNT ) := i;
  END LOOP;

  // load data from a file
  f := UTL_FILE.FOPEN('DIRECTORY_HANDLE','datafile.txt','R');
  IF UTL_FILE.IS_OPEN(f) THEN
  LOOP
    UTL_FILE.GET_LINE(f,line);
    IF line IS NULL THEN EXIT; END IF;
    t_items.EXTEND;
    t_items( t_items.COUNT ) := line;
  END LOOP;

  OPEN :cur FOR
    SELECT user_id,
           CASE WHEN user_id MEMBER OF t_users
                THEN 1
                ELSE 0
                END
          ,item_id
    FROM  user_table
    WHERE item_id MEMBER OF t_items;
END;
/

PRINT cur;

Or if you are using another language to call the query then you could pass the collections as a bind value (as shown here).

Community
  • 1
  • 1
MT0
  • 143,790
  • 11
  • 59
  • 117
  • i just learned about collections but they might not be useful in my case and you probably suggested it because of the way i presented the data which creates confusion. the list im working with does not follow an (n+1) pattern, there actually is no patter, its a list of alphanumeric values of 5 digits which are randomly ordered. So the list could be ABSC2, DITO9, KMKM9, etc.. it this still a use case for collections? – nachomasterCR Mar 14 '16 at 14:40
  • Yes, see my update. You could load the values from a file (see the middle section) or pass it in as a bind variable if they are being declared in an external language (see the end link to a java example). – MT0 Mar 14 '16 at 14:58
1

In PL/SQL you could use a collection type. You could create your own like this:

create type string_table is table of varchar2(100);

Or use an existing type such as SYS.DBMS_DEBUG_VC2COLL which is a table of VARCHAR2(1000).

Now you can declare a collection of this type for each of your lists, populate it, and use it in the query - something like this:

declare
   strings1 SYS.DBMS_DEBUG_VC2COLL := SYS.DBMS_DEBUG_VC2COLL();
   strings2 SYS.DBMS_DEBUG_VC2COLL := SYS.DBMS_DEBUG_VC2COLL();

   procedure add_string1 (p_string varchar2) is
   begin
      strings1.extend();
      strings1(strings.count) := p_string;
   end;

   procedure add_string2 (p_string varchar2) is
   begin
      strings2.extend();
      strings2(strings2.count) := p_string;
   end;
begin
   add_string1('1');
   add_string1('2');
   add_string1('3');
   -- and so on...
   add_string1('2500');

   add_string2('1');
   add_string2('2');
   add_string2('3');
   -- and so on...
   add_string2('1400');

   for r in (
    select user_id
         , case when user_id in table(strings2) then 1 else 0 end as indicator
         , item_id
      from user_table
     where item_id in table(strings1)
    )
   loop
      dbms_output.put_Line(r.user_id||' '||r.indicator);
   end loop;
end;
/
Tony Andrews
  • 129,880
  • 21
  • 220
  • 259
1

You can use below example to understand Global temporary tables and the type of GTT.

CREATE GLOBAL TEMPORARY TABLE GTT_PRESERVE_ROWS (ID NUMBER) ON COMMIT PRESERVE ROWS;
INSERT INTO GTT_PRESERVE_ROWS VALUES (1);
COMMIT;
SELECT * FROM GTT_PRESERVE_ROWS;
DELETE FROM GTT_PRESERVE_ROWS;
COMMIT;
TRUNCATE TABLE GTT_PRESERVE_ROWS;
DROP TABLE GTT_PRESERVE_ROWS;--WONT WORK IF YOU DIDNOT TRUNCATE THE TABLE OR THE TABLE IS BEING USED IN SOME OTHER SESSION

CREATE GLOBAL TEMPORARY TABLE GTT_DELETE_ROWS (ID NUMBER) ON COMMIT DELETE ROWS;
INSERT INTO GTT_DELETE_ROWS VALUES (1);
SELECT * FROM GTT_DELETE_ROWS;
COMMIT;
SELECT * FROM GTT_DELETE_ROWS;
DROP TABLE GTT_DELETE_ROWS;

However as you mentioned you receive the input in an excel file so you can simply create a table and load data in that table. Once the data is loaded you can use the data in IN clause of your query.

select * from employee where empid in (select empid from temptable);
Sandeep
  • 774
  • 3
  • 8
0
 create temporary table userids (userid int);
 insert into userids(...)

then a join or in subquery

 select ...
 where user_id in (select userid from userids);
 drop temporary table userids;
Sigfried
  • 2,943
  • 3
  • 31
  • 43