How to turn 2 queries with common columns (A, B) and (A, C) into just one (A, B, C)?

Question

I currently have 2 queries that return

PRODUCER                       FirstQueryColumn       
------------------------------ ---------------------- 
aaaaaaaaaaaa                   1                      
bbbbbbbbbbb                    1                      

PRODUCER                       SecondQueryColumn      
------------------------------ ---------------------- 
aaaaaaaaaaaa                   2                      
bbbbbbbbbbb                    1

What I'd like to know is how should I go about making it such that I can have in a single query the same data, that is, I want something that'll yield (Producer, FirstQueryColumn, SecondQueryColumn).

How can I do that?

Here are my current queries:

select Producers.name Prod, count(Animals.idanimal) AnimalsBought
from AnimalsBought, Animals, Producers
where (AnimalsBought.idanimal = Animals.idanimal) and (Animals.owner = Producers.nif) group by Producers.name;

select Producers.name Prod, count(Animals.idanimal) AnimalsExploration
from AnimalsExploration, Animals, Producers
where (AnimalsExploration.idanimal = Animals.idanimal) and (Animals.owner = Producers.nif) group by Producers.name;

As you can see, for this case a join will not do much:

select Producers.name Prod, count(AnimalsBought.idanimal) AnimalsBought, count(AnimalsExploration.idanimal) AnimalsExploration
from Producers, Animals, AnimalsBought, AnimalsExploration
where (AnimalsExploration.idanimal = Animals.idanimal) and (Animals.owner = Producers.nif) group by Producers.name;

or am I doing something wrong?

Tony Andrews · Answer 1 · 2010-12-03T10:25:13.967

5

First imagine the 2 queries were just tables. You would do this:

select a.producer, a.firstquerycolumn, b.secondquerycolumn
from table1 a
join table2 b on b.producer = a.producer

You can replace each table by a query (known as an in-line view):

select a.Prod, a.AnimalsBought, b.AnimalsExploration
from
( select Producers.name Prod, count(Animals.idanimal) AnimalsBought
  from AnimalsBought, Animals, Producers
  where (AnimalsBought.idanimal = Animals.idanimal) 
  and (Animals.owner = Producers.nif) 
  group by Producers.name
) a
join
( select Producers.name Prod, count(Animals.idanimal) AnimalsExploration
  from AnimalsExploration, Animals, Producers
  where (AnimalsExploration.idanimal = Animals.idanimal) 
  and (Animals.owner = Producers.nif)
  group by Producers.name
) b
on a.Prod = b.Prod;

You may need to change my "join" to "full outer join" if one query may return data for a producer where the other does not. I would also be inclined to restructure the query as follows, making a main query on Producers outer joined to the 2 subqueries (with Producers removed):

select Producers.name Prod, a.AnimalsBought, b.AnimalsExploration
from Producers
left outer join ( select Animals.owner, count(AnimalsBought.idanimal) AnimalsBought
                    from AnimalsBought, Animals
                   where AnimalsBought.idanimal = Animals.idanimal
                   group by Animals.owner
                ) a
           on a.owner = Producers.nif
left outer join ( select Animals.owner, count(Animals.idanimal) AnimalsExploration
                    from AnimalsExploration, Animals
                   where AnimalsExploration.idanimal = Animals.idanimal
                   group by Animals.owner
                ) b
           on b.owner = Producers.nif;

(It is this type of query that I tested the performance of below).

Rather than bloat this answer with information probably not of interest to the OP, my notes on the relative performance of scalar subqueries and inline views in Oracle (requested by PerformanceDBA) are now offline here: Notes on Performance

edited Dec 03 '10 at 10:25

answered Nov 24 '10 at 09:31

Tony Andrews

129,880
21
220
259

1

@Tony: wouldn't that create two materialised tables/in-line views, with the create, populate, destroy overhead ? Whereas, the query can be obtained without same. Oracle can perform ANSI subqueries ? – PerformanceDBA Nov 25 '10 at 17:15
There is no create/populate/destroy for in-line views in Oracle. The in-line view is treated exactly as a normal view, i.e. the query is performed and the results processed by the next step. – Tony Andrews Nov 26 '10 at 07:37
1

@Tony. Ok, I can accept that. But then Oracle (yet again) encourages coders to jump through hoops (your answer) to get what is a straight-forward in SQL (my answer). If performance is sought, then it is required, not merely encouraged. Good for ensuring customers can't migrate to the competition. – PerformanceDBA Nov 26 '10 at 08:12
If you say so. To my eyes, this is a more "set-based" approach, the scalar subquery approach in your answer is more "row-by-row". Not that I don't use scalar subqueries, they can be very useful. The original question was "I have these 2 queries, how can I combine them", which is why I answered as I did. – Tony Andrews Nov 26 '10 at 08:34
1

@Tony. Sure. I suppose the question can be answered two ways: literally, by your post; vs helping OP understand sets better (I would not call it "scalar") as per my Answer, which is much simpler, precisely because it is a more set-based approach; understanding his data and how to get at it. Yours is in fact two separate processing phases, with two sets in the first, being fed into the second; no change to OP's simple sets or understanding of Rdb. Mine is a single phase. – PerformanceDBA Nov 26 '10 at 08:50
Well of course, a SQL query in itself doesn't have "processing phases", and each DBMS may actually perform the query in very different ways. Yours may be a single phase on Sybase (I don't know) but it certainly isn't for Oracle - for 1000 Producers it will actually perform each scalar subquery 1000 times. What would Sybase do with yours? (The term "scalar subquery" is Oracle's, not mine by the way). – Tony Andrews Nov 26 '10 at 09:15
The term "scalar" subquery would appear to be an ANSI standard in fact: http://publib.boulder.ibm.com/infocenter/rbhelp/v6r3/index.jsp?topic=%2Fcom.ibm.redbrick.doc6.3%2Fsqlrg%2Fsqlrg44.htm – Tony Andrews Nov 26 '10 at 09:18
@Tony: **You** stated in comment 3 that *the query is performed and the results processed by the next step*, you can label that whatever you like. Scalar: well, even according to your link, it is not a scalar subquery, it is a table subquery. 1000 Producers, 1000 Subqueries: We fixed that in 2001. But as you know, our architecture is different, it is done as processes within the engine; yours is handled by the o/s, much less internal control. But are not your 2 feeder sets table subqueries as well (called in-line by Oracle), tables that are constituents of the `FROM` clause ? – PerformanceDBA Nov 26 '10 at 13:36
Yes, I did, but I was speaking conceptually (and sloppily no doubt). All I meant was that the query is treated *logically* as if it were a normal view; the optimizer *may* perform it alone and feed the results into the next step, or it *may* merge the inline view definition into the outer query. SQL is declarative as I imagine you are well aware, a SQL statement does not specify a particular access method (though it may influence it due to imperferctions in the optimizer). – Tony Andrews Nov 26 '10 at 14:38
Your subqueries **are** scalar because they return 1 value, not a row of values or a set of rows of values. They are definitely called scalar subqueries in Oracle, and I am sure they are called the same elsewhere even if you don't know it. – Tony Andrews Nov 26 '10 at 14:41
1000 Producers, 1000 subqueries: good, I don't deny that it could and should be fixed (and I must check up that what I said still holds true in Oracle 11.2; I haven't heard that it does not). – Tony Andrews Nov 26 '10 at 14:42
Yes my 2 feeder sets are subqueries, not sure what point you are making? – Tony Andrews Nov 26 '10 at 14:44
Having looked into scalar subqueries further I don't think Oracle has improved its optimisation of them in 11G; there is (and has been) something called scalar subquery caching that is helpful if a scalar subquery is invoked more than once with the same inputs (so not applicable here). I also see that Sybase has an optimisation technique called "scalar subquery flattening" - is that the 2001 fix you meant? Apparently it works by turning them into "equivalent join queries". Good for them. – Tony Andrews Nov 26 '10 at 16:10
@Tony. Your *two feeder subqueries* are table subqueries. Oracle performs differently, depending on where they are located. It should not, the Standard SQL (if anything) location should be the faster. Here, with your internals knowledge, you have located them elsewhere. That's all. You're right, mine are scalars (re precision rather concept as well). They are millions of scalars that make up the set that OP was visualising, expressed in a clear standard relational manner. – PerformanceDBA Nov 27 '10 at 01:48
@Tony. Scalar subquery caching: We do that inside parallelism. Flattening: No, that's similar to your answer, where scalar subqueries incorrectly identified as materialised views in the FROM clause can (and should) be changed to operate in the WHERE [or Join] clause; as per other Answers, not at all the scalars in my Answer. The 12.0 fix in 2001 was an overall improvement to scalar and table subqueries in general, when we improved parallelsim (producer vs consumer processes). The 15.0 (2007) and 15.5 (2009) releases further improve parallelism. – PerformanceDBA Nov 27 '10 at 01:51
@PerfDBA, I totally agree that I should not have to structure my SQL using Oracle internals knowledge, that is a limitation of Oracle's scalar subqueries, no argument there. However, I would quesiton your assertion that your answer is Standard SQL and mine is not. I believe ANSI SQL:92 allows inline views does it not? In fact it could be argued that you don't use them because your knowledge of Sybase internals informs you that they don't perform so well in Sybase... – Tony Andrews Nov 27 '10 at 09:21
@Tony. Come on, I did not say in-line was illegal. I said here, scalar was simpler, set-processing; even you described yours as feeding two sets into a third, far more code; not the single relational set-processing model. Sybase just makes sure the standard is not slower than the materialised view, and handles parallelism differently for each type. it is a confidence that users can have, without having to know internals. – PerformanceDBA Nov 27 '10 at 23:47
I didn't say you said mine was "illegal", I said that you said (and you say it again now) that your answer is "the Standard SQL" which surely implies that mine is not, or else the remark pointless. Your re-assertion that mine is not relational and not set-processing is false as I'm sure you well know. I have 2 sets, and I join them. Codd and Date would surely approve even if you do not. I do not deny that yours is simpler, I merely deny that mine is *wrong* – Tony Andrews Nov 28 '10 at 10:55
@Tony. Well, we are getting into semantics now. Let's say that Right and Wrong are two ends of a spectrum. I cannot imagine that code that is (a) twice as long (b) handles 3 sets instead of 1 and (c) introduces 2 `GROUP BYs`, can possibly be right. Or in the middle. It is allowed, legal, but it cannot be faster. The method according to the RM (standard) has got to be closer to the Right. You figure out where your code goes, but it is not near the Right. – PerformanceDBA Nov 28 '10 at 15:57
I don't know why "getting into semantics" is to be avoided: yes, I am trying to understand what you mean when you say your code is "Standard SQL" presumably implying mine is not. Now here you go again: "the method [i.e. your code] according to the RM (standard)". If my code is "not near the Right" then it is somehow "near the Wrong", but only because you say so, not due to any SQL or RM Standard. Also, quite how you can claim your SQL has only 1 set to my 3 when it has 2 subqueries in it defies understanding. Semantics again I suppose; well semantics matter. – Tony Andrews Nov 28 '10 at 18:39
@Tony. Updated my answer. The SQL Std identifies language, not logic. The RM identifies sets, not language. You took my reference to "standard" literally, so I have been modulating it as the conv progressed. I agree, there is no standard for logic or clear thinking. A logical person, aware of the RM, who thinks in sets, using SQL language in its simple form, without hindrance of Oracles slow/fast methods would construct a query with 1 set (incl 2 scalar subqueries), seeking shorter code. Yours is 1 set (a) OP original; 1 set (b) OP original; feeding into 1 set for outer result: total 3 sets. – PerformanceDBA Nov 28 '10 at 22:08
It would help if your "facts" were accurate. Since they are not there is no point in continuing to argue with you. Over and out. – Tony Andrews Nov 28 '10 at 23:07
So, no, yours is not logical, not set-oriented in the problem solving sense, and (aside from Oracle) it is inefficient, in the SQL or programming sense. You take OP literally, not solving his understanding of sets (RM). Not what I would provide as an example to solve the problem. I answered his *am I doing something wrong?*: yes, he is, literally binding 2 simple sets rather than thinking about the new set that he needs. Since we are arguing, I am limiting myself to stating facts. Whatever implications you take from my statements are your business. – PerformanceDBA Nov 28 '10 at 23:59
I checked my facts. One mistake. The 2 x `GROUP BYs` were in OPs two original sets, you did not add them. I cannot see anything else I have stated that is not verifiable. – PerformanceDBA Nov 29 '10 at 00:03
@Tony. Thanks for the test results. Shake hands. – PerformanceDBA Nov 30 '10 at 22:24
@Tony. Please post 2 x other table populations and system details. Ok, I ran tests using SUM() and posted the results. Something is bothering me, though, and I have to check it out: there may be more to it. – PerformanceDBA Dec 01 '10 at 02:14
@PerformanceDBA: post updated with DDL and some system details. – Tony Andrews Dec 03 '10 at 10:01
@Tony: Thanks. 1) You used 2 different tables and I used 1 table twice; our scalar data points were different. I repopulated my tables to match yours & re-ran the tests. Timings practically identical. 2) I notice the DDL we use is quite different 3) I think you are missing a critical Index, thus Oracle is table-scanning (while Sybase was index-scanning). I suggest you correct that & re-run your test; I trust you will obtain better figures. – PerformanceDBA Dec 04 '10 at 09:52
@PerformanceDBA: I don't have access to the email you sent me at the moment; I will return to this when I do. – Tony Andrews Dec 06 '10 at 12:23

score 4 · Answer 2 · answered Nov 24 '10 at 09:31

4

select

tab1.producer
tab1.cola
tab2.colb

from
     (query a) tab1
inner join
     (query b) tab2
on
      tab1.producer = tab2.producer

You may want to change the join to a full outer join if every producer doesn't exist in each query.

answered Nov 24 '10 at 09:31

JamieDainton

155
1
6

Andrey Balaguta · Accepted Answer · 2010-11-24T13:11:55.307

3

I suppose animals.idanimal is a primary key. If it is so, one could write the query using left outer join and count on a target column to cut off NULLs.

select producers.name prod,
       count(animalsbought.idanimal) animalsbought,
       count(animalsexploration.idanimal) animalsexploration
from producers
  join animals on  animals.owner = producers.nif
  left join animalsbought on animalsbought.idanimal = animals.idanimal
  left join animalsexploration on animalsexploration.idanimal = animals.idanimal
group by producers.name;

edited Nov 24 '10 at 13:11

answered Nov 24 '10 at 11:58

Andrey Balaguta

1,308
2
21
28

1

I was busy typing up just the same query. Don't think you need the distinct though since nulls are ignored by `count(column)`. The only time nulls are not ignored is by `count(*)` – Mike Meyers Nov 24 '10 at 12:06

PerformanceDBA · Answer 4 · 2010-12-06T05:43:23.330

Revised 28 Nov 2010

or am I doing something wrong?

Yes. You are aware of two simple sets that you know, and you are focussed on producing a third set from those two sets.

Taking a Relational approach (The Relational Model; Codd & Date) to the problem means thinking it terms of sets. Both your original sets are projections of data in the database. So rather than combining them, the results, as is, into a third set, you need to think only of a new set, a new projection directly from the data in the database.
Second, after you do that, taking a straight-forward approach to the SQL language to obtain the result, is easy. If the (1) is understood, the programming required is a simple, logic problem. Sure, the ISO/IEC/ANSI SQL Standard permits various code constructs, and there are many ways to obtain the same result; what I am talking about is program logic; simplicity; efficiency; rather than not.

This will execute faster on any ISO/IEC/ANSI SQL-compliant platform.
- It is well-known that Oracle is not compliant, or stretches the definitions of the Standards a great deal, in order to appear to be compliant, so I accept this code may not execute faster on Oracle. Specifically, Oracle people appear to know that scalar subqueries (which logic and simplicity demand here) do not perform well, and they use other constructs instead.
- Any such alternate construct is a departure from (1) and (2). (Eg. Tony Andrews' answer takes OP's question literally, uses the two known sets; joins them; feeds them into a 3rd set; then extracts 3 columns from that.)

Try this, it is the set-oriented method of manipulating Relational databases, producing the new single set directly from the data (1), without regard to the two known sets. It then uses the SQL language in a straight-forward, simple, logical manner (2) to obtain that requirement. Yes, that happens to be two scalar subqueries (returns a single value):

SELECT  Producers.name Prod, 
        (SELECT COUNT(Animals.idanimal)
            FROM Animals,
                 AnimalsBought
            WHERE p.nif = Animals.owner
            AND   Animals.idanimal = AnimalsBought.idanimal
            ) AnimalsBought,
        (SELECT COUNT(Animals.idanimal)
            FROM Animals,
                 AnimalsExploration
            WHERE p.nif = Animals.owner
            AND   Animals.idanimal = AnimalsExploration.idanimal
            ) AnimalsExploration
    FROM Producers p

Comparison 29 Nov 2010

This section is only relevant for people interested the testing Tony and I are involved in, and the conclusions thereof

It bothered me that Tony reported Oracle executes his query in 2 secs vs my query in 5 secs. I could not reconcile myself with the idea that code that is twice as long; that uses 3 times the number of sets, could run faster.

I ran a test on Sybase. I already have a benchmark structure available, which would allow very similar code to be tested (the test on my tables did not need to join two tables but I included that, so that it is as close to OP's question as possible). 2.0M CustomerTransactions (Animals) spread across 100 Customers (Producers). That is, both queries are producing the same result set (100 Customers (Producers) by 3 columns) from 2.0M data points or 100 x 20,000 scalars. Let's identify them accurately, so that we can understand and compare them better. First the DDL for the structures used:

Link to DDL

1_Set_Scalar_Subquery

An exact equivalent of my code above, rendered for the available benchmark structures:

From OP's description, the structure of the requirement (1) is a list of Customers (Producers).
That is easily translated (2) into a simple query: SELECT ... FROM Customer (Producer)
Next we need two more columns, specific Things (1) re Customer (Producer), which is data that can be derived (not readily available as physical columns) from the database
This is easily done (2) with two scalar subquery queries

Link to 1_Set_Scalar_Subquery Code & Timings

1083, 1073, 1103, 1073, 1080 ms
Sybase Stats show 3 scans:
Customer (Producer) scanned once
CustomerTransaction (Animal) scanned twice (once for each scalar subquery)

3_Set_Inline_Query

An exact equivalent of Tony's code, rendered for the available benchmark structures:

As I understand it, from Tony's comments, he has taken OP literally; placed the two known sets in the FROM clause; joined them; producing a 3rd set (the 3rd outerSELECT) which is exposed
- While I was typing the code in, I realised the 1_Set_Scalar_Subquery is far more efficient in an additional aspect, because it does not have the 2 x GROUP BYs that are present in the 2 original sets (ie. the original sets were inefficient to begin with; and can be improved)

Link to 3_Set_Inline_Query Code & Timings

1820, 1850, 1846, 1843, 1850 ms
That is 70% slower.
Sybase Stats show 9 scans:
Customer (Producer) scanned twice (outer query) CustomerTransaction (Animal) scanned twice (the 2 inline sets)
Worktable 2 scanned 3 times (the merged set)
Worktables 1 & 3 scanned once each (for the GROUP BYs)

Of course the 70% difference will be exaggerated on a busy server.

3_Set_Inline_Query_Improved

Since the original sets provided by OP was inefficient, ie. it could be easily improved, the 2 x GROUP BYs which require worktables could be removed, etc; to produce something that compares more reasonably, while remaining inline, versus the scalar subquery, I did. Of course, that means, instead of manipulating a large joined set and then GROUP BY to obtain aggregates; to use (1) and (2) manipulate just the required set, and fill the aggregates in using scalar subqueries.

The purpose of this was to test the veracity of my statement that Sybase performs both scalar subqueries and inline queries with the same efficiency. At least now we are comparing two good code segments each of a completely different structure, instead of one good segment code against one poor code segments.

Link to 3_Set_Inline_Query_Improved Code & Timings

1103, 1073, 1103, 1073, 1080 ms
That is practically identical to 1_Set_Scalar_Subquery
Sybase stats show 4 scans:
Customer (Producer) scanned once
CustomerTransaction (Animal) scanned twice (once for each scalar subquery; even though there are 2 inline sets)
Customer (Producer) scanned once more (for the outermost query)
Therefore it is proved (in this simple test), that Sybase performs the Inline Queries exactly the same as Scalar Subqueries. Given that in this case, it is 3 sets vs 1 set, I would say that the Optimiser has done a great job in Normalising the query (not flattening as Tony inquired, and I responded), in order to produce identical performance results.
The other thing this exercise has proved is, the importance of taking the (1), (2) approach. Assuming even reasonable Normalisation, data in Relational database is best approached with a Relational, set-oriented mindframe. That leads to simpler, more efficient code. Of course, smaller, simpler, normalised code performs better.

Except on Oracle, where due to its peculiarities and limitations, people have to avoid that, and use constructs that are known not to perform badly.

Comparison 01 Dec 2010

Tony posted the results of his tests. They speak for themselves, case closed.

However he introduced the issue issue that although in this particular case (this question) Oracle executes the 1_Set_Scalar_Subquery almost twice as fast as the 3_Set_Inline_Query, because the COUNT() was serviced from the index without accessing the table, that in general Oracle performs Inline queries much better than Scalar Subqueries, and tested SUM() in order to examine the effect of visiting the tables.

I have therefore run the same test in Sybase, with COUNT() changed to SUM(). As with Oracle, the COUNT() was a covered query, but the SUM() now demands visiting the tables.

(Conclusions later, when Tony has posting some missing details. One thing that is bothering me is, Sybase deals with 2.0M rows in under 2 seconds; Oracle even at its best, does not appear to be able to deal with the same work with 1000++ rows in 6 seconds (table populations pending) .)
Tony has posted some missing details, but also opens new issues which I have identified in my comments on his post. Therefore it would still be a bit premature to draw definitive conclusions
So far, in similar tests (not close enough for my liking), on similar systems (ok Sybase system has 2 additional s/w layers and is not a server):
- Sybase deals with COUNT 4.0M rows in 1.1 and 1.5 seconds;
  Oracle does similar work in 3.5 and 5.7 seconds.
- Sybase does the SUM in 2.1 and 3.0 secs;
  Oracle did not complete the subquery and did the inline in 79 secs
  (further info requests pending).
- In the best case (Subquery did not complete, so we can't count that), using poor code as demanded for Oracle, Oracle is 26 times slower than Sybase.
- And 40 times slower where the Sybase coder used Relational set theory.

1_Set_Scalar_Subquery_SUM

Same as 1_Set_Scalar_Subquery_SUM, with COUNT() changed to SUM().

Link to 1_Set_Scalar_Subquery_SUM Code & Timings

2173, 2153, 2163, 2153, 2173 ms
Sybase Stats show 3 scans:
Customer (Producer) scanned once
CustomerTransaction (Animal) scanned twice (once for each scalar subquery)

3_Set_Inline_Query_SUM

Same as 3_Set_Inline_Query, with COUNT() changed to SUM().

Link to 3_Set_Inline_Query_SUM Code & Timings

3033, 2993, 3033, 2993, 3013 ms
That is 38% slower.
Sybase Stats show 9 scans:
Customer (Producer) scanned twice (outer query) CustomerTransaction (Animal) scanned twice (the 2 inline sets)
Worktable 2 scanned 3 times (the merged set)
Worktables 1 & 3 scanned once each (for the GROUP BYs)

Of course the 38% difference will be exaggerated on a busy server.

3_Set_Inline_Query_Improved_SUM

Same as 3_Set_Inline_Query_Improved, with COUNT() changed to SUM().

Link to 3_Set_Inline_Query_Improved_SUM Code & Timings

2183, 2180, 2183, 2143, 2143 ms
That is practically identical to 1_Set_Scalar_Subquery
Sybase stats show 4 scans:
Customer (Producer) scanned once
CustomerTransaction (Animal) scanned twice (once for each scalar subquery; even though there are 2 inline sets)
Customer (Producer) scanned once more (for the outermost query)
One thing within the Sybase context. My assertion that the Sybase Optimiser Normalises queries, that therefore it handles Scalar Subqueries as well as it handles Inline Views, is further proved. Poor code (3_Set_Inline_Query unimproved) runs slower, but only proportionately slower, no dramatic difference.

On Sybase maybe, PerformanceDBA, but not on Oracle. I set up a reasonably sized test case and your query took 5 seconds (elapsed) whereas an equivalent query with in-line views and outer joins took about 2 seconds. It does a lot more work. — Tony Andrews, Nov 26 '10 at 08:29
@Tony. The -1 is unfair, because this is clean ANSI standard SQL; that Oracle can't process it efficiently, and therefore requires complicating the SQL requirement, is a separate matter. — PerformanceDBA, Nov 26 '10 at 08:36
I gave you the -1 because your answer is misleading (for an Oracle-tagged question) saying "it will perform better..." when it does not. If you retract that claim from the otherwise perfectly good answer I will remove my downvote. — Tony Andrews, Nov 26 '10 at 08:36
"This will execute faster on any ISO/IEC/ANSI SQL-compliant platform" is an unsubstantiated claim. It executes faster on Sybase is all you know, but it is merely your *belief* that it would execute faster on other such systems. — Tony Andrews, Nov 29 '10 at 09:04
Your constant (and often misinformed) carping about Oracle is amusing: you remined me of the people who kept complaining that Betamax was better than VHS; maybe it was, but it died... — Tony Andrews, Nov 29 '10 at 09:05
@Tony. *This will execute faster on any ISO/IEC/ANSI SQL-compliant platform* I thought you already knew I have extensive exp on MS and somewhat less on DB2, much more than belief. A couple of hundred instances correcting SQL (scalar/table subqueries, inline) on those platforms. That's how I know what is standard/not. *no relevance to OP* I beg to differ, I answered his question at a deeper level; the test was in reference to your sttmts. Oracle is orable: check the last post in [this thread](http://www.forumtopics.com/busobj/viewtopic.php?p=442226). Hundreds more where that came from. — PerformanceDBA, Nov 29 '10 at 12:06
Fair enough. But whether Oracle is orable or not, it is what the OP is using and so it is Oracle performance that he is concerned with primarily is it not? — Tony Andrews, Nov 29 '10 at 12:10
"Oracle even at its best, does not appear to be able to deal with the same work with 1000++ rows in 6 seconds". You will now see that in fact I was working with 4 million rows - 1000 rows in the master table, 2M each in the two detail tables. — Tony Andrews, Dec 03 '10 at 10:04
@Tony: Come on, It said "Table populations pending". Anyway, content fixed up. Awaiting you response re re-running the test, before finalising conclusions. — PerformanceDBA, Dec 06 '10 at 06:25
@PerfDBA: I know, I only quoted you to give my update some context. — Tony Andrews, Dec 06 '10 at 09:41
Please avoid nested SQL queries, they are a nightmare to query optimizer and usually comes with heavy performance panelty — John, Dec 13 '10 at 12:24
@John. What exactly is your experience, and what platform (other than Oracle) do you suggest they have a performance penalty ? I appreciate you may have problems with "nested" queries, but Sybase, DB2, and MS, optimise and execute them well. — PerformanceDBA, Dec 16 '10 at 09:51

score 1 · Answer 5 · answered Nov 29 '10 at 04:12

I thought this should be a separate Answer, since it has to do with your original queries, and not your question as stated. I am taking up your or am I doing something wrong? at a different point.

Devoured, as a result of my exercise with Tony, which required inspection of your two original sets, it became clear that they were inefficient, and could be improved, as per (1), (2) in my previous answer. Consideration of, and attendance to, Oracle hindrances that you need to work around, interferes with and inhibits a set-oriented approach to solving queries.

So we will set that aside for a moment and take the (1), (2) approach. Stated another way, if you start with bad apples, and you simply extend them (which is your specific question), the badness is quadrupled, as proved in the tests.
I am setting aside the projection of the large result set, and then GROUPingit, as you have. Because you want a list of Producers ++, I am using that as the basic structure of the query. Then I am filling in the aggregates with scalar subqueries. The GROUP BY and the worktables required to process that, are eliminated.
Another way of stating it is, I've Normalised your original queries.
I am assuming you need the join to AnimalsBought for an existence check; otherwise (since it is not referenced in the COUNT()), it can be removed.

I suspect it will be faster than your original queries, even though I have heard that Oracle does not process scalar subqueries well, because this avoids processing the large set; but I cannot test this on Oracle.

SELECT  Producers.name Prod, 
        ( SELECT count(Animals.idanimal)
              FROM  Animals,
                    AnimalsBought
              WHERE (p.nif = Animals.owner)
              AND   (Animals.idanimal = AnimalsBought.idanimal) 
              ) AnimalsBought
    FROM  Producers p;

SELECT  Producers.name Prod, 
        ( SELECT count(Animals.idanimal)
              FROM  Animals,
                    AnimalsExploration
              WHERE (p.nif = Animals.owner)
              AND   (Animals.idanimal = AnimalsExploration.idanimal) 
              ) AnimalsExploration
    FROM  Producers p;

In the event that that works for you, that would naturally lead to the Answer to your stated Question:

SELECT  Producers.name Prod, 
        ( SELECT count(Animals.idanimal)
              FROM  Animals,
                    AnimalsBought
              WHERE (p.nif = Animals.owner)
              AND   (Animals.idanimal = AnimalsBought.idanimal) 
              ) AnimalsBought,
        ( SELECT count(Animals.idanimal)
              FROM  Animals,
                    AnimalsExploration
              WHERE (p.nif = Animals.owner)
              AND   (Animals.idanimal = AnimalsExploration.idanimal) 
              ) AnimalsExploration
    FROM  Producers p;

If you are plagued by Nulls and Spinsters, then mcha's answer is best.

score 0 · Answer 6 · edited Dec 06 '11 at 02:26

0

Without going into details of queries, I think following may help.

Select A.Producer, A.FirstQueryColumn, B.SecondQueryColumn
From
(
   Select Producer, FirstQueryColumn, '' As SecondQueryColumn
   From TableA
) AS A
Inner Join
(
   Select Producer, '' as FirstQueryColumn, SecondQueryColumn
   From TableA
) AS B ON A.Producer = B.Producer

edited Dec 06 '11 at 02:26

Adam Wenger

17,100
6
52
63

answered Nov 24 '10 at 09:38

Kashif

1

score 0 · Answer 7 · answered Nov 24 '10 at 09:52

0

SELECT A.producer, COUNT(A.firstquerycolumn), COUNT(B.secondquerycolumn) FROM 
TAB1 A, TAB2 B
WHERE A.producer = B.producer 
GROUP BY A.firstquerycolumn, B.secondquerycolumn

That's what you're asking for?

answered Nov 24 '10 at 09:52

PPShein

13,309
42
142
227

score 0 · Answer 8 · answered Nov 24 '10 at 10:08

What i understood, you want a query which will show you :

PRODUCER            FirstQueryColumn       SecondQueryColumn      
------------------------------------------------------------
aaaaaaaaaaaa         1                      2
bbbbbbbbbbb          1                      1

you can try something like :

SELECT (SELECT Producers.name Prod
          FROM AnimalsBought, Animals, Producers
         WHERE (AnimalsBought.idanimal = Animals.idanimal)
           AND (Animals.owner = Producers.nif)
         GROUP BY Producers.name) as Producers,
       (SELECT COUNT(Animals.idanimal) AnimalsBought
          FROM AnimalsBought, Animals, Producers
         WHERE (AnimalsBought.idanimal = Animals.idanimal)
           AND (Animals.owner = Producers.nif)
         GROUP BY Producers.name) as firstQuery,
       (SELECT COUNT(Animals.idanimal) AnimalsExploration
          FROM AnimalsExploration, Animals, Producers
         WHERE (AnimalsExploration.idanimal = Animals.idanimal)
           AND (Animals.owner = Producers.nif)
         GROUP BY Producers.name) as secondQuery
  FROM DUAL

of course you can optimise it, its just an idea :)

score 0 · Answer 9 · answered Nov 24 '10 at 19:09

do you want to union the tables? (put tables on top of each other)
or do you want to join the tables? (put tables side by side)

Both union and join can produce the output format you wanted but are very different things.

Since their columns don't match you will need to plug in null place holders to union.

How to turn 2 queries with common columns (A, B) and (A, C) into just one (A, B, C)?

9 Answers9

Revised 28 Nov 2010

Comparison 29 Nov 2010

Comparison 01 Dec 2010

Linked