1

I would like to get a column_name and table_name for a list from all_tab_columns (which is not a problem till here) and also for each given column I want to go to the original table/column and see what is the top value with highest occurence.

With the query below I get the desired value for 1 example of column in 1 table:

select   col1
from    (SELECT col1, rank () over (order by count(*) desc) as rnk
         from  T1
         Group by col1
         ) 
 where   rnk = 1

now I want something like this:

select  table_name,
        column_name,
        xmlquery('/ROWSET/ROW/C/text()'
            passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name  || ' from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from ' 
            || table_name || ' Group by ' || column_name || ') where rnk = 1;'))
            returning content) as C
from all_tab_columns
where owner = 'S1'
and table_name in ('T1', 'T2', 'T3', 'T4')
;

but it does not work. This is the error I get:

ORA-19202: Error occurred in XML processing
ORA-00933: SQL command not properly ended
ORA-06512: at "SYS.DBMS_XMLGEN", line 176
ORA-06512: at line 1
19202. 00000 -  "Error occurred in XML processing%s"
*Cause:    An error occurred when processing the XML function
*Action:   Check the given error message and fix the appropriate problem

I make an example. These are my two tables, for instance; T1:

col.1      col.2 col.3
----- ---------- -----
y                m1   
y             22 m2   
n             45 m2   
y             10 m5   

and T2:

col.1 col.2   col.3
----- ------- -----
    1 germany xxx  
    2 england xxx  
    3 germany uzt  
    3 germany vvx  
    8 US      XXX  

so

  • from T1/Col.1 I should get 'y'
  • from T1/col.3 I should get 'm2'
  • from T2/col.3 I should get 'xxx'

and so on.

Alex Poole
  • 183,384
  • 11
  • 179
  • 318
FBR
  • 25
  • 6

1 Answers1

1

The important error in what has been reported to you is this one:

ORA-00933: SQL command not properly ended

Remove the semicolon from the query inside the dbms_xmlgen.getxml() call:

select  table_name,
        column_name,
        xmlquery('/ROWSET/ROW/C/text()'
            passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name  || ' from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from ' 
            || table_name || ' Group by ' || column_name || ') where rnk = 1'))
                                                                     -------^ no semicolon here
            returning content) as C
from all_tab_columns
...

Your XPath seems to be wrong too though; you're looking for /ROWSET/ROW/C, but C is the column alias for the entire expression, not the column being counted. You need to alias the column name within the query, and use that in the XPath:

select  table_name,
        column_name,
        xmlquery('/ROWSET/ROW/COL/text()'
                           -- ^^^
            passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name  || ' as col from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from ' 
                                                                            -- ^^^^^^
            || table_name || ' Group by ' || column_name || ') where rnk = 1'))
            returning content) as C
from all_tab_columns
... 

With your sample data that gets:

TABLE_NAME                     COLUMN_NAME                    C         
------------------------------ ------------------------------ ----------
T1                             col.1                          y         
T1                             col.2                          224510    
T1                             col.3                          m2        
T2                             col.1                          3         
T2                             col.2                          germany   
T2                             col.3                          xxx       

db<>fiddle

The XMLQuery is returning an XMLtype result, which your client is apparently showing as (XMLTYPE). You can probably change that behaviour - e.g. in Sql Developer from Tool->Preferences->Database->Advanced->DIsplay XMl Value in Grid. But you can also convert the reult to a string, using getStringVal() to return a varchar2 (or getClobVal() if you have CLOB values, which might cause you other issues):

select  table_name,
        column_name,
        xmlquery('/ROWSET/ROW/COL/text()'
            passing xmltype(dbms_xmlgen.getxml( 'select ' || column_name  || ' as col from (select ' || column_name ||', rank () over (order by count(*) desc) as rnk from ' 
            || table_name || ' Group by ' || column_name || ') where rnk = 1'))
            returning content).getStringVal() as C
                           -- ^^^^^^^^^^^^^^^
from all_tab_columns
... 

As you can see, this doesn't do quite what you might expect when there are ties due to equal counts - in your example, there are found different values for T1."col.2" (null, 10, 22, 45) which each appear once; and the XMLQuery is sticking them all together in one result. You need to decide what you want to happen in that case; if you only want to see one then you need to specify how to decide to break ties, within the analytic order by clause.

I actually want to see all results but I expected to see them in different rows

An alternative approach that allows that is to use XMLTable instead of XMLQuery:

select table_name, column_name, value
from (
  select atc.table_name, atc.column_name, x.value, x.value_count,
    rank() over (partition by atc.table_name, atc.column_name
      order by x.value_count desc) as rnk
  from all_tab_columns atc
  cross join xmltable(
    '/ROWSET/ROW'
    passing xmltype(dbms_xmlgen.getxml(
           'select "' || column_name  || '" as value, count(*) as value_count '
        || 'from ' || table_name || ' '
        || 'group by "' || column_name || '"'))
    columns value varchar2(4000) path 'VALUE',
            value_count number path 'VALUE_COUNT'
  ) x
  where atc.owner = user
  and atc.table_name in ('T1', 'T2', 'T3', 'T4')
)
where rnk = 1;

The inner query cross-joins all_tab_columns to an XMLTable which does a simpler dbms_xmlgen.get_xml() call to just get every value and its count, extracts the values and counts as relational data from the generated XML, and includes the ranking function as part of that subquery rather than within the XML generation. If you run the subquery on its own you'll see all possibel values and their counts, along with each values' ranking.

The outer query then just filters on the ranking, and shows you the relevant columns from the subquery for the first-ranked result.

db<>fiddle

Alex Poole
  • 183,384
  • 11
  • 179
  • 318
  • Now another issue appeared. The query works but in result all Top-values are NULL. Do you have any idea why that is? because top values are for sure not Null at least not in all cases! – FBR Dec 10 '18 at 15:20
  • What 'top-value' are you expecting to see? Maybe edit your question to show a sample table with data, and the output you are trying to get for that data. – Alex Poole Dec 10 '18 at 15:25
  • well. in many fields there is no NULL at all. for instance in some there is only "Y" or "N". in other fields I should see a number from 0 to 10000. By this, I mean I assume the NULL that appear as the result of the query above, seems to be not read from the tables! – FBR Dec 10 '18 at 15:27
  • Thanks. I understood my technical problem. However, now in column.C I get : (XMLTYPE) mainly and sometimes NULL – FBR Dec 10 '18 at 16:28
  • @FBR - added a note about that... the nulls are probably correct though. – Alex Poole Dec 10 '18 at 16:43
  • Thank you very much. Regarding the issue with the case when I have 2 values with identical counts, I actually want to see all results but I expected to see them in different rows, since this is what I get when I execute the single query for 1 field: select col.1 from (SELECT col.1, rank () over (order by count(*) desc) as rnk from T1 Group by col.1 ) where rnk = 1 – FBR Dec 11 '18 at 11:10
  • @FBR - I've added an alternative approach to do that. (I hadn't noticed that the result I showed was already concatenating some of the values, so I've taken out the separate example with the same number of y/n). – Alex Poole Dec 11 '18 at 11:57
  • Thank you..this is great! Do you know where in query should I add "trim" before the column_nam ein order to get rid of the potential empty spaces? the query fails once I add it! – FBR Dec 11 '18 at 14:56
  • well, this is sorted :) – FBR Dec 11 '18 at 16:43
  • It is not easy to work with this package it seems..with filtering in my where clause within the package I get this error: ORA-06502: PL/SQL: numeric or value error ORA-06512: at "SYS.XMLTYPE", line 272 ORA-06512: at line 1 06502. 00000 - "PL/SQL: numeric or value error%s" *Cause: *Action: – FBR Dec 13 '18 at 09:21
  • I can't guess what filtering you've added. Ask a new question showing your full current query and the error it gets; and include the DDL and sample data for the tables it's interrogating. – Alex Poole Dec 13 '18 at 09:24
  • and it seems it is length-related. for example if I say ....where EXAMPLE_COLUMN like '12%B' it works... but if I say: ....where EXAMPLE_COLUMN like '1234%B' then It it returns that error! – FBR Dec 13 '18 at 09:25