1

Based on questions like SQL to find the number of distinct values in a column and https://gis.stackexchange.com/questions/330932/get-line-length-using-sql-in-qgis

I see we can get a count and list of unique values using SQL but I can't see anything where we can do this without knowing the name of the field.

Is it possible in SQL for QGIS which only allows these commands? I found this option for another flavor -https://dataedo.com/kb/query/sql-server/list-table-columns-in-database

In Mapbasic I have used the following but would like to do this in SQL...

'Get Column Name list
dim x as integer
dim sColName as string
dim aColName as Alias
For x=1 to TableInfo(temptable, TAB_INFO_NCOLS)
    sColName = ColumnInfo(temptable, "col"+str$(x), COL_INFO_NAME)
    if (sColName not in ("GID","GID_New")) then
        aColName = sColName

        Select aColName, count(*) from temptable group by aColName into "g_"+sColName
        Browse * from "g_"+sColName
        Export "g_"+sColName Into WFolder+RSelection.col2+"_"+sColName+".csv" Type "ASCII" Delimiter "," CharSet "WindowsLatin1" Titles
    End If
Next

I guess in SQL we would use http://www.sqlservertutorial.net/sql-server-basics/sql-server-select-distinct/ but how can I tell it to just use every column in the table without knowing/specifying the name?

UPDATE

If I run

SELECT DISTINCT * FROM Drainage_Lines_Clip;

I get enter image description here

But I need something like the following without having to specify the column name. Ref enter image description here

It should look like this extract from running Unique on a google sheet of the data (except with counts) enter image description here

GeorgeC
  • 956
  • 5
  • 16
  • 40

2 Answers2

1

So this answer is based upon dynamic SQL. You'll get people saying 'don't use it it's dangerous', but they're the kind of people that think the best access to a system for users is none.. Anyway. Be aware of the security risks with SQL injection when using dynamic SQL. I'll leave that part up to you..

The below goes off to the sys.columns table and grabs all of the column names in the table, then a SQL statement is constructed to count all of the values in each column in your target table.

DECLARE @ReturnVar NVARCHAR(MAX);

SELECT @ReturnVar = COALESCE(@ReturnVar + ' UNION ALL ', '') + 'SELECT ''' + c.[name] + ''' [ColumnName], CAST(' + c.[name] + ' AS VARCHAR(MAX)) [ColumnValue], CAST(COUNT(1) AS VARCHAR(MAX)) [Count] FROM dbo.Admissions GROUP BY  CAST(' + c.[name] + ' AS VARCHAR(MAX))'
FROM sys.columns c
INNER JOIN sys.objects o ON o.object_id = c.object_id
INNER JOIN sys.schemas s ON s.schema_id = o.schema_id
WHERE o.[name] = 'Drainage_Lines_Clip'
AND s.[name] = 'dbo'
AND c.[name] != 'GID_New';

EXEC sp_executesql @ReturnVar;
Jim Jimson
  • 2,368
  • 3
  • 17
  • 40
  • I get an error -it's probably since this is a flat geospatial file and not a postgis database. So it's not dbo.Drainage_Lines.Clip.GID_New for example. The error is [Query execution error on CREATE TEMP VIEW _tview AS DECLARE @ReturnVar NVARCHAR(MAX); EXEC sp_executesql @ReturnVar;: 1 - near "DECLARE": syntax error] – GeorgeC Aug 04 '19 at 11:07
  • Ohh. I've just realised your not using a database, just an interpreter over a flat file. My answer won't apply then as it's only for SQL Server. What software are you using to run the SQL query? – Jim Jimson Aug 04 '19 at 11:11
  • QGIS.org - I've been trying to locate some info on the flavor of SQL and commands it allows but haven't been able to. – GeorgeC Aug 04 '19 at 11:14
  • Based on the following post I think it's https://sqlite.org/lang.html and https://www.gaia-gis.it/fossil/libspatialite/index -- So Select Spatialite_version() gives me Spatialite_version() = 4.3.0 -- Ref: https://medium.com/@GispoFinland/learn-spatial-sql-and-master-geopackage-with-qgis-3-16b1e17f0291 – GeorgeC Aug 04 '19 at 11:35
  • I've had a play around, but I can't figure it out. Here's the fiddle: http://sqlfiddle.com/#!5/320e4/21 I get the feeling sqllite isnt meant for what we're trying to do. You could fire a query using python though, that would work.. – Jim Jimson Aug 05 '19 at 04:32
  • 1
    Thanks - this is what I'm trying now - so --- query = """Select Material, count(*) from Sewerage_Manholes_InspectionShafts group by Material order by 2 Desc""" vlayer = QgsVectorLayer( "?query={}".format(query), "vlayer", "virtual" ) --- Does the needful and I'll just have to wrap it in a script that get's the table column names and parses it to the sql command and build up the table. – GeorgeC Aug 05 '19 at 04:38
1

I ended up having to use a combination of PyQGIS and SQL to get what's needed.

layer = qgis.utils.iface.activeLayer()
fields=[] # List of fields
Lquery=[] # List of queries to join together with Union All statement
Cquery=[] # Combined Query to use
for field in layer.fields():
    if field.name() not in ('GID_New'):
        fields.append(field.name())
        query = "Select '{0}' as 'Column', {0} as 'Value', count(*) as 'Unique' from {1} group by {0}".format(field.name(), layer.name())
        Lquery.append(query)
    else:
        print (field.name())
#    query = "Select {0}, count(*) from {1} group by {0} order by 2 Desc".format(field.name(), layer.name())

for L in Lquery:
    Cquery.append(L+' Union All ')
query=''.join(map(str, Fquery))
query=query[:-11]+' Order by Column'
vlayer = QgsVectorLayer( "?query={}".format(query), 'counts_'+layer.name(), "virtual" )
QgsProject.instance().addMapLayer(vlayer)
GeorgeC
  • 956
  • 5
  • 16
  • 40