I created my own DB and added my csv data into it successfully via Python. Queries therefore work, but I use python to do the queries. Package sqlite3. In short, the table is: ID of user, Type of action performed, ID of object that action was performed on.
df1 and df2 are the DataFrames.
import pandas as pd
import sqlite3
connection = sqlite3.connect("test.db")
c = connection.cursor()
df1.to_sql('tab1', con = connection, if_exists = 'replace', chunksize = 1000)
df2.to_sql('tab2', con = connection, if_exists = 'replace', chunksize = 1000)
In this SQL command, I need to select the columns, join one table and select only data from certain time. I hope it's okay I include it there for completeness but won't include all columns in example table as they're only for the filtering. Note that the subquery without the pivot works.
sql_command = """
SELECT * FROM
(SELECT tab1.action,
tab1.iduser,
tab2.idobject
FROM tab1,
tab2
LEFT JOIN tab2 ON tab1.objectid = tab2.objectid
WHERE tab2.Datetime BETWEEN '2020-01-29' AND '2020-02-04'
) AS source
PIVOT(
COUNT(tab1.iduser)
FOR tab1.action IN (
[a],
[b],
[c])
) AS pivot_table;"""
# execute query
c.execute(sql_command)
This is the table I get from the subquery basically:
iduser action idobject
--------------------------------------
1 a Obj1
1 b Obj2
1 c Obj3
1 a Obj2
2 a Obj1
2 c Obj2
2 a Obj2
2 b Obj1
3 a Obj1
3 c Obj3
Now I want to create such table, i.e. count the number of types of actions performed on each object.
action idobject Count
----------------------------------------------------------------------
a Obj1 3
Obj2 2
Obj3 0
b Obj1 1
Obj2 1
Obj3 0
c Obj1 0
Obj2 1
Obj3 2
If i run the code above with the PIVOT table, I get the error "Error: Incorrect syntax near "PIVOT"".