I use a HANA 2.0 database. I want to export a table or a sql query from the database to an external client as fast as possible and using a command line (i'm on windows)
I already tested differents cli methods with success but not as fast as a gui method (using HANA Studio Export)
The CSV generated is around 400MB, in all cases it is exactly the same data.
I successfully exported the result of a SQL query using hdbsql and -o option
$periodstr = '202012'
$TargetFile = "OTIF_${periodstr}.csv"
$HanaUserKey = "MAESTRO_VAL"
$CSVDelimiter = '¤' ## US : User Separator ascii 31
$UserQuotes = "Never"
$encoding = "UTF8"
Measure-Command{hdbsql -U $HanaUserKey -o .\${TargetFile} -I D:\Applications\python\Hana2Hyper\extractors\MAESTRO\sql\OTIF4HDBSQL.SQL -V period=$periodstr -F $CSVDelimiter -x -resultencoding $encoding}
With hdbsql it took 65s to export Network BP used ~40Mbs:
I successfully exported using ODBC (HDBODBC)
$TargetDir = "D:\Applications\python\Hana2Hyper\workdir"
$periodstr = '202012'
$TargetFile = "OTIF_${periodstr}.csv"
$HanaUserKey = "@MAESTRO_VAL"
$SQLQuery = [IO.File]::ReadAllText("D:\Applications\python\Hana2Hyper\extractors\MAESTRO\sql\OTIF4HDBSQL.SQL")
$SQLQuery = $SQLquery.replace('&period',$periodstr)
$CSVDelimiter = 31 ## US : User Separator ascii 31
$UserQuotes = "Never"
$encoding = "UTF8"
$conn = new-object System.Data.Odbc.OdbcConnection
$conn.connectionstring = "Driver={HDBODBC};SERVERNODE=${HanaUserKey};PACKETSIZE=2097152; PREFETCH=TRUE"
$conn.Open()
$cmd = New-object System.Data.Odbc.OdbcCommand($SQLQuery,$conn)
$dset = New-Object System.Data.DataSet
$adap = New-Object System.Data.Odbc.OdbcDataAdapter($cmd)
Measure-Command{$nrr = $adap.fill($dset)}
Write-Output "Dataset Records selected: ${nrr}"
#Powershell 7 for UseQuotes
Measure-Command{Export-Csv -InputObject $dset.tables[0] -Path ${TargetDir}\${TargetFile} -Delimiter $CSVDelimiter -noTypeInformation -UseQuotes $UserQuotes -encoding $encoding}
$conn.Close()
it took 57s to export data to a dataset and 1.2s to write the dataset to csv : 58s (faster that hdbsql), Network BP used : ~50Mbs
We also successfully export using python (sqlalchemy + hdbcli) and pandas dataframes in about 35s (Network BP ~90 Mbs). We tried to change some parameters for ODBC/hdbcli like PACKETSIZE=2MB or PREFETCH=TRUE, without improving the cli extraction
And finally I successfully exported the same data that i had previously writen to a table (the create as select took 6s) using HANA Studio
Export using SAP HANA Studio Step 1 Export using SAP HANA Studio Step 2
it took 16s to received the data : more than 3x better !! Network BP used 500Mbs !
Network BP at 500Mbs during SAP HANA Studio Export to client
To summary results we currently have to export the same data :
Type | Method | Elapsed (sec) |
---|---|---|
CLI | hdbsql | 65 |
CLI | HBDODBC+powershell dataset | 58 |
CLI | Python using pandas + sqlalchemy(hdbcli) | 35 |
GUI | SAP HANA Studio | 16 |
The final question is : does someone know how to export data to an external client as fast as GUI SAP HANA Studio but in command line ?