2

I have a huge csv file which contains millions of records and I want to load it into Netezza DB using python script I have tried simple insert query but it is very very slow. Can point me some example python script or some idea how can I do the same?

Thank you

Dhams
  • 251
  • 7
  • 15
  • Wait... you're trying to execute individual insert statments against Netezza?? Just use the nzload command line and shell out to it from your python script... – N West Mar 25 '13 at 20:31
  • Do you have any idea how can I do that? even what do I need to run nzload from command line or shell because I have only install aginity workbench to connect to the netezza database do I required anything else to run nzload? – Dhams Mar 26 '13 at 16:21

3 Answers3

3

If you have the Netezza Admin Tools installed, you can make a shell call to nzload.

Something like this:

os.system("nzload -host myhost -u myuser -pw mypassword -db MYDB -t MYTABLE -skipRows 1 -nullValue NULL -quotedValue DOUBLE -df \"C:\\myinputfile.txt\" -delim \"|\" -dateDelim \"-\" -lf c:\\logs\\nzload.log -bf c:\\logs\\nzloadbad.log")

See also: Calling an external command in Python

Community
  • 1
  • 1
Edward J. Stembler
  • 1,932
  • 4
  • 30
  • 53
1

you can use nz_load4 to load the data,This is the support utility /nz/support/contrib/bin

the syntax is same like nzload,by default nz_load4 will load the data using 4 thread and you can go upto 32 thread by using -tread option for more details use nz_load4 -h

This will create the log files based on the number of thread,like if

0

You need to get the nzcli installed on the machine that you want to run nzload from - your sysadmin should be able to put it on your unix/linux application server. There's a detailed process to setting it all up, caching the passwords, etc - the sysadmin should be able to do that to.

Once it is set up, you can create NZ control files to point to your data files and execute a load. The Netezza Data Loading guide has detailed instructions on how to do all of this (it can be obtained through IBM).

You can do it through aginity as well if you have the CREATE EXTERNAL TABLE privledge - you can do a INSERT INTO FROM EXTERNAL ... REMOTESOURCE ODBC to load the file from an ODBC connection.

N West
  • 6,768
  • 25
  • 40
  • I have used INSERT INTO FROM EXTERNAL ... REMOTESOURCE ODBC to load the data its worked for 1000 records but when I am trying to load 7,00,000 records using the same command from aginity it is giving me ERROR [08S01] Communication link failure. Any idea what could be the issue? – Dhams Apr 01 '13 at 22:24
  • I've had it happen with Aginity from time to time. I think it might be something with the max query time settings for the user id in Netezza, but I don't really know. Maybe try it with a little C# or powershell command line script to execute the SQL statement against the ODBC connection and see if it still gives that error? – N West Apr 02 '13 at 12:01
  • I tried the same thing with python script using pyodbc library but getting same error. I don't know how can I try with the powershell command line and how can I install the same. – Dhams Apr 02 '13 at 14:25
  • Well if it failed in pyodbc, it's probably a problem with the connection between the NZ DB and your server. Maybe you could try splitting the file and seeing how much you can load before it fails? Might be time for a call to your DBA or IBM. – N West Apr 02 '13 at 20:40
  • I normally use nzload CLI for this. Create an empty table, and then insert using nzload. As your question has been answered, i suggest you accept it as this is how the site works. – marbel Mar 16 '14 at 16:15