24

I have installed python 2.7.0 and Teradata module on Windows 7. I am not able to connect and query TD from python.

pip install Teradata

Now I want to import teradata module in my source code and perform operations like -

  1. Firing queries to teradata and get result set.
  2. Check if connection is made to teradata.

Please help me writing code for the same as I am new to Python and there is no information available with me to connect to teradata.

anurag
  • 590
  • 3
  • 8
  • 27
  • 2
    There is information here https://developer.teradata.com/tools/reference/teradata-python-module - You will also need the Teradata ODBC drivers http://downloads.teradata.com/download/connectivity/odbc-driver/windows – KeepCalmAndCarryOn Mar 15 '16 at 02:58

3 Answers3

46

There are a number of ways to connect to Teradata and export table to Pandas. Here are four+:

Using teradata module

# You can install teradata via PIP: pip install teradata
# to get a list of your odbc drivers names, you could do: teradata.tdodbc.drivers
# You don’t need to install teradata odbc driver if using method='rest'.     
# See sending data from df to teradata for connection example 

import teradata
import pandas as pd

host,username,password = 'HOST','UID', 'PWD'
#Make a connection
udaExec = teradata.UdaExec (appName="test", version="1.0", logConsole=False)


with udaExec.connect(method="odbc",system=host, username=username,
                            password=password, driver="DRIVERNAME") as connect:

    query = "SELECT * FROM DATABASEX.TABLENAMEX;"

    #Reading query to df
    df = pd.read_sql(query,connect)
    # do something with df,e.g.
    print(df.head()) #to see the first 5 rows

Using TeradataSQL

from @ymzkala : This package doesn't require you to install Teradata drivers (other than this package).

# Installing python -m pip install teradatasql

import teradatasql

with teradatasql.connect(host='host', user='username', password='password') as connect:
    df = pd.read_sql(query, connect)

Using pyodbc module

import pyodbc

 #You can install teradata via PIP: pip install pyodbc
 #to get a list of your odbc drivers names, you could do: pyodbc.drivers()

#Make a connection
link = 'DRIVER={DRIVERNAME};DBCNAME={hostname};UID={uid};PWD={pwd}'.format(
                      DRIVERNAME=DRIVERNAME,hostname=hostname,  
                      uid=username, pwd=password)
with pyodbc.connect(link,autocommit=True) as connect:

    #Reading query to df
    df = pd.read_sql(query,connect)

Using sqlalchemy Module

 #You can install sqlalchemy via PIP: pip install sqlalchemy-teradata
 #Note: It is not pip install sqlalchemy. If you already have sqlalchemy, you still need sqlalchemy-teradata to get teradata dialects

from sqlalchemy import create_engine

#Make a connection

link = 'teradata://{username}:{password}@{hostname}/?driver={DRIVERNAME}'.format(
               username=username,hostname=hostname,DRIVERNAME=DRIVERNAME)

with create_engine(link) as connect:

    #Reading query to df
    df = pd.read_sql(query,connect)

There is a fifth way, using giraffez module. I enjoy using this module as it come with MLOAD, FASTLOAD, BULKEXPORT etc. The only issue for beginners is its requirements (e.g C/C++ compiler ,Teradata CLIv2 and TPT API headers/lib files).

Note: Updated 13-07-2018, using of context manager to ensure closing of sessions

Update: 31-10-2018: Using teradata to send data from df to teradata

We can send data from df to Teradata. Avoiding 'odbc' 1 MB limit and odbc driver dependency, we can use 'rest' method. We need host ip_address, instead of driver argument. NB: The order of columns in df should match the order of columns in Teradata table.

import teradata
import pandas as pd

# HOST_IP can be found by executing *>>nslookup viewpoint* or *ping  viewpoint* 
udaExec = teradata.UdaExec (appName="test", version="1.0", logConsole=False) 
with udaExec.connect(method="rest",system="DBName", username="UserName",
                      password="Password", host="HOST_IP_ADDRESS") as connect:

    data = [tuple(x) for x in df.to_records(index=False)]

    connect.executemany("INSERT INTO DATABASE.TABLEWITH5COL values(?,?,?,?,?)",data,batch=True)

Using 'odbc', you have to chunk your data to less than 1MB chunks to avoid "[HY001][Teradata][ODBC Teradata Driver] Memory allocation error" error: E.g.

import teradata
import pandas as pd
import numpy as np

udaExec = teradata.UdaExec (appName="test", version="1.0", logConsole=False)

with udaExec.connect(method="odbc",system="DBName", username="UserName",
                      password="Password", driver="DriverName") as connect:

    #We can divide our huge_df to small chuncks. E.g. 100 churchs
    chunks_df = np.array_split(huge_df, 100)

    #Import chuncks to Teradata
    for i,_ in enumerate(chunks_df):

        data = [tuple(x) for x in chuncks_df[i].to_records(index=False)]
        connect.executemany("INSERT INTO DATABASE.TABLEWITH5COL values(?,?,?,?,?)",data,batch=True)
Prayson W. Daniel
  • 14,191
  • 4
  • 51
  • 57
  • teradata.api.DatabaseError: (0, '[HY000] [Teradata][ODBC Teradata Driver] Loading the Teradata ICU Library Failed. Error is: 126, The driver returned invalid (or failed to return) SQL_DRIVER_ODBC_VER: 03.80') – Sujit Dhamale Apr 03 '18 at 16:46
  • When you print(teradata.tdodbc.drivers), can you see the correct driver name? My initial thought is that you might have install/update your teradata odbc driver and need to explicitly assign driver a correct name as default ‘teradata’ does not match. – Prayson W. Daniel Apr 05 '18 at 03:58
  • Note: The best way is to use ‘with’ syntax .This will ensure closing of sessions. – Prayson W. Daniel May 28 '18 at 17:06
  • I tried print(teradata.tdodbc.drivers), I got None. But I am still connecting using Pyodbc. I have issues with Sqlalchemy as NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:teradata. I'm trying to write a df to Teradata. – Ramsey Oct 30 '18 at 19:50
  • I enjoy using teradata packages as it is maintained by teradata. I get df to teradata question a lot so I will edit my answer above – Prayson W. Daniel Oct 31 '18 at 04:56
  • @PraysonDaniel, Thanks for your reply. I created a sample df with 5 rows and 35 columns. I got the below error, but still it loads the 5 rows to Teradata table. Any ideas? ERROR:teradata.udaexec:Query Failed! Duration: 0.200 seconds, Query: INSERT INTO db.table values(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) Traceback (most recent call last): DatabaseError: (0, u'[HY024] [Microsoft][ODBC Driver Manager] Invalid argument value') – Ramsey Oct 31 '18 at 15:45
  • What do you mean by rest method? – Ramsey Oct 31 '18 at 21:46
  • You are using a wrong driver. You need Teradata driver and not Microsoft. You can either down teradata odbc driver and use that or change method from 'odbc' to 'rest' and use Teradata RestAPI – Prayson W. Daniel Feb 20 '19 at 12:13
  • using Using teradata module its showing "(434, '[HY000] [Teradata][WSock32 DLL] (434) WSA E TimedOut: No response received when attempting to connect to the Teradata server')" . any idea hoe to resolve this?? – c0der Apr 25 '19 at 06:28
  • I had to add the following ``` import teradata udaExec = teradata.UdaExec (appName="test", version="1.0", logConsole=False) with udaExec.connect(method="odbc",system="servername", username="****",password="****",driver="Teradata Database ODBC Driver 16.20",authentication="LDAP") as connect: data =pd.read_sql('select top 10 * from DBName.Table,connect) ``` – user2162611 Sep 23 '21 at 14:33
11

To add on to Prayson's answer, you can use the teradatasql package (found on pypi). This package doesn't require you to install Teradata drivers (other than this package). Use it like so:

import teradatasql
import pandas as pd

with teradatasql.connect(host='host', user='username', password='password') as connect:
    data = pd.read_sql('select top 5 * from table_name;', connect)

ymzkala
  • 323
  • 3
  • 7
4

Download the Teradata Python module and python pyodbc.pyd from internet. Install using cmd install setup.py.

Here is the sample script for connecting to teradata and extracting data:

import teradata
import pyodbc
import sys



udaExec = teradata.UdaExec (appName="HelloWorld", version="1.0",
        logConsole=False)

session = udaExec.connect(method="odbc", dsn="prod32",
        username="PRODRUN", password="PRODRUN");

i = 0
REJECTED = 'R';

f = file("output.txt","w");sys.stdout=f

cursor =  session.cursor();

ff_remaining = 0;

cnt = cursor.execute("SELECT  SEQ_NO,FRQFBKDC,PNR_RELOC FROM ttemp.ffremaining ORDER BY 1,2,3 ").rowcount;
rows = cursor.execute("SELECT  SEQ_NO,FRQFBKDC,PNR_RELOC FROM ttemp.ffremaining ORDER BY 1,2,3 ").fetchall();


for i in range(cnt):
    ff_remaining = cursor.execute("select count(*) as coun from  ttemp.ffretroq_paxoff where seq_no=? and status <> ?",(rows[i].seq_no,REJECTED)).fetchall();
    print ff_remaining[0].coun, rows[i].seq_no, REJECTED;
illright
  • 3,991
  • 2
  • 29
  • 54
  • session = udaExec.connect(method="odbc", dsn="prod32",driver=DriverName,username="PRODRUN", password="PRODRUN"); We also need to pass driver as a parameter while creating a session. – Ishan mahajan Dec 01 '17 at 05:58
  • getting below error teradata.api.DatabaseError: (0, '[HY000] [Teradata][ODBC Teradata Driver] Loading the Teradata ICU Library Failed. Error is: 126, The driver returned invalid (or failed to return) SQL_DRIVER_ODBC_VER: 03.80') – Sujit Dhamale Apr 03 '18 at 16:46
  • 1
    Is there a reason to use both teradata and pyodbc. ;)? – Prayson W. Daniel May 02 '18 at 11:50