I am using jpype
and jaydebeapi
to connect to an Oracle database which works fine. However, the SQL returns tuples instead of strings with brackets and a comma after each character.
E.g. instead of dog
it returns (d,o,g,)
Running the same SQL statement in DBeaver or Toad returns the string without alteration as you would expect.
Any idea on how to return the string without brackets and comma please? (I can still use the data but it looks weird in a chart and makes the chart and other outputs harder to read).
I have already searched Google and stackoverflow for similar issues (link1,link2,link3) but non of these solutions have worked for me. I need to find a way to loop through all columns to find those with dtype string and then join all tuples in those.
Here is a screenshot of how the data is being returned:
Here is a screenshot of a pivot (showing the data can be used without using brackets and commas):
Here is a screenshot of a chart (again showing string with brackets and commas):
The code I'm using for the connection is as follows (no issues with this as far as I can see?):
import jpype, jaydebeapi
import pandas as pd
jpype.startJVM(f"-Djava.class.path=D:\jdbc\ojdbc8.jar")
url = "jdbc:oracle:thin:@(DESCRIPTION=(CONNECT_TIMEOUT=1000)(TRANSPORT_CONNECT_TIMEOUT=1000)(ADDRESS_LIST=(LOAD_BALANCE=ON)(ADDRESS=(PROTOCOL=TCP)(HOST="")(PORT=""))(ADDRESS=(PROTOCOL=TCP)(HOST="")(PORT=""))(ADDRESS=(PROTOCOL=TCP)(HOST="")(PORT="")))(ADDRESS_LIST=(LOAD_BALANCE=ON)(ADDRESS=(PROTOCOL=TCP)(HOST="")(PORT=""))(ADDRESS=(PROTOCOL=TCP)(HOST="")(PORT=""))(ADDRESS=(PROTOCOL=TCP)(HOST="")(PORT="")))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME="")))"
user = ""
password = ""
conn=jaydebeapi.connect("oracle.jdbc.OracleDriver", url, [user, password])
cursor = conn.cursor()
sql_query = ''''''
cursor.execute(sql_query)
data = cursor.fetchall()
columns = [desc[0] for desc in cursor.description]
columns = [''.join(i) for i in columns]
cursor.close()
data = pd.DataFrame(data)
data.columns = columns
pd.options.display.float_format = '{:,.0f}'.format
pivot = pd.pivot_table(data, index=['COUNTRY'], values=['CUSTOMERS'], aggfunc=(sum))
pivot.plot(kind='bar')
conn.close()
jpype.shutdownJVM()
Here is what I have tried so far:
for col in data.columns:
for ix in data.index:
data[ix][col] = ''.join(data[ix][col])
This returns a
Keyerror:0
data.select_dtypes(include=['object']).agg(' '.join)
#and
data.select_dtypes(include=['object']).apply(lambda x: ''.join(x))
Both of these return a
TypeError: sequence item 0: expected str instance, java.lang.String found
I have of course looked up these error messages but can't tell what I'm doing wrong.
Would really appreciate some help!
Thanks!!!
UPDATE:
I just noticed that when I turn dtype object
into category
(which I did so I could order the column in a pivot table) it shows the string without brackets and commas. So then I tried to convert all object
columns into category
using this code per this post link4:
data = pd.concat([
data.select_dtypes([], ['object']),
data.select_dtypes(['object']).apply(pd.Series.astype, dtype='category')
], axis=1).reindex(data.columns, axis=1)
However, now I am getting the following error:
ValueError: setting an array element with a sequence
If I do each column one by one it seems to be working though :-S