Is it possible to use pandas to insert and read numpy arrays in fields in a sqlite database?
I work with pandas DataFrames and use the built-in functions such as DataFrame.to_sql(). This works fine for text and numbers, but I would like to store a numpy array in each field.
I tried to do this using the method described in the question "Python insert numpy array into sqlite3 database" (https://stackoverflow.com/a/18622264/5321138), which explains very well how to store numpy arrays in sqlite using the sqlite3 module directly. However, I would like to keep using pandas. I tried the simplest approach I could think of:
import numpy as np
import pandas as pd
import sqlite3
import io
# create 3 variables of different type
value_1 = np.linspace(1,4,6)
value_2 = 42
value_3 = 'word'
print('Types of variables:')
print(type(value_1))
print(type(value_2))
print(type(value_3))
# put them in a pandas dataframe
v_dict={'v1': [value_1], 'v2':[value_2], 'v3':[value_3]}
df=pd.DataFrame(data=v_dict)
# print the types of the dataframe
print('Types of dataframe')
print(df.dtypes)
print('Types of elements of dataframe')
print(type(df['v1'].values[0]))
print(type(df['v2'].values[0]))
print(type(df['v3'].values[0]))
# make adapter and converter for numpy arrays that work with sqlite
# https://stackoverflow.com/questions/18621513/python-insert-numpy-array-into-sqlite3-database
def adapt_array(arr):
    """
    http://stackoverflow.com/a/31312102/190597 (SoulNibbler)
    """
    out = io.BytesIO()
    np.save(out, arr)
    out.seek(0)
    return sqlite3.Binary(out.read())

def convert_array(text):
    out = io.BytesIO(text)
    out.seek(0)
    return np.load(out)

# Converts np.ndarray to BLOB when inserting
sqlite3.register_adapter(np.ndarray, adapt_array)
# Converts BLOB back to np.ndarray when selecting
sqlite3.register_converter("array", convert_array)
conn = sqlite3.connect('sqlite_file.sqlite', detect_types=sqlite3.PARSE_DECLTYPES)
df.to_sql('tablen', conn, if_exists='append', index=False)
out=pd.read_sql_query('SELECT * FROM tablen', con=conn)
print('Types of elements of dataframe from sqlite')
print(type(out['v1'].values[0]))
print(type(out['v2'].values[0]))
print(type(out['v3'].values[0]))
However, the converter I register in sqlite3 is apparently not applied when pandas reads the data back: the type of v1 comes back as bytes and not numpy.ndarray. (The insert itself succeeds, so the adapter does seem to be used.)
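As a workaround I can decode the column by hand after reading. Here is a minimal sketch of that fallback, reusing the same adapt_array/convert_array helpers and an in-memory database for brevity:

```python
import io
import sqlite3

import numpy as np
import pandas as pd

def adapt_array(arr):
    # serialize the ndarray with np.save into a BLOB
    out = io.BytesIO()
    np.save(out, arr)
    out.seek(0)
    return sqlite3.Binary(out.read())

def convert_array(blob):
    # deserialize the BLOB back into an ndarray
    return np.load(io.BytesIO(blob))

# the adapter is consulted by sqlite3 when pandas inserts the rows
sqlite3.register_adapter(np.ndarray, adapt_array)

conn = sqlite3.connect(':memory:')
df = pd.DataFrame({'v1': [np.linspace(1, 4, 6)], 'v2': [42]})
df.to_sql('tablen', conn, if_exists='append', index=False)

out = pd.read_sql_query('SELECT * FROM tablen', con=conn)
# v1 comes back as raw bytes, so decode the column manually
out['v1'] = out['v1'].apply(convert_array)
print(type(out['v1'].values[0]))
```

This works, but it means I have to remember which columns hold arrays and decode them after every query.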
Is there an elegant way to keep using pandas with a sqlite database and have numpy arrays in fields, or should I write dedicated methods to convert my pandas DataFrames with numpy arrays to sqlite and back using the sqlite3 module?
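For reference, the sqlite3-only round trip from the linked answer does work for me. A minimal sketch, using an in-memory database and a column declared with type "array" so that PARSE_DECLTYPES triggers the converter:

```python
import io
import sqlite3

import numpy as np

def adapt_array(arr):
    # serialize the ndarray with np.save into a BLOB
    out = io.BytesIO()
    np.save(out, arr)
    out.seek(0)
    return sqlite3.Binary(out.read())

def convert_array(blob):
    # deserialize the BLOB back into an ndarray
    return np.load(io.BytesIO(blob))

sqlite3.register_adapter(np.ndarray, adapt_array)
sqlite3.register_converter("array", convert_array)

# detect_types=PARSE_DECLTYPES makes sqlite3 apply the converter
# to any column declared with the type name "array"
conn = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute('CREATE TABLE test (arr array)')
x = np.arange(12).reshape(3, 4)
conn.execute('INSERT INTO test (arr) VALUES (?)', (x,))
row = conn.execute('SELECT arr FROM test').fetchone()
print(type(row[0]))  # <class 'numpy.ndarray'>
```

So the adapter/converter machinery itself is fine; the question is how to get pandas to play along with it.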