
Is it possible to use pandas to insert and read numpy arrays in fields in a sqlite database?

I work with pandas dataframes and use built-in pandas functionality such as DataFrame.to_sql(). This works fine with text and numbers, but I would like to store a numpy array in each field.

I tried to do this using the method described in the answer to "Python insert numpy array into sqlite3 database" (https://stackoverflow.com/a/18622264/5321138), which explains very well how to store numpy arrays in sqlite using sqlite3 directly. I would like to keep using pandas, however. I tried the simplest approach I could think of:

import numpy as np
import pandas as pd
import sqlite3
import io

# create 3 variables of different types
value_1 = np.linspace(1,4,6)
value_2 = 42
value_3 = 'word'

print('Types of variables:')
print(type(value_1))
print(type(value_2))
print(type(value_3))

# put them in a one-row pandas dataframe
v_dict = {'v1': [value_1], 'v2': [value_2], 'v3': [value_3]}
df = pd.DataFrame(data=v_dict)

# print the types of the dataframe
print('Types of dataframe')
print(df.dtypes)

print('Types of elements of dataframe')
print(type(df['v1'].values[0]))
print(type(df['v2'].values[0]))
print(type(df['v3'].values[0]))

# make adapter and converter for numpy arrays that work with sqlite
# https://stackoverflow.com/questions/18621513/python-insert-numpy-array-into-sqlite3-database

def adapt_array(arr):
    """
    http://stackoverflow.com/a/31312102/190597 (SoulNibbler)
    """
    # serialize the array with np.save and hand sqlite the raw bytes
    out = io.BytesIO()
    np.save(out, arr)
    out.seek(0)
    return sqlite3.Binary(out.read())

def convert_array(blob):
    # deserialize the bytes written by adapt_array back into an ndarray
    out = io.BytesIO(blob)
    out.seek(0)
    return np.load(out)

# Converts np.ndarray to BLOB when inserting
sqlite3.register_adapter(np.ndarray, adapt_array)

# Converts BLOBs in columns declared as "array" back to np.ndarray when selecting
sqlite3.register_converter("array", convert_array)

conn = sqlite3.connect('sqlite_file.sqlite', detect_types=sqlite3.PARSE_DECLTYPES)
df.to_sql('tablen', conn, if_exists='append', index=False)

out = pd.read_sql_query('SELECT * FROM tablen', con=conn)

print('Types of elements of dataframe from sqlite')
print(type(out['v1'].values[0]))
print(type(out['v2'].values[0]))
print(type(out['v3'].values[0]))

However, the adapter and converter I registered with sqlite3 are apparently not picked up by pandas: the type of v1 comes back as "bytes" rather than "numpy.ndarray".

Is there an elegant way to keep using pandas with a sqlite database and have numpy arrays in fields? Or should I write dedicated methods to convert my pandas dataframes containing numpy arrays to sqlite and back using the sqlite3 module?

JCP

1 Answer


I think you need to pass the sqlite3.PARSE_DECLTYPES option (see this comment):

conn = sqlite3.connect('sqlite_file.sqlite', detect_types=sqlite3.PARSE_DECLTYPES)
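
Note, though, that sqlite3 only applies a registered converter when the column's declared type matches the converter's name, so PARSE_DECLTYPES by itself only helps if the column was actually declared as array. When pandas creates the table over a plain sqlite3 connection, it declares object columns as TEXT, which you can check directly (a quick sketch using the table name from your example):

cur = conn.execute("SELECT sql FROM sqlite_master WHERE name = 'tablen'")
print(cur.fetchone()[0])  # v1 is declared TEXT, so the "array" converter never fires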

You can also apply the transformation after loading the dataframe:

out['v1'] = out['v1'].apply(convert_array)
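
If you want the round trip to work end to end, one option (a sketch following the linked answer, not something pandas does for you; it assumes a fresh database file, since a table previously created by pandas keeps its TEXT declaration) is to create the table yourself with the column declared as array, so the converter fires on SELECT:

conn = sqlite3.connect('sqlite_file.sqlite', detect_types=sqlite3.PARSE_DECLTYPES)
# declare v1 as "array" so the registered converter is applied on read
conn.execute('CREATE TABLE IF NOT EXISTS tablen (v1 array, v2 integer, v3 text)')
df.to_sql('tablen', conn, if_exists='append', index=False)

out = pd.read_sql_query('SELECT * FROM tablen', con=conn)
print(type(out['v1'].values[0]))  # should now be numpy.ndarray

Since if_exists='append' leaves the existing table (and its declared types) untouched, the INSERT still goes through the np.ndarray adapter and the SELECT now goes through the "array" converter.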
warpri81
  • Thanks, your second option works. The first does not help; you are right that it should be there, but it does not solve the problem by itself. It seems the adapter and converter are both simply not used by the pandas commands. – JCP May 29 '18 at 07:32