2

I want to create a database with a matrix with about 3000 columns and 2500 rows on sqlite. However, it has the 2000 column limitation. I downloaded sqlite-amalgamation from https://www.sqlite.org/download.html and changed SQLITE_MAX_COLUMN to 3000 on sqlite3.c file. My O.S. is Ubuntu. I tried to uninstall sqlite3 with

sudo apt-get remove sqlite3
sudo apt-get autoremove sqlite3
sudo apt-get autoremove --purge sqlite3

It seems sqlite3 is uninstalled. But when I type import sqlite3 in anaconda spyder, it seems sqlite3 is still exists. I restarted my computer but nothing changed. However, I tried to compile modified sqlite3. I run

./configure

Then I run

make

and got this error

make: Nothing to be done for 'all'.

And Finally nothing changed and when I use the following code to import the 2500*3000 matrix in my excel file to my database

import sqlite3
import pandas as pd
connection_obj = sqlite3.connect('MyDataBase.db') 
MyTable = pd.read_excel('my_excel.xlsx', sheet_name='sheet1')
MyTable.to_sql('MyTable', connection_obj, index=False)
connection_obj.commit()

I get this error

OperationalError: too many columns on MyTable

What should I do? Thanks.

leila
  • 21
  • 1

1 Answers1

0

Instead of creating a table with a huge number of columns

CREATE TABLE MyTable (
    RowID INTEGER PRIMARY KEY,
    Column1 REAL,
    Column2 REAL,
    ...
    Column3000 REAL
);

Try reorganizing it like this:

CREATE TABLE MyTable (
    RowID INTEGER NOT NULL,
    ColumnID INTEGER NOT NULL,
    Value REAL,
    PRIMARY KEY (Row, ColumnID)
);

That way, you only need 3 columns, and the SQLite table can hold up to 264 rows, well more than enough to handle the 7.5 million cells from your Excel sheet.

Here's a Python generator function that may help with the transformation. (Disclaimer: I have only tested it with the builtin list of list, not with Pandas table objects.)

def row_col_enumerate(matrix):
    for (row, row_data) in enumerate(matrix):
        for (col, value) in enumerate(row_data):
            yield (row, col, value)

Example:

>>> for rcv in row_col_enumerate([[1, 2, 3], [4, 5, 6], [7, 8, 9]]):
...     print(rcv)
...
(0, 0, 1)
(0, 1, 2)
(0, 2, 3)
(1, 0, 4)
(1, 1, 5)
(1, 2, 6)
(2, 0, 7)
(2, 1, 8)
(2, 2, 9)
dan04
  • 87,747
  • 23
  • 163
  • 198