
We are currently using PostgreSQL and now have to store some tables in the database. The tables are never updated once created, but they may be filtered.

The tables are dynamic in nature, as each table may have n columns,

so a table would be:

|------|--------|--------|
| NAME | DATA 1 | DATA 2 |
|------|--------|--------|

another table would be:

|------|--------|--------|--------|--------|--------|
| NAME | DATA 1 | DATA 2 | DATA 3 | DATA 4 | DATA 5 |
|------|--------|--------|--------|--------|--------|

The data is not normalized because normalization hurts here: when dealing with n rows, all rows are read at once.

These are the solutions I have come up with (both sketched below):

  1. Save the table as JSON in a JSON-type column or as hstore key/value pairs.
  2. Save the table as CSV data in a text field.
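
A minimal sketch of what those two options could look like, assuming hypothetical table names (`dataset_json`, `dataset_csv`) and a Postgres version with `jsonb`; hstore would additionally need `create extension hstore`:

create table dataset_json (
  dataset_id integer not null,  -- which logical table this row belongs to
  name text not null,
  data jsonb not null,          -- e.g. {"DATA 1": 42, "DATA 2": "x"}
  primary key (dataset_id, name)
);

create table dataset_csv (
  dataset_id integer primary key,
  csv_data text not null        -- whole table serialized as CSV, header line included
);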

What are the alternative methods to store the above data? Can NoSQL databases handle this data?

  • When you say the tables may be filtered, do you mean you run queries against multiple tables or against one of those tables? What performance requirement do you have? Is a sequential scan OK? What number of tables / columns per table / rows per table are we talking about: thousands, millions? – blurrcat Mar 05 '19 at 07:25
  • The queries will be done on each table, not across multiple tables. The number of columns can be `n` where `n < 12` mostly; rows may be in the thousands, not more than a few hundred thousand. –  Mar 05 '19 at 08:43
  • Is your question about schema-less table capabilities? I mean, do you want to store rows with a different number of columns in the same table, or should each table have a static number of columns? – Mikhail Baksheev Mar 07 '19 at 09:34

3 Answers


I see nothing in your question that would keep you from using plain tables with the appropriate number of data columns. That's the most efficient form of storage by far: smallest storage size, fastest queries.

Tables that are "never updated once created, but may be filtered" are hardly "dynamic". Unless you are withholding essential details that's all there is.

And unless there can be more than several hundred columns. See:

(But you later commented a maximum of 12, which is no problem at all.)
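
For illustration, a plain-table layout for the two example tables might look like the sketch below; the table names and data types are assumptions, and the CREATE TABLE statements can be issued dynamically whenever a new table arrives:

create table table_a (
  name text primary key,
  data1 numeric,
  data2 numeric
);

create table table_b (
  name text primary key,
  data1 numeric,
  data2 numeric,
  data3 numeric,
  data4 numeric,
  data5 numeric
);

-- "filtering" is then an ordinary query:
select * from table_a where data1 > 100;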

Erwin Brandstetter

From what you've described, it sounds like a job for jsonb. Assuming name is unique within a given table, I can imagine something like this:

create table test (
  tableId integer,
  name text,
  data jsonb,
  constraint pk primary key(tableId, name)
);
insert into test values (1, 'movie1', '{"rating": 10, "name": "test"}');
insert into test values (1, 'movie2', '{"rating": 9, "name": "test2"}');
insert into test values (2, 'book1', '{"rank": 100, "name": "test", "price": 10}');
insert into test values (2, 'book2', '{"rank": 10, "name": "test", "price": 12}');

Basically, the idea is to use `tableId` to identify each sub-table and store the rows of all sub-tables in this one database table.

This opens some possibilities:

  • create a separate table to store metadata about each sub-table. For example, the schema of each sub-table could be stored there for application-layer validation.
  • partial index on large/hot sub-tables: `create index test_1_movie_name on test ((data->>'name')) where tableid = 1` (see the query sketch below)
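
For example, filtering one sub-table is then an ordinary query against `data`, and the metadata idea from the first bullet could be sketched like this (the `test_meta` table and its columns are assumptions, not part of the answer's code):

-- filter rows of sub-table 1; the partial index above can serve this query
select name, data
from test
where tableid = 1
  and data->>'name' = 'test';

-- hypothetical metadata table describing each sub-table for application-layer validation
create table test_meta (
  tableId integer primary key,
  columns jsonb                  -- e.g. ["rating", "name"]
);
insert into test_meta values (1, '["rating", "name"]');
insert into test_meta values (2, '["rank", "name", "price"]');
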
blurrcat

Dynamic columns mean that a schema-less option is what you should look for; MongoDB is preferred. If you are storing the data as JSON, Mongo will make manipulating, extracting, and reporting on the data easier.

If you are not familiar with NoSQL: from SQL Server 2016 onwards, JSON storage in a column is supported as varchar(MAX), and SQL Server provides functions to deal with JSON data. Even though indexing is text-based by default for nvarchar, SQL Server supports indexing on computed columns, which helps with looking up elements inside the JSON. Any number of non-clustered indexes on computed columns is allowed, which eases indexing of JSON data. SQL Server 2019 has more support for JSON.
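
A minimal sketch of that computed-column approach in SQL Server 2016+; the table, column, and index names here are assumptions, not from the answer:

-- hypothetical table storing each row's dynamic columns as JSON text
create table dbo.DatasetRows (
  DatasetId int not null,
  Name nvarchar(100) not null,
  Data nvarchar(max) not null    -- JSON document, e.g. {"rating": 10, "name": "test"}
);

-- computed column extracting one JSON property, plus a non-clustered index on it
alter table dbo.DatasetRows
  add Rating as cast(json_value(Data, '$.rating') as int);

create index IX_DatasetRows_Rating on dbo.DatasetRows (Rating);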

  • Postgres has better support for JSON than SQL Server: Postgres has a native json type while SQL Server does it as a VARCHAR, hence indexing suffers. My question is whether there is any other NoSQL DB which can handle the requirement. Thanks for your input. –  Mar 03 '19 at 17:25
  • SQL Server: a computed-column index will help to avoid the text-based indexing. Couchbase is another NoSQL option if you are not going with MongoDB. –  Mar 03 '19 at 20:51
  • How are the tables queried after creation? Why not issue DDL to create them when needed? – Mark R. Mar 04 '19 at 18:21
  • @Deena "Postgres got native json type while SQL Server do it as a VARCHAR, hence indexing sucks" Any specific example? There is nothing wrong with storing JSON as VARCHAR. To be fair, [Oracle uses a similar approach](https://oracle-base.com/articles/12c/indexing-json-data-in-oracle-database-12cr1). – Lukasz Szozda Mar 04 '19 at 21:04
  • @Lukasz Good to know about Oracle JSON support. Both Oracle and SQL Server are non-native, but they do the job well, don't they, provided the index is set. The explain plan will tell the answer. –  Mar 04 '19 at 21:28
  • @LukaszSzozda Without indexes, SQL Server has to perform a full table scan every time you query the data; that is what sucks about varchar types when it comes to JSON. –  Mar 05 '19 at 08:45
  • @Deena Only when you need a part of the column. The point is that data in a column should be atomic (otherwise it violates 1NF). Getting part of a column (a specific node) can be handled by an indexed computed column (still very fast) or a full-text index. I agree with the Oracle/Microsoft approach of not introducing an additional data type like it was done with XML before. – Lukasz Szozda Mar 05 '19 at 20:40