3

I'm working on an API right now and I want to store usage logs. My API has a few active versions right now and in logs I want to be able to sort and search by versions.

Right now versions are saved as strings (varchars) in a vX.X.XX format (starting from v0.0.16) and therefore can't be searched by where api_version > "vX.X.XX" query. Therefore it's not the best solution for my case and I want to switch to something numeric.

Versions as varchars can be sorted by Postgresql, but it's pretty slow.

In python version is created as a tuple (e.g. (1, 0, 0)) and therefore can be easily sorted.

Are there any best ways to store versions in PostgreSQL? What would be the best type to do that and that would have a decent speed performance in sort and search?

Aleks Kuznetsov
  • 457
  • 1
  • 7
  • 23

6 Answers6

5

I think, a simple solution is to try array as the data type for your version field. In your case it could be int[].

An example of a table with version:

create table tbl (
    id serial not null primary key,
    "version" int[] null
);

After that you can filter, order or group by version as you'd do with an array.

select * from tbl where "version" > array[2, 3, 10] order by "version";
Fomalhaut
  • 8,590
  • 8
  • 51
  • 95
4

The easiest way to store version numbers in a database is to just keep separate columns for the major, minor, etc. components, e.g.

major | middle | minor
1     | 0      | 0

This would make it very easy to search for a given version's logs, e.g.

SELECT *
FROM logs
WHERE
    major = 1 AND
    middle = 0 AND
    minor = 0;

Note that this approach uses a query which is sargable with regard to the version numbers. That is, an index on the version columns can potentially be used to speed up the above query.

The problem with storing a version string is that it is much harder to search for individual version components. And also it is unlikely that any index could be used most of the time. You may still store the version string, just don't search using it.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
4

Not sure if Varchar is the best solution, if you could change field type to numeric and re-save existing number in indexed field, e.g. vX.Y.ZZ as (x*10000 + y*1000 + z*10)

Another way is casting your string to numerics and then sorting - as described here

Pogromche
  • 51
  • 5
  • 1
    Thank you for the idea, but it might become an issue in a few cases. E.g. v0.0.16 = 160 or v1.10.1 = 20010. But thank you again! Casting to sort is not the optimal solution, I think. – Aleks Kuznetsov Dec 09 '19 at 11:12
  • 1
    Alex, as a solution, to avoid issues, you can increase multipliers, like x*100000 + y*1000 + z – Pogromche Dec 09 '19 at 11:25
2

I haven't found an included type that would work easily for my case, so I've decided to create my own type.

create type version as (
   major int,
   middle int,
   minor int
)

Usage is pretty simple:

INSERT INTO table (version) VALUES (ROW(X,Y,Z));

SELECT * FROM table WHERE version > ROW(X,Y,Z) ORDER BY version;

In this case I can easily store already created python tuples, which I use for versions.

Aleks Kuznetsov
  • 457
  • 1
  • 7
  • 23
2

The closest equivalent to tuples in Postgres are arrays. I'm pretty sure this does what you want:

  • equality comparisons
  • inequalities
  • ordering

Here is an example:

with t as (
      select *
      from (values (array[1, 2, 3]),
                   (array[10, 15]),
                   (array[1, 2, 3, 4, 5]),
                   (array[1, 2, 4, 3])
           ) v(version)
     )
select t1.version, t2.version, t1.version < t2.version, t1.version = t2.version
from t t1 cross join
     t t2
order by t1.version desc, t2.version asc;

I should add: You can readily convert these to strings using array_to_string(version, '.').

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Well you actually can make the comparison like where "api_version > 'vX.X.XX" - well sort-of. The function regexp_split_to_array can be used to convert the version numbers to an integer array (need to lose the "v" ). Postgres can the compare integer arrays using the regular comparison operators.

-- fails to sort version properly and gets version incorrect 
with versions (name, version) as 
   ( values ('first', 'v0.0.1')
          , ('second', 'v1.0.1')
          , ('third', 'v1.2.1')
          , ('forth','v1.10.1')
   )
select name, substring(version,2) as version 
     , version >  'v1.3.5' "> v1.3.5"
  from versions 
 order by 2;

-- properly sorts version and properly identifies the version order
with versions (name, version) as    
   ( values ('first', 'v0.0.1')
          , ('second', 'v1.0.1')
          , ('third', 'v1.2.1')
          , ('forth','v1.10.1')
   )
select name, regexp_split_to_array(ver, '(\.)')::int[] as version
     ,  regexp_split_to_array(ver, '(\.)')::int[] >  regexp_split_to_array('1.3.5', '(\.)')::int[]  "> v1.3.5"
  from (select name, substring(version,2) as ver  
          from versions 
       ) v
 order by 2;

Now for actual version comparisons:

with versions (name, version) as 
   ( values ('first', 'v0.0.1')
          , ('second', 'v1.0.1')
          , ('third', 'v1.2.1')
          , ('forth','v1.10.1')
   )
   , target (version)  as
     ( values ('v1.10.0') )
select name, version
  from ( select name, regexp_split_to_array(ver, '(\.)')::int[] as version
           from (select name, substring(version,2) as ver  
                  from versions 
                ) v
       ) v2
where version > (select regexp_split_to_array( (substring(version,2))::text , '(\.)')::int[] from target) ;  

The complexity for above comes from having to deal with the "v" in your version number. Without that this reduces to:

with versions (name, version) as 
   ( values ('first', '0.0.1')
          , ('second', '1.0.1')
          , ('third', '1.2.1')
          , ('forth','1.10.1')
   )
   , target     as
     ( select regexp_split_to_array( '1.10.0', '(\.)')::int[] as version)
select v.name, v.version 
  from (select name, regexp_split_to_array(version, '(\.)')::int[] as version from versions) v 
 where v.version > (select t.version from target t) ;
Belayer
  • 13,578
  • 2
  • 11
  • 22