7

Trying to write something to auto convert from some arbitrary DB result (i.e. not always all from table x), to an appropriate PHP typed result.

I extended the PDOStatement class,

class Statement extends PDOStatement {
    protected $pdo;
    protected $transformer;

    protected function __construct(PDO $pdo) {
        $this->pdo = $pdo;
        $this->transformer = $pdo->getTransformer();
    }

    public function fetchAll() {
        $results = parent::fetchAll(PDO::FETCH_ASSOC);

        if ($this->getTransformer()) $results = $this->completeResults($results);

        return $results;
    }

    private function completeResults(array $results = []) {
        if ($results == null || count($results) == 0) return null;
        if ($results[0] == false || !is_array($results[0])) return null;

        $index = 0;
        $typeMap = [];

        foreach ($results[0] as $column => $result) {
            $meta = $this->getColumnMeta($index); // this is very painful
            $typeMap[$column] = $meta['native_type'];
            $index++;
        }

        $transformer = $this->getTransformer();
        foreach ($results as $index => &$result) {
            array_walk($result, function(&$value, $key) use ($typeMap, $transformer) {
                $type = $typeMap[$key];
                $value = $transformer->transformToPhpValue($value, $type);
            });
        }

        return $results;
    }
}

Previously, before I was aware of PDO abstraction, I was using (in my specific case) the standard pg_...() methods. Using pg_field_type($resource, $column);, I could fetch the column type, and it was relatively speedy.

Now, using the new (for me) PDO method. If I comment out the part of my code where I do the transformation, and run 7 consecutive queries:

time to complete: 9.5367431640625E-7 seconds 
time to complete: 1.1920928955078E-6 seconds 
time to complete: 9.5367431640625E-7 seconds 
time to complete: 0 seconds 
time to complete: 9.5367431640625E-7 seconds 
time to complete: 0 seconds 
time to complete: 0 seconds

with it enabled:

time to complete: 0.5777850151062 seconds 
time to complete: 0.49124097824097 seconds 
time to complete: 0.28375911712646 seconds 
time to complete: 0.5946729183197 seconds 
time to complete: 0.42177200317383 seconds 
time to complete: 5.0067901611328E-6 seconds 
time to complete: 0.42121982574463 seconds 

That's /insane/.

I can tell it's fetching the column information one by one by one by looking at my Postgres logs:

LOG:  statement: SELECT TYPNAME FROM PG_TYPE WHERE OID=1114
LOG:  statement: SELECT TYPNAME FROM PG_TYPE WHERE OID=1114
LOG:  statement: SELECT TYPNAME FROM PG_TYPE WHERE OID=25
... like 30 more of these ...
LOG:  statement: SELECT TYPNAME FROM PG_TYPE WHERE OID=25
LOG:  statement: SELECT TYPNAME FROM PG_TYPE WHERE OID=23
LOG:  statement: SELECT TYPNAME FROM PG_TYPE WHERE OID=23
LOG:  statement: SELECT TYPNAME FROM PG_TYPE WHERE OID=23

The queries range in complexity from

SELECT 
    p.modified_at, ... ~ 30 fields ..., r.level AS id_level
FROM table_p AS p
LEFT JOIN table_a AS a ON (p.owner = a.id)
LEFT JOIN table_a0 AS a0 ON (p.reporter = a0.id)
LEFT JOIN table_r AS r ON (p.id = r.id)
WHERE (p.id = 1)

to just SELECT * FROM table_a AS a;

So, I guess the question is: is there a better way to do this? Is there a way that I can do this without impacting the speed of my code? 7 queries is on the low-end of consecutive queries run per request, so it's something that I'd like to deal with.

tadman
  • 208,517
  • 23
  • 234
  • 262
Tyler Sebastian
  • 9,067
  • 6
  • 39
  • 62
  • 1
    I believe as for now, you d better do it using with php as it s still experimental. Will be light year faster. – cpugourou Feb 26 '16 at 00:57
  • If you plan to use only Postgresql for your project I would strongly suggest you use the native pgsql library in place of PDO. PDO is slower and exposes less features. An answer for your type problem could be http://stackoverflow.com/questions/31643297/pg-query-result-contains-strings-instead-of-integer-numeric/31740990#31740990 – greg Feb 26 '16 at 10:09
  • @greg yeah that's what I was using before. Unfortunately it meant writing a lot of the functionality that PDO come packaged with... – Tyler Sebastian Feb 26 '16 at 22:12
  • @greg in addition much of the functionality was to be wrapped in dependency injection classes so explicitly using postgres was not an option. – Tyler Sebastian Feb 26 '16 at 22:18
  • I think you could write your query explicitly so that you won't have to loop and fetch each column datatype. A single SELECT statement using as many columns as you need and wrap each of those column up with a `pg_typeof()` function. – Kamil Gosciminski Feb 28 '16 at 01:28
  • @ConsiderMe interesting proposition, however, ideally, the whole system is DB agnostic - hence why I'm using PDOs. – Tyler Sebastian Feb 28 '16 at 02:35
  • This is why sometimes you have to pay for it with performance issues :-) You could just as well extract select fields from your query (still generic) and prepare such a query. – Kamil Gosciminski Feb 28 '16 at 10:53
  • @Tyler Sebastian I may be way of base here but what about selecting the data, collect the column names in the select statement, then using the databases information_schema get the columns meta data? both DB operations executed via the PDO abstraction classes. – David J Eddy Mar 03 '16 at 14:11
  • A 'hacky' way of doing this would be to create a "materialized view" through a though request/psql function that allows you to fetch the data you want. You'll gain significant speed. The main problem is that it's kind of a 'hack' and that you'll have to refresh this materialized view with every structure migration. But you'll have your performance in production environment. – Unex Mar 03 '16 at 15:53

1 Answers1

1

First of all, PDOStatement::getColumnMeta() is experimental, so be very careful using it (hope you'll setup autotests to check this with any php/pdo version update).

As for speed of retrieving metadata, I've run some tests and it turns out that SELECT TYPNAME FROM PG_TYPE WHERE OID=% queries are run blazing fast:

explain analyze SELECT TYPNAME FROM PG_TYPE WHERE OID=25;
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Index Scan using pg_type_oid_index on pg_type  (cost=0.27..8.29 rows=1 width=64) (actual time=0.051..0.055 rows=1 loops=1)
   Index Cond: (oid = 25::oid)
 Planning time: 0.165 ms
 Execution time: 0.100 ms
(4 rows)

explain analyze SELECT TYPNAME FROM PG_TYPE WHERE OID=1114;
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Index Scan using pg_type_oid_index on pg_type  (cost=0.27..8.29 rows=1 width=64) (actual time=0.083..0.085 rows=1 loops=1)
   Index Cond: (oid = 1114::oid)
 Planning time: 0.192 ms
 Execution time: 0.139 ms
(4 rows)

explain analyze SELECT TYPNAME FROM PG_TYPE WHERE OID=600;
                                                         QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
 Index Scan using pg_type_oid_index on pg_type  (cost=0.27..8.29 rows=1 width=64) (actual time=0.063..0.064 rows=1 loops=1)
   Index Cond: (oid = 600::oid)
 Planning time: 0.261 ms
 Execution time: 0.125 ms
(4 rows)

This is about 0.0001 seconds for PG to select that data, even adding up 30 of these will not sum in 0.5 second or something like that.

I'd recommend you run explain analyze on your server for pg_type queries and see what are the timings there.

Bet you are not using persistent connection to a DB, and that adds up a whole bunch of time to your metadata calls.

Evgeniy Chekan
  • 2,615
  • 1
  • 15
  • 23