I have database outputs like the following:

$row = '(one,"pika chu",,"")'

If I pass this string to str_getcsv, it outputs ['one', 'pika chu', '', '']. The third element, despite being absent, has been turned into an empty string. This is a problem, because I need to distinguish empty values (no value at all) from empty strings. The output I would expect is ['one', 'pika chu', null, ''].
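For a minimal reproduction (with the wrapping parentheses stripped by hand, since only the CSV part matters here):

<?php
// Both the unquoted empty field and the quoted empty field come back
// as an empty string: the distinction is lost.
var_dump(str_getcsv('one,"pika chu",,""'));
// gives ["one", "pika chu", "", ""], i.e. four strings, two of them empty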
The inputs I get come from a PostgreSQL database and are represented as composite values. For example, if a table is pokemon_id => int4, name => text, then a query will output strings like '(1, "pika chu")'. A unique constraint on the name field, for instance, will allow the following two records to coexist: (100, '') and (101, null).
When fetched, they are formatted as raw values like:
'98,whatever'
'99,"pika chu"'
'100,""'
'101,'
'102,","'
I need to parse those strings, and this example must produce the following arrays:
['98', 'whatever']
['99', 'pika chu']
['100', '']
['101', null]
['102', ',']
Is there a way to do that in PHP?
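In other words, assuming a hypothetical parse_composite_row() helper (the name is just a placeholder, it does not exist in PHP), the expected contract is:

<?php
// parse_composite_row() stands for the parser I am looking for.
assert(parse_composite_row('98,whatever')   === ['98', 'whatever']);
assert(parse_composite_row('99,"pika chu"') === ['99', 'pika chu']);
assert(parse_composite_row('100,""')        === ['100', '']);
assert(parse_composite_row('101,')          === ['101', null]);
assert(parse_composite_row('102,","')       === ['102', ',']);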
Update 1: @deceze kindly sent me this link stating there are no NULLs in CSV (TL;DR: essentially because there were no nulls in XML; the problem has been tackled since then). How do I parse CSV with NULLs, then?
Update 2: I have had propositions to write a dedicated parser in PHP using the preg_match_* functions. I am a bit reluctant to go that way because of 1) the performance impact compared to str_getcsv, and 2) the fact that preg_match used to segfault when the string passed was over 8 kB (which can happen in a CSV context).
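To make the idea concrete, here is my own sketch of such a preg_match_all-based parser, implementing the parse_composite_row() placeholder from above (not benchmarked, and it assumes single-line values):

<?php
// Sketch of the preg-based approach: an unquoted empty field becomes
// null, a quoted "" becomes an empty string.
// PREG_UNMATCHED_AS_NULL requires PHP >= 7.2.
function parse_composite_row(string $raw): array
{
    preg_match_all(
        '/(?:^|,)(?:"((?:[^"]|"")*)"|([^,]*))/',
        $raw,
        $matches,
        PREG_SET_ORDER | PREG_UNMATCHED_AS_NULL
    );

    $fields = [];
    foreach ($matches as $m) {
        if ($m[1] !== null) {
            // Quoted field: un-double the embedded quotes.
            $fields[] = str_replace('""', '"', $m[1]);
        } else {
            // Unquoted field: empty means "no value", hence null.
            $fields[] = $m[2] === '' ? null : $m[2];
        }
    }
    return $fields;
}

var_dump(parse_composite_row('101,'));   // ['101', null]
var_dump(parse_composite_row('100,""')); // ['100', '']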
Update 3: I looked at the str_getcsv source code to see whether it would be possible to propose a patch adding parsing options, as some other languages offer. I now understand PHP's underlying philosophy better. @daniel-vérité raised the idea of implementing a state machine to parse the CSV strings. Even though the input can have thousands of lines weighing dozens of kilobytes, with embedded CSV structures, it might be the best way.
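A rough sketch of what such a character-by-character state machine could look like (same contract as the parse_composite_row() placeholder above; it ignores embedded newlines and malformed input):

<?php
// State-machine sketch: walk the string once, tracking whether we are
// inside a quoted field. A field that never receives content stays null,
// so '101,' yields ['101', null] while '100,""' yields ['100', ''].
function parse_composite_row_fsm(string $raw): array
{
    $fields  = [];
    $current = null;   // null until the field receives any content
    $quoted  = false;  // true while inside "..."
    $len     = strlen($raw);

    for ($i = 0; $i < $len; $i++) {
        $c = $raw[$i];

        if ($quoted) {
            if ($c === '"') {
                if ($i + 1 < $len && $raw[$i + 1] === '"') {
                    $current .= '"';      // escaped quote ("")
                    $i++;
                } else {
                    $quoted = false;      // closing quote
                }
            } else {
                $current .= $c;
            }
        } elseif ($c === '"') {
            $quoted  = true;
            $current = $current ?? '';    // a quoted field is never null
        } elseif ($c === ',') {
            $fields[] = $current;         // field separator; null stays null
            $current  = null;
        } else {
            $current = ($current ?? '') . $c;
        }
    }
    $fields[] = $current;                 // last field

    return $fields;
}

var_dump(parse_composite_row_fsm('101,'));    // ['101', null]
var_dump(parse_composite_row_fsm('102,","')); // ['102', ',']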
Thank you for your help.