I'm working on a Web project where the client application communicates with the DB via JSON.

The initial implementation was done on SQL Server 2012 (NO JSON support, hence we implemented a stored function that handled the parsing) and now we are moving to 2016 (YES JSON support).

So far, we are reducing processing time by a significant factor (in some cases, over 200 times faster!).

There are some interactions that contain arrays that need to be converted into tables. To achieve that, the OPENJSON function does ALMOST what we need.

In some of these (array-based) cases, records within the arrays have one or more fields that are also OBJECTS (in this particular case, also arrays), for instance:

    [{
        "Formal_Round_Method": "Floor",
        "Public_Round_Method": "Closest",
        "Formal_Precision": "3",
        "Public_Precision": "3",
        "Formal_Significant_Digits": "3",
        "Public_Significant_Digits": "3",
        "General_Comment": [{
            "Timestamp": "2018-07-16 09:19",
            "From": "1",
            "Type": "Routine_Report",
            "Body": "[To + Media + What]: Comment 1",
            "$$hashKey": "object:1848"
        }, {
            "Timestamp": "2018-07-16 09:19",
            "From": "1",
            "Type": "User_Comment",
            "Body": "[]: Comment 2",
            "$$hashKey": "object:1857"
        }, {
            "Timestamp": "2018-07-16 09:19",
            "From": "1",
            "Type": "Routine_Report",
            "Body": "[To + Media + What]: Comment 3",
            "$$hashKey": "object:1862"
        }]
    }, {
        "Formal_Round_Method": "Floor",
        "Public_Round_Method": "Closest",
        "Formal_Precision": "3",
        "Public_Precision": "3",
        "Formal_Significant_Digits": "3",
        "Public_Significant_Digits": "3",
        "General_Comment": []

    }]

Here, General_Comment is also an array.

When running the command:

SELECT *
  FROM OPENJSON(@_l_Table_Data)
  WITH (    Formal_Round_Method                 NVARCHAR(16)    '$.Formal_Round_Method'               ,
            Public_Round_Method                 NVARCHAR(16)    '$.Public_Round_Method'               ,
            Formal_Precision                    INT             '$.Formal_Precision'                  ,
            Public_Precision                    INT             '$.Public_Precision'                  ,
            Formal_Significant_Digits           INT             '$.Formal_Significant_Digits'         ,
            Public_Significant_Digits           INT             '$.Public_Significant_Digits'         ,
            General_Comment                     NVARCHAR(4000)  '$.General_Comment'                   
        ) ;

[@_l_Table_Data is a variable holding the JSON string]

we are getting the column General_Comment = NULL even though there is data in there (at least in the first element of the array).

I guess that I should be using a different syntax for those columns that may contain OBJECTS and not SIMPLE VALUES, but I have no idea what that syntax should be.

halfer
FDavidov
  • you should think about your DB design. it seems like you should have 2 tables. one for comments and one for whatever the parent entity is. – Philipp Sander Jul 16 '18 at 07:23
  • Thank you @PhilippSander, but that direction would require the creation of several tens of new tables (I provided just one example) as well as many `JOIN`s to be added (performance). In the meantime, I found a very simple solution (lost the link to MS's page so I'm adding it as the answer to my original post). Cheers!!!! – FDavidov Jul 16 '18 at 07:30
  • those don't sound like valid arguments to me, but it's your choice. you can handle the performance by using other tools that your DB provides – Philipp Sander Jul 16 '18 at 07:42
  • 1
    @PhilippSander, it is a very long time I realized that people have different strategies, approaches and priorities when designing systems. One of mine is to keep the result as simple as possible without damaging flexibility and maintainability of the result. In this context, keeping the total number of tables as small as possible is one of my preferred techniques. Still, this is very much a matter of taste and hence, what is suitable (and valid) for me does not necessarily have to be suitable (or valid) for others. – FDavidov Jul 16 '18 at 08:19
  • that's fair. As I said it's your choice. I just wanted to point it out – Philipp Sander Jul 16 '18 at 09:58

2 Answers

I found a Microsoft page that actually solves the problem.

Here is how the query should look:

SELECT *
  FROM OPENJSON(@_l_Table_Data)
  WITH (    Formal_Round_Method        NVARCHAR(16)    '$.Formal_Round_Method'               ,
            Public_Round_Method        NVARCHAR(16)    '$.Public_Round_Method'               ,
            Formal_Precision           INT             '$.Formal_Precision'                  ,
            Public_Precision           INT             '$.Public_Precision'                  ,
            Formal_Significant_Digits  INT             '$.Formal_Significant_Digits'         ,
            Public_Significant_Digits  INT             '$.Public_Significant_Digits'         ,
            General_Comment            NVARCHAR(MAX)   '$.General_Comment'   AS JSON                
    ) ;

So, you need to add AS JSON at the end of the column definition and (God knows why) the type MUST be NVARCHAR(MAX).

Very simple indeed!!!
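Once General_Comment is returned AS JSON, the fragment can itself be expanded into rows with a second OPENJSON call. A minimal sketch (the column names are taken from the sample JSON above; the CROSS APPLY pattern is a standard technique, not part of the original answer, and the Timestamp is kept as text to avoid any date-format assumptions):

```sql
SELECT r.Formal_Round_Method,
       c.[Timestamp], c.[From], c.[Type], c.Body
  FROM OPENJSON(@_l_Table_Data)
  WITH ( Formal_Round_Method  NVARCHAR(16)   '$.Formal_Round_Method',
         General_Comment      NVARCHAR(MAX)  '$.General_Comment' AS JSON
       ) AS r
  -- expand the nested array; rows with an empty array produce no comment rows
  CROSS APPLY OPENJSON(r.General_Comment)
  WITH ( [Timestamp]  NVARCHAR(32)   '$.Timestamp',
         [From]       INT            '$.From',
         [Type]       NVARCHAR(32)   '$.Type',
         Body         NVARCHAR(4000) '$.Body'
       ) AS c;
```

Use OUTER APPLY instead of CROSS APPLY if parent rows with an empty General_Comment array should still appear (with NULL comment columns).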

FDavidov

Create Function ParseJson:

Create or Alter FUNCTION [dbo].[ParseJson]  (@JSON NVARCHAR(MAX))
RETURNS @Unwrapped TABLE
  (
  [id] INT IDENTITY, --just used to get a unique reference to each json item
  [level] INT, --the hierarchy level
  [key] NVARCHAR(100), --the key or name of the item
  [Value] NVARCHAR(MAX),--the value, if it is a null, int,binary,numeric or string
  type INT, --0 to 5, the OPENJSON type: null, string, number, boolean, array or object
  SQLDatatype sysname, --whatever the datatype can be parsed to
  parent INT, --the ID of the parent
  [path] NVARCHAR(4000) --the path as used by OpenJSON
  )
AS begin
INSERT INTO @Unwrapped ([level], [key], Value, type, SQLDatatype, parent,
[path])
VALUES
  (0, --the level
   NULL, --the key,
   @json, --the value,
   CASE WHEN Left(ltrim(@json),1)='[' THEN 4 ELSE 5 END, --the type
   'json', --SQLDataType,
   0 , --no parent
   '$' --base path
  );
DECLARE @ii INT = 0,--the level
@Rowcount INT = -1; --the number of rows from the previous iteration
WHILE @Rowcount <> 0 --while we are still finding levels
  BEGIN
    INSERT INTO @Unwrapped ([level], [key], Value, type, SQLDatatype, parent,
    [path])
      SELECT [level] + 1 AS [level], new.[Key] AS [key],
        new.[Value] AS [value], new.[Type] AS [type],
-- SQL Prompt formatting off
/* in order to determine the datatype of a json value, the best approach is to determine
the datatype it can be parsed to. In JSON, an array of objects can contain attributes that aren't
consistent either in their name or value. */
       CASE
        WHEN new.Type = 0 THEN 'bit null'
        WHEN new.[type] IN (1,2)  then COALESCE(
            CASE WHEN TRY_CONVERT(INT,new.[value]) IS NOT NULL THEN 'int' END,
            CASE WHEN TRY_CONVERT(NUMERIC(14,4),new.[value]) IS NOT NULL THEN 'numeric' END,
            CASE WHEN TRY_CONVERT(FLOAT,new.[value]) IS NOT NULL THEN 'float' END,
          CASE WHEN TRY_CONVERT(MONEY,new.[value]) IS NOT NULL THEN 'money' END,
            CASE WHEN TRY_CONVERT(DateTime,new.[value],126) IS NOT NULL THEN 'Datetime2' END,
          CASE WHEN TRY_CONVERT(Datetime,new.[value],127) IS NOT NULL THEN 'Datetime2' END,
          'nvarchar')
       WHEN new.Type = 3 THEN 'bit'
       WHEN new.Type = 5 THEN 'object' ELSE 'array' END AS SQLDatatype,
        old.[id],
        old.[path] + CASE WHEN old.type = 5 THEN '.' + new.[Key]
                       ELSE '[' + new.[Key] COLLATE DATABASE_DEFAULT + ']' END AS path
-- SQL Prompt formatting on
      FROM @Unwrapped old
        CROSS APPLY OpenJson(old.[Value]) new
          WHERE old.[level] = @ii AND old.type IN (4, 5);
    SELECT @Rowcount = @@RowCount;
    SELECT @ii = @ii + 1;
  END;
  return
END

For Usage:

SELECT * FROM dbo.ParseJson(@jsonString);
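A minimal usage sketch (the sample JSON literal is assumed, not from the original answer); each element of the document comes back as one row, with parent and path linking it into the hierarchy:

```sql
DECLARE @json NVARCHAR(MAX) = N'{"a": 1, "b": [10, 20]}';

SELECT [id], [level], [key], [Value], [type], SQLDatatype, parent, [path]
  FROM dbo.ParseJson(@json)
 WHERE [level] > 0;  -- skip the level-0 row holding the whole document
```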