0

I have a table containing chat messages that each contain text + attachments. These values are combined into an object and then serialized and stored as JSON text in the table. Each attachment consists of two strings: name and oid, the latter of which is a BASE64 encoded file path.

The original file paths can contain accented characters such as Privé.

Example message data, before and after BASE64 encoding:

text attachments (not BASE64 encoded) attachments (BASE64 encoded)
<p>Docs</p> [ {"name": "doc1.pdf", "path": "Privé\doc1.pdf"} ] [ {"name": "doc1.pdf", "oid": "UHJpdsOpXGRvYzEucGRm"} ]

I now want to decode all this in SQL. I got quite far, first using OPENJSON() to convert the stored JSON to a dataset that can be used for JOIN and SELECT, and then using some XML trickery to decode the BASE64 file paths.

Unfortunately I can't get the accented characters right: Privé becomes Privé, whatever I do (so far).

Sample SQL:

DECLARE     @messagesTable TABLE([Id] UniqueIdentifier, [Content] NVARCHAR(MAX))

INSERT INTO @messagesTable VALUES(
   'EF69067D-428B-8DA4-4FD8-0004CF6D28C2',
   N'{"text":"<p>Docs</p>","attachments":[{"name":"doc1.pdf","oid":"UHJpdsOpXGRvYzEucGRm"},{"name":"doc2.pdf","oid":"QWxnZW1lZW5cZG9jMi5wZGY="}]}'
)

SELECT      [M].[Id], [J].[name], [J].[oid],
            CONVERT(VARCHAR(MAX), CAST('' AS XML).value('xs:base64Binary(sql:column("[J].[oid]"))', 'VARBINARY(MAX)')) AS decodedOid
FROM        @messagesTable AS [M]
CROSS APPLY OPENJSON([M].[Content], '$.attachments') WITH ([name] NVARCHAR(MAX), [oid] NVARCHAR(MAX)) AS [J]

Result:

Id name oid decodedOid
EF69067D-428B-8DA4-4FD8-0004CF6D28C2 doc1.pdf UHJpdsOpXGRvYzEucGRm Privé\doc1.pdf
EF69067D-428B-8DA4-4FD8-0004CF6D28C2 doc2.pdf QWxnZW1lZW5cZG9jMi5wZGY= Algemeen\doc2.pdf

Sample DB Fiddle - it requires using a regular table instead of a table variable, otherwise the same:
http://sqlfiddle.com/#!18/413383/1

I tried using CONVERT(NVARCHAR(MAX), ... ) in the SELECT but that made the results become Chinese text or something like that.

I also tried various kinds of COLLATE, both after the CONVERT(...) and inside the WITH(... [oid] NVARCHAR(MAX)) part but it never seemed to change anything.

What change is needed to make this query return Privé\doc1.pdf for the first attachment?

Peter B
  • 22,460
  • 5
  • 32
  • 69
  • You can try to use this answer https://stackoverflow.com/a/50433415/8564731 The original TSQL function is located here https://stackoverflow.com/a/28412587/8564731 – Jeremy Fiel Jan 18 '23 at 14:28
  • Thanks, I tried the function from https://stackoverflow.com/a/68550176/ and it worked. – Peter B Jan 18 '23 at 14:54

0 Answers0