I have a table containing chat messages that each contain text + attachments. These values are combined into an object and then serialized and stored as JSON text in the table. Each attachment consists of two strings: name
and oid
, the latter of which is a BASE64 encoded file path.
The original file paths can contain accented characters such as Privé
.
Example message data, before and after BASE64 encoding:
text | attachments (not BASE64 encoded) | attachments (BASE64 encoded) |
---|---|---|
<p>Docs</p> | [ {"name": "doc1.pdf", "path": "Privé\doc1.pdf"} ] | [ {"name": "doc1.pdf", "oid": "UHJpdsOpXGRvYzEucGRm"} ] |
I now want to decode all this in SQL. I got quite far, first using OPENJSON() to convert the stored JSON to a dataset that can be used for JOIN and SELECT, and then using some XML trickery to decode the BASE64 file paths.
Unfortunately I can't get the accented characters right: Privé
becomes Privé
, whatever I do (so far).
Sample SQL:
DECLARE @messagesTable TABLE([Id] UniqueIdentifier, [Content] NVARCHAR(MAX))
INSERT INTO @messagesTable VALUES(
'EF69067D-428B-8DA4-4FD8-0004CF6D28C2',
N'{"text":"<p>Docs</p>","attachments":[{"name":"doc1.pdf","oid":"UHJpdsOpXGRvYzEucGRm"},{"name":"doc2.pdf","oid":"QWxnZW1lZW5cZG9jMi5wZGY="}]}'
)
SELECT [M].[Id], [J].[name], [J].[oid],
CONVERT(VARCHAR(MAX), CAST('' AS XML).value('xs:base64Binary(sql:column("[J].[oid]"))', 'VARBINARY(MAX)')) AS decodedOid
FROM @messagesTable AS [M]
CROSS APPLY OPENJSON([M].[Content], '$.attachments') WITH ([name] NVARCHAR(MAX), [oid] NVARCHAR(MAX)) AS [J]
Result:
Id | name | oid | decodedOid |
---|---|---|---|
EF69067D-428B-8DA4-4FD8-0004CF6D28C2 | doc1.pdf | UHJpdsOpXGRvYzEucGRm | Privé\doc1.pdf |
EF69067D-428B-8DA4-4FD8-0004CF6D28C2 | doc2.pdf | QWxnZW1lZW5cZG9jMi5wZGY= | Algemeen\doc2.pdf |
Sample DB Fiddle - it requires using a regular table instead of a table variable, otherwise the same:
http://sqlfiddle.com/#!18/413383/1
I tried using CONVERT(NVARCHAR(MAX), ... )
in the SELECT
but that made the results become Chinese text or something like that.
I also tried various kinds of COLLATE, both after the CONVERT(...)
and inside the WITH(... [oid] NVARCHAR(MAX))
part but it never seemed to change anything.
What change is needed to make this query return Privé\doc1.pdf
for the first attachment?