The correct way to fix this issue is to fix the XML at its source. If, however for any reason this is not possible then you can probably fix the xml via a string splitter function and basic string manipulation. This method does assume a relatively simple xml and probably won’t work well for large or complex xml strings.
First you will have to create a string splitter function, there are plenty of example here on SO but I have included an example below for completeness:
CREATE FUNCTION [dbo].[SplitString]
(
@string NVARCHAR(MAX),
@delimiter CHAR(1)
)
RETURNS @output TABLE(splitdata NVARCHAR(MAX)
)
BEGIN
DECLARE @start INT, @end INT
SELECT @start = 1, @end = CHARINDEX(@delimiter, @string)
WHILE @start < LEN(@string) + 1 BEGIN
IF @end = 0
SET @end = LEN(@string) + 1
INSERT INTO @output (splitdata)
VALUES(SUBSTRING(@string, @start, @end - @start))
SET @start = @end + 1
SET @end = CHARINDEX(@delimiter, @string, @start)
END
RETURN
END
Next split the xml string into multiple rows for each occurrence of the assignment operator ‘=’. Then find the occurrence of the values assigned to attributes using pattern matching functions and replace values with quoted values. The final step in the query joins the split xml string back into a single xml.
DECLARE @malformedXmlString NVARCHAR(MAX) = '<test A=B width = 1000 height= 800 priority =high name="fred" />' --'<test A=BCD>DATA<\test>'
DECLARE @xmlSplit TABLE
(
ID INT IDENTITY
,splitdata NVARCHAR(MAX)
)
INSERT INTO @xmlSplit
(
splitdata
)
SELECT LTRIM(RTRIM(splitdata)) AS splitdata
FROM [dbo].[SplitString](@malformedXmlString, '=')
UPDATE @xmlSplit
SET splitdata = UpdatedXml.splitdata
FROM @xmlSplit OrginalXml
INNER JOIN (
SELECT ID
-- Use the PATINDEX function to determine the position in the string where the attribute values end. Replace value with quoted version of value.
,REPLACE(splitdata
,LTRIM(RTRIM(LEFT(splitdata, PATINDEX('%[ />]%', splitdata) -1)))
,'"' + LTRIM(RTRIM(LEFT(splitdata, PATINDEX('%[ />]%', splitdata) -1))) + '"') AS splitdata
FROM @xmlSplit
WHERE splitdata LIKE '[a-zA-Z0-9]%[ />]%' -- Only return occurrences of string which start with an alpha numeric character and ends with a space, ‘/’ or ‘>’ character. This should be your value of the attribute we split the string on.
) UpdatedXml ON OrginalXml.ID = UpdatedXml.ID
DECLARE @xmlString NVARCHAR(MAX);
SELECT @xmlString = COALESCE(@xmlString + '=', '') + CONVERT(NVARCHAR(MAX), splitdata)
FROM @xmlSplit
SELECT CAST(@xmlString AS XML)