I have a selection of .docx files stored as BLOB data in hexadecimal. I need to retrieve them so I can access the text within.
So far, I have converted the hex to string format with the following:
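Dim st = BLOB DATA ' placeholder for the hex string read from the BLOB
Dim con As String = String.Empty
' Starts at index 2, skipping the first two characters, then reads two hex digits per character
For x = 2 To st.Length - 2 Step 2
    con &= ChrW(CInt("&H" & st.Substring(x, 2)))
Next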
However, if I then save this output as a .docx, the file will not open because it is 'corrupt'. I presume that is also why loading this string into a MemoryStream and calling Novacode.DocX.Load(memoryStream) gives me a similar corruption error.
I have also tried converting the hex to a byte array in two ways, and they give me different results. The first was:
System.Text.Encoding.Default.GetBytes(hex)
The second was:
Public Function HexToByteArray(hex As String) As Byte()
    Dim upperBound As Integer = hex.Length \ 2
    If hex.Length Mod 2 = 0 Then
        upperBound -= 1
    Else
        hex = "0" & hex
    End If
    Dim bytes(upperBound) As Byte
    For i As Integer = 2 To upperBound
        bytes(i) = Convert.ToByte(hex.Substring(i * 2, 2), 16)
    Next
    Return bytes
End Function
I then wrapped each resulting byte array in a MemoryStream and tried to create a DocX object from it, like so:
Dim doc As DocX = DocX.Load(New MemoryStream(bytes))
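For comparison, here is a minimal sketch of a hex-to-byte conversion that keeps the data binary throughout, rather than passing it through a String. It assumes the hex string may carry a leading "0x" prefix (suggested by the loops above starting at index 2, but not confirmed). `HexToBytes` and `HexBlobSketch` are hypothetical names, not part of any library. Since a .docx file is a ZIP archive, the decoded bytes should begin with 50 4B ("PK"), which gives a quick sanity check before handing them to anything else.

```vb
Imports System

Module HexBlobSketch
    ' Hypothetical helper: decodes a hex string into raw bytes.
    ' Assumption: the input may start with "0x"; if so, the prefix is dropped.
    Public Function HexToBytes(hex As String) As Byte()
        If hex.StartsWith("0x", StringComparison.OrdinalIgnoreCase) Then
            hex = hex.Substring(2)
        End If
        ' Pad odd-length input so every byte has two hex digits.
        If hex.Length Mod 2 <> 0 Then
            hex = "0" & hex
        End If
        ' VB array declarations take the upper bound, hence the -1.
        Dim bytes(hex.Length \ 2 - 1) As Byte
        For i As Integer = 0 To bytes.Length - 1
            bytes(i) = Convert.ToByte(hex.Substring(i * 2, 2), 16)
        Next
        Return bytes
    End Function

    Sub Main()
        ' Sample input: the first four bytes of a ZIP local file header.
        Dim bytes = HexToBytes("0x504B0304")
        Console.WriteLine(bytes.Length)
        ' Check for the "PK" signature that a ZIP-based .docx should start with.
        Console.WriteLine(bytes(0) = &H50 AndAlso bytes(1) = &H4B)
    End Sub
End Module
```

If the signature check passes on the real data, the array could be wrapped in a MemoryStream and passed to DocX.Load as in the line above, or written to disk with File.WriteAllBytes to confirm that Word opens it.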