0

I read a question here about converting html into Excel and it works but with unsolved issue. If the html contains Arabic letters, as it not displayed correctly in Excel after the import.

  • The code the I copied to do the import is below
  • I tried to fix the charset issue using the below code but I had runtime error HTML_Content.Charset = "utf-8" (in the below section 'Create HTMLFile Object)
Sub HTML_Table_To_Excel()

    Dim htm As Object
    Dim Tr As Object
    Dim Td As Object
    Dim Tab1 As Object
    Dim file As String

    'Replace the file path with your own
    file = "c:\your_File.html"

    'Determine the next file number available for use by the FileOpen function
    TextFile = FreeFile

    'Open the text file
    Open file For Input As TextFile

    'Create HTMLFile Object
    Set HTML_Content = CreateObject("htmlfile")
    HTML_Content.body.innerHTML = Input(LOF(TextFile), TextFile)

    Column_Num_To_Start = 1
    iRow = 2
    iCol = Column_Num_To_Start
    iTable = 0

    'Loop Through Each Table and Download it to Excel in Proper Format
    For Each Tab1 In HTML_Content.getElementsByTagName("table")
        
        With HTML_Content.getElementsByTagName("table")(iTable)
            
            For Each Tr In .Rows
                
                For Each Td In Tr.Cells
                    Sheets(1).Cells(iRow, iCol).Select
                    Sheets(1).Cells(iRow, iCol) = Td.innerText
                    iCol = iCol + 1
                Next Td
                
                iCol = Column_Num_To_Start
                iRow = iRow + 1
            Next Tr
            
        End With

        iTable = iTable + 1
        iCol = Column_Num_To_Start
        iRow = iRow + 1
        
    Next Tab1

    MsgBox "Process Completed"
End Sub
Community
  • 1
  • 1

1 Answers1

0

charSet is a read-only property. That is likely the source of your error if attempting to set via this property.

I think you might want:

HTML_Content.defaultCharset = "UTF-8"

I would prefer to declare early bound MSHTML.HTMLDocument, by VBE > Tools >References > Microsoft HTML Object Library reference, then something like html.defaultCharset = "UTF-8" where html is your instance of the object.

I really dislike htmlFile as it is so limited and only useful when you absolutely cannot add project references.

Also, please declare all your variables and use Option Explicit at top of modules.

QHarr
  • 83,427
  • 12
  • 54
  • 101
  • Thanks for the TIP, but after trying the above the issue still exist – Amr El Shabini Jan 17 '21 at 09:19
  • Can edit relevant html to your question and detail your system settings? I doubt I will be able to reproduce based on my Excel region settings as is. – QHarr Jan 17 '21 at 09:34
  • https://stackoverflow.com/questions/53123782/unicode-characters-as-a-result-of-xmlhttp-request and https://stackoverflow.com/questions/52821687/unicode-characters-incorrect-when-parsing-json maybe something there? My answer should have solved your error message. – QHarr Jan 17 '21 at 09:39