0

I am trying to parse some html code in access VBA. The html is in a record in a query, not in a document, and there are thousands of records.

<div  ><p  class="note spec-note"><div><p  >Lorem ipsum dolor sit amet, 
    consectetur adipiscing elit. Praesent feugiat gravida tincidunt. Cras 
    volutpat feugiat nisi eget vulputate. In tincidunt metus lorem, nec iaculis 
    mi semper ut.<p></div><div ><table><tr><th colspan="2"><span >General 
    Information</span></th></tr><tr><td >Brand Name:</td><td>Lorem<a  
    disabled="disabled" target="_blank"></a></td></tr><tr><td >Manufacturer:
    </td><td>Lorem<a  disabled="disabled" target="_blank"></a></td></tr><tr><td 
    >Manufacturer Part Number:</td><td>000-000-000<a  disabled="disabled" 
    target="_blank"></a></td></tr><tr><td >Marketing Information:</td><td>Lorem 
    ipsum dolor sit amet, consectetur adipiscing elit. Praesent feugiat gravida tincidunt. Cras volutpat feugiat nisi eget vulputate. In tincidunt metus lorem, nec iaculis mi semper ut. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Duis ultricies iaculis facilisis. Aliquam sed nulla varius felis consequat sodales. Ut pellentesque libero ex, non mollis lorem luctus bibendum. Mauris odio sapien, consequat a ultrices in, pellentesque vel enim. Ut laoreet commodo tortor convallis gravida.<a  disabled="disabled" target="_blank"></a></td></tr><tr><td >Package Type:</td><td>Bulk<a  disabled="disabled" target="_blank"></a></td></tr><tr><td >Product Name:</td><td>Lorem<a  disabled="disabled" target="_blank"></a></td></tr><tr><td >Product Type:</td><td>Ipsum<a  disabled="disabled" target="_blank"></a></td></tr></table><table><tr><th colspan="2"><span >Miscellaneous</span></th></tr><tr><td >Additional Information:</td><td><ul><li>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</li><li>Lorem ipsum dolor sit amet, consectetur adipiscing elit!</li><li>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</li><li>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</li><li>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</li><li>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</li><li>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</li></ul><a  disabled="disabled" target="_blank"></a></td></tr><tr><td >Certifications &amp; Standards:</td><td><ul><li>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</li></ul><a  disabled="disabled" target="_blank"></a></td></tr></table><table><tr><th colspan="2"><span >Physical Characteristics</span></th></tr><tr><td >Color:</td><td>Gray<a  disabled="disabled" target="_blank"></a></td></tr></table><table><tr><th colspan="2"><span >Technical Information</span></th></tr><tr><td >Ipsum</td><td>250 ft<a  disabled="disabled" target="_blank"></a></td></tr><tr><td >Cable Type:</td><td>Category 5e<a  disabled="disabled" target="_blank"></a></td></tr><tr><td >Conductor:</td><td>Copper<a  disabled="disabled" target="_blank"></a></td></tr></table><table><tr><th colspan="2"><span >Warranty</span></th></tr><tr><td >Limited Warranty:</td><td>Lifetime<a  disabled="disabled" target="_blank"></a></td></tr></table></div></div>

Here's where I am at with my VBA code so far, I am completely new to VBA.

Sub Extract_TD_text()

Dim db As dao.Database
Dim rs As dao.Recordset
Dim doc As HTMLDocument
Dim sku As String
Dim tbl As HTMLTable
Dim tds As IHTMLElementCollection
Dim td As HTMLTableCell
Dim r As Long


Set db = CurrentDb
Set rs = db.OpenRecordset("Query2")
Set doc = rs.Fields("Desc")
Set tds = doc.getElementsByTagName("td")

If Not (rs.EOF And rs.BOF) Then
    rs.MoveFirst
    Do Until rs.EOF = True
    r = 0
        For Each td In tds
         Debug.Print td
         r = r + 1
         Next
    Loop
End If

'ExitSub:
'    Set rs = Nothing
'    '..and set it to nothing
'    Exit Sub
'ErrorHandler:
'    Resume ExitSub
End Sub

I need to take a specific row, whose first column contains "Marketing Information" and the second is the actual info, out of each html table in the description field. Then place its inner text into the meta field. All the meta fields have to match with their respective sku fields. Also, it doesn't have to be VBA anything will help, but it has to be in access. I know it's a lot, please any help would be awesome.

Thank You!

Andre
  • 26,751
  • 7
  • 36
  • 80
Pang
  • 15
  • 5
  • This may help: https://stackoverflow.com/questions/11305/how-to-parse-xml-using-vba – Andre Jun 16 '17 at 08:26
  • I have actually seen something like this before, but it didn't workout for me. It's because the html is in a field in a query. – Pang Jun 19 '17 at 21:26
  • That doesn't matter, see the second answer. `objXML.loadXML(strXML)` loads a string, not a file. – Andre Jun 19 '17 at 22:44

0 Answers0