I have approximately 800+ word documents from which I am attempting to generate a master Table containing all the fields (not an actual word field) in the document.
An example of such documents are shown below:
From this the desired table would be something like this:
Once I have this I can tidy it up in Power query and pivot.
Most importantly for my use case here, I need to know all the variations of Header Names which can occur. That being said, Knowing how to extract the complete table from word docs in batch to excel would be helpful in the future.
SOLUTION --> Update 3 (See deleted for previous updates): This very messy code starts to iterate over a selected folder location and generates the data but looses some of the formattings of the previous update(Minor issue) This is okay as It solves the major issue.
Sub CopyTables()
Dim oWord As Word.Application
Dim WordNotOpen As Boolean
Dim oDoc As Word.Document
Dim oTbl As Word.Table
Dim fd As Office.FileDialog
Dim FilePath As String
Dim wbk As Workbook
Dim wsh As Worksheet
Dim diaFolder As FileDialog
Dim selected As Boolean
Dim strFile As String
Dim pdfPath As String
Dim oFSO As Object
Dim oFolder As Object
Dim oFile As Object
Dim i As Integer
Dim n As Long
'Create New Workbook
Set wbk = Workbooks.Add(Template:=xlWBATWorksheet)
' Get Folder location from User
Set diaFolder = Application.FileDialog(msoFileDialogFolderPicker)
With diaFolder
diaFolder.AllowMultiSelect = False
selected = diaFolder.Show
If selected Then
FilePath = diaFolder.SelectedItems(1)
Debug.Print FilePath
Set diaFolder = Nothing
Else
Beep
End If
End With
On Error Resume Next
Application.ScreenUpdating = False
Set oFSO = CreateObject("Scripting.FileSystemObject")
Set oFolder = oFSO.GetFolder(FilePath)
'Get the file Names
For Each oFile In oFolder.Files
Debug.Print oFolder.Path
Debug.Print oFile.Name
Debug.Print oFolder.Path & "\" & oFile.Name
FilePath = oFolder.Path & "\" & oFile.Name
'------------------------------------------------------------------
'Get or start Word
Set oWord = GetObject(Class:="Word.Application")
If Err Then
Set oWord = New Word.Application
WordNotOpen = True
End If
'On Error GoTo Err_Handler
'-------------------------------------------------------------------
' Open document
Set oDoc = oWord.Documents.Open(FilePath)
' Loop through the tables
'Set wsh = wbk.Worksheets.Add(After:=wbk.Worksheets(wbk.Worksheets.Count))
For Each oTbl In oDoc.Tables
' Create new sheet
'Set wsh = wbk.Worksheets.Add(After:=wbk.Worksheets(wbk.Worksheets.Count))
'-------------------------------------------------------------------
'NEED ASSISTANCE HERE APPEND TO BELOW CELL WITH BLANK SPACE
' Copy/paste the table
oTbl.Range.Copy
Sheets("Sheet1").Range("B" & Rows.Count).End(xlUp).Offset(1, 0).PasteSpecial xlPasteValues
Next oTbl
'------------------------------------------------------------------
' Delete the first sheet
'Application.DisplayAlerts = False
'wbk.Worksheets(1).Delete
'Application.DisplayAlerts = True
'------------------------------------------------------------------
Exit_Handler:
On Error Resume Next
oDoc.Close SaveChanges:=False
If WordNotOpen Then
oWord.Quit
End If
'------------------------------------------------------------------
'Release object references
Set oTbl = Nothing
Set oDoc = Nothing
Set oWord = Nothing
Application.ScreenUpdating = True
'------------------------------------------------------------------
'Err_Handler:
'MsgBox "Word caused a problem. " & Err.Description, vbCritical, "Error: " & Err.Number
'Resume Exit_Handler
Next oFile
End Sub
In this code, each line of the document is captured in its own cell. This is okay and may actually be useful.
Will update here with any progress.
What im really not sure how is how to ensure it appends to the previous table.
The end product of this after running 800+ documents will be ridiculous but as I say I can clean up in PQ.