0

My goal is to extract all email addresses from the Word.ActiveDocument and put them into one single cell in the Excel Sheet.

The code is run from Excel VBA editor. It needs to search for email addresses, extract them from the document and fill the Excel cell Activesheet.Range("C31"). Only one cell is available, no matter how many email addresses have been found.

The addresses found need to be delimited using ", " the coma and the space.

I'm trying to do this by finding @ in the document and then building up the range forward and backwards to have all the email address in the range variable. Building the address to the right was quite easy using rng.MoveEndUntil Cset:="," because in my document there is always a coma after the email address.

But how to get the missing left side of the email address into the range variable?? I've used rng.MoveStart Unit:=wdWord, Count:=-1 but what if the email will be romek.zjelonek@wp.com or grawer.best@yahoo.com It will not work.

This is what I have now.

Sub FindEmail035()         '[0-9;A-z;,._-]{1;}\@[0-9;A-z;._-]{1;}
                           '[0-9;A-z;,._-]{1;}\@[0-9;A-z;._-]{1;}
Dim WordApp As Word.Application
Dim WordDoc As Word.Document
Dim ExcelApp As Excel.Application

Dim rng As Word.Range
Dim emailAdr As String
Dim ws As Worksheet

Set WordApp = GetObject(, "Word.Application")
Set ExcelApp = GetObject(, "Excel.Application")
Set WordDoc = WordApp.ActiveDocument
Set rng = WordApp.ActiveDocument.Content
Set ws = ExcelApp.ActiveSheet

ExcelApp.Application.Visible = True

    With rng.Find
        .Text = "@"
        .Wrap = wdFindAsk
        .Forward = True
        .MatchWildcards = False
        .Execute

        Debug.Print rng.Text
        If .Found = True Then
            'rng.Expand (wdWord)
            'Debug.Print rng.Text
            rng.MoveStart Unit:=wdWord, Count:=-1
            Debug.Print rng.Text
            rng.MoveEndUntil Cset:=","
            'rng.MoveEnd Unit:=wdWord, Count:=1
            'rng.MoveEndUntil Cset:=" ", Count:=wdBackward
        End If
   End With     'how to create loop that will extract all the email addresses in the document??
   ws.Range("C31").Value = rng

End Sub

What loop should I use to get the number of mails present in the document and later build up the ranges with email addresses inside?

This is the place in the document where the mail addresses reside.

enter image description here

BigBen
  • 46,229
  • 7
  • 24
  • 40
michal roesler
  • 479
  • 2
  • 9
  • 26

2 Answers2

2

You're on the right track. The easiest thing here is to move the start of the range with .MoveStartUntil Cset:=" " Count:=wdBackward so that you move back through the range until you hit the space before the email address. That is of course assuming consistent formatting and no arbitrary spaces.

I would also just search through the ActiveDocument.Content and then Set rng every time .Found = True because you don't want it overriding your range (which it does when searching a range). Or Dim a new range srchRng or something and then set that to the found results.

 With rng.Find
        .Text = "@"
        .Wrap = wdFindAsk
        .Forward = True
        .MatchWildcards = False
        .Execute

        Debug.Print rng.Text
        If .Found = True Then
            rng.MoveStartUntil Cset:=" ", Count:=wdBackward
            rng.MoveEndUntil Cset:=","
        End If
jclasley
  • 668
  • 5
  • 14
  • Accepted. Yes we have consistent formatting which we can rely on. Space in front of email and a coma right after it. – michal roesler Feb 10 '20 at 13:49
  • Can You please give me a hint about what loop should I use to iterate through all ```@``` found in the document. – michal roesler Feb 10 '20 at 13:59
  • 1
    Edited my answer to include code. If you are talking about how to manipulate/store all the email addresses, there are many ways you can do it. You can manipulate them in-line, you can add them to an array and loop through them later, or you can concatenate a string that contains all of the email addresses. It all depends what your'e doing with them later. Probably the most universally accepted/safe way would be to add them to an array, because if your end-result or process ever changes, you can change the manipulation you do on the array much more easily than rewriting a loop. – jclasley Feb 10 '20 at 14:13
  • I still don't understand when (with which line) the code you provided starts to look for the second (or third) email address. I chose "the concatenate a string that contains all of the email addresses" method, because in result I get one string, which then I can paste into my destination cell. – michal roesler Feb 10 '20 at 14:27
  • 1
    It's a find, so it will keep going with the find until there are no more `@` left. However, I suggest you change `.Wrap = wdFindAsk` to `.Wrap = wdFindContinue` so that it will go through the whole document. On the inside of your `If` statement, add in the string you're concatenating. – jclasley Feb 10 '20 at 14:56
  • No @jclasley. I'm debugging using F8 and Immediate window and I can see, this find method is just finishing search after it finds and constructs a first whole email address. But I'll write another question and explain what I mean. – michal roesler Feb 10 '20 at 15:05
0

Assuming the email addresses are plain text, you can use Word VBA code like:

Sub Demo()
Dim StrOut As String
With ActiveDocument.Range
  With .Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .Text = "<[0-9A-ÿ.\-]{1,}\@[0-9A-ÿ\-.]{1,}([^13 -/\:-\@\\-`\{-¿])"
    .Replacement.Text = ""
    .Forward = True
    .Format = False
    .Wrap = wdFindStop
    .MatchWildcards = True
    .Execute
  End With
  Do While .Find.Found
    StrOut = StrOut & Trim(.Text) & " "
    .Collapse wdCollapseEnd
    .Find.Execute
  Loop
End With
StrOut = Replace(Trim(StrOut), " ", "; ")
MsgBox StrOut
End Sub
macropod
  • 12,757
  • 2
  • 9
  • 21