I have a string and I want to remove the e-mail address from it, if there is one.

For example:

This is some text and it continues like this
until sometimes an email adress shows up asd@asd.com

also some more text here and here.

I want this as the result:

This is some text and it continues like this
until sometimes an email adress shows up [email_removed]

also some more text here and here.

cleanFromEmail(string)
{
    newWordString = 
    space := a_space
    Needle = @
    wordArray := StrSplit(string, [" ", "`n"])
    Loop % wordArray.MaxIndex()
    {

        thisWord := wordArray[A_Index]


        IfInString, thisWord, %Needle%
        {
            newWordString = %newWordString%%space%(email_removed)%space%
        }
        else
        {
            newWordString = %newWordString%%space%%thisWord%%space%
            ;msgbox asd
        }
    }

    return newWordString
}

The problem with this is that I end up losing all the line breaks and only get spaces. How can I rebuild the string so that it looks just like it did before the email address was removed?

Harpo

2 Answers

That looks rather complicated. Why not use RegExReplace instead?

string =
(
This is some text and it continues like this
until sometimes an email adress shows up asd@asd.com

also some more text here and here.
)

newWordString := RegExReplace(string, "\S+@\S+(?:\.\S+)+", "[email_removed]")

MsgBox, % newWordString

Feel free to make the pattern as simple or as complicated as you want, depending on your needs, but RegExReplace should do it.
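
As one illustration of a slightly stricter pattern, here is a minimal sketch in the same AutoHotkey v1 syntax. The function name `cleanFromEmailRegex` and the character classes are assumptions chosen for this example, not a complete email grammar. Each domain label must consist of word characters or hyphens, so sentence punctuation directly after an address is left alone, and the repeated `(?:\.[\w-]+)+` group allows subdomains.

```autohotkey
; Hypothetical helper; the pattern is a simplification, not a full email validator.
cleanFromEmailRegex(string)
{
    ; [\w.+-]+       local part: word characters, dots, plus signs, hyphens
    ; @[\w-]+        first domain label
    ; (?:\.[\w-]+)+  one or more further labels, each introduced by a dot
    return RegExReplace(string, "[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "[email_removed]")
}

; Sample input with a trailing period, a subdomain address, and a line break.
MsgBox, % cleanFromEmailRegex("Mail asd@asd.com. Or XXX@mail.yahoo.com`nnext line")
```

With this sample input, the message box should show `Mail [email_removed]. Or [email_removed]` followed by `next line` on its own line, with the period and the line break preserved.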

CertainPerformance
  • Thank you a lot. The problem is that, for some reason, it removes more than just the email: it removes all of "asd@asd.com also", so the line break and the next word are removed as well. – Harpo Feb 20 '19 at 09:46
  • I cannot reproduce the problem; are you sure you copied the code exactly? The trailing `\S+` should *only* match non-whitespace characters. When I test against ``string := "foo bar asd@asd.com`nbaz"``, I get `foo bar [email_removed]\nbaz`, where `\n` in the result is a literal newline character. – CertainPerformance Feb 20 '19 at 09:51
  • Matching emails is more complex than that; see https://emailregex.com/ for instance. Specifically, if you want to match deeper server addresses (XXX@mail.yahoo.com, for instance), you will have to allow repetitions in the middle (\S+\.) part. – Veltzer Doron Feb 20 '19 at 10:24

If for some reason RegExReplace doesn't always work for you, you can try this:

text =
(
This is some text and it continues like this
until sometimes an email adress shows up asd@asd.com.

also some more text here and here.
)

MsgBox, % cleanFromEmail(text)

cleanFromEmail(string){
    lineArray := StrSplit(string, "`n")
    Loop % lineArray.MaxIndex()
    {
        newLine := ""
        newWord := ""
        thisLine := lineArray[A_Index]
        If InStr(thisLine, "@")
        {
            wordArray := StrSplit(thisLine, " ")
            Loop % wordArray.MaxIndex()
            {
                thisWord := wordArray[A_Index]
                {
                    If InStr(thisWord, "@")
                    {
                        end := SubStr(thisWord, 0)
                        If end in ,,,.,;,?,!
                            newWord := "[email_removed]" end ""
                        else
                            newWord := "[email_removed]"
                    }
                    else
                        newWord := thisWord
                }
                newLine .= newWord . " " ; concatenate the outputs by adding a space to each one
            }
            newLine :=  trim(newLine) ; remove the last space from this variable
        }
        else
            newLine := thisLine
        newString .= newLine . "`n"
    }
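    newString := SubStr(newString, 1, -1) ; drop the trailing newline appended by the loop (Trim only strips spaces and tabs by default)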
    return newString
}
user3419297