I wanted to create a new directory from Microsoft Word VBA, using the text of a paragraph as the folder name, but Range.Text returned the whole paragraph. It must have included the end-of-line character, so a path name error happened. Using Mid, I stripped the last character from Range.Text. Here's the code:

Sub newDir()
    'Take the text of a paragraph selected by index and create a folder
    Dim paraText As String
    paraText = ActiveDocument.Paragraphs(2).Range.Text
    'Drop the last character (the paragraph mark) before building the path
    MkDir "C:\MainFolder\" & Mid(paraText, 1, Len(paraText) - 1)
End Sub

This way works. Anyone know a better way to clean a string of such characters?

1 Answer

Speaking more generally, you need to clean up the string and remove the chars that are not allowed in a folder name. I once faced the same trouble in a similar situation: saving hundreds of emails from Outlook using their subjects as names. Below are 2 approaches; both use RegEx:

  • Create the list of unacceptable chars using the following pattern:

    RegX_NAChars.Pattern = "[\" & Chr(34) & "\!\@\#\$\%\^\&\*\(\)\=\+\|\[\]\{\}\`\'\;\:\<\>\?\/\,]"
    

Chr(34) is the double quote; it cannot be typed as-is inside a VBA string literal (it would have to be doubled as ""), so it is concatenated instead. For your particular case you should also add the so-called "new line" chars to the list, using "[start of pattern]" & "\" & Chr(10) & "\" & Chr(13) & "[rest of pattern]". The order of chars in the pattern does not matter.
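As a rough sketch of the first approach with the new line chars folded in, the snippet below writes them as the `\r` and `\n` escapes that VBScript.RegExp understands, rather than concatenating Chr(13) and Chr(10). The names `BuildDisallowedPattern` and `DemoDisallowed` and the sample string are made up for illustration. Note that a Word paragraph's Range.Text ends with Chr(13).

```vba
' Hypothetical helper: returns the disallowed-chars pattern from above,
' extended with carriage return (\r) and line feed (\n).
Function BuildDisallowedPattern() As String
    BuildDisallowedPattern = "[\r\n\" & Chr(34) & _
        "\!\@\#\$\%\^\&\*\(\)\=\+\|\[\]\{\}\`\'\;\:\<\>\?\/\,]"
End Function

' Hypothetical demo: strips the disallowed chars from a sample string.
Sub DemoDisallowed()
    Dim rx As Object
    Set rx = CreateObject("VBScript.RegExp")
    rx.Pattern = BuildDisallowedPattern()
    rx.Global = True   ' replace every match, not only the first
    ' The sample ends with vbCr, like a paragraph's Range.Text in Word
    Debug.Print rx.Replace("Report: Q1/Q2?" & vbCr, "")
End Sub
```

Running `DemoDisallowed` from the VBA editor prints the sample with the colon, slash, question mark and trailing carriage return removed; the letters, digits and spaces are left alone.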

  • The opposite approach is to declare the list of allowed chars and remove all the rest:

    RegX_NAChars.Pattern = "[^\w \-.]"
    

The above pattern means that everything BUT Latin letters, digits, underscore (all three are covered by \w), space, dot and hyphen will be replaced / removed.
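A minimal sketch of the allowed-chars approach wrapped in a function might look like this. `KeepAllowedChars` is a hypothetical name, not something from a library, and late binding through CreateObject is assumed so that no project reference is needed.

```vba
' Hypothetical helper: keeps only chars matched by \w (Latin letters,
' digits, underscore), space, hyphen and dot; everything else is removed.
Function KeepAllowedChars(ByVal s As String) As String
    Dim rx As Object
    Set rx = CreateObject("VBScript.RegExp")
    rx.Pattern = "[^\w \-.]"
    rx.Global = True   ' replace every match, not only the first
    KeepAllowedChars = rx.Replace(s, "")
End Function
```

Because the pattern is a negated class, chars such as a trailing paragraph mark, quotes or slashes need no special handling: anything outside the list is dropped. The flip side is that non-Latin letters are dropped as well.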

Both approaches have their own pros and cons, but I think the main considerations are:

  1. A disallowed-chars list should be used when the goal is to preserve the original string as much as possible.
  2. An allowed-chars list should be used when the exact name is not so important, but saving every file with no errors is the #1 goal.

This is the relevant piece of code for RegEx use:

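    Dim RegX_NAChars As Object

    Set RegX_NAChars = CreateObject("VBScript.RegExp")
    'Use either of the patterns above; the allowed-chars one is shown here
    RegX_NAChars.Pattern = "[^\w \-.]"
    RegX_NAChars.IgnoreCase = True
    RegX_NAChars.Global = True 'replace every match, not only the first
    '........ your code ........
    'Replace returns the cleaned string, so assign the result
    String_to_Cleanup = RegX_NAChars.Replace(String_to_Cleanup, "")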

RegX_NAChars.Replace returns a new string in which all matching chars are replaced with the given replacement string. In my case it's "" (an empty string), which means the chars are thrown away. You can change it to anything else, as long as the replacement itself consists of chars allowed in a folder name (e.g. use _).
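Applied to the original Word question, the pieces could be combined roughly as follows. This is only a sketch: `SafeFolderName` and `NewDirFromParagraph` are hypothetical names, "_" is chosen as the replacement string, and C:\MainFolder is assumed to exist already. The line-break chars are removed first so that the paragraph mark does not turn into a trailing "_".

```vba
' Hypothetical helper: removes line-break chars, then replaces every
' remaining char outside the allowed list with "_".
Function SafeFolderName(ByVal rawText As String) As String
    Dim rx As Object
    ' Drop carriage returns and line feeds (Word's paragraph mark is vbCr)
    rawText = Replace(Replace(rawText, vbCr, ""), vbLf, "")
    Set rx = CreateObject("VBScript.RegExp")
    rx.Pattern = "[^\w \-.]"
    rx.Global = True   ' replace every match, not only the first
    ' Trim$ removes leading and trailing spaces from the result
    SafeFolderName = Trim$(rx.Replace(rawText, "_"))
End Function

Sub NewDirFromParagraph()
    Dim folderName As String
    folderName = SafeFolderName(ActiveDocument.Paragraphs(2).Range.Text)
    ' Skip paragraphs whose cleaned text is empty
    If Len(folderName) > 0 Then
        ' Assumes C:\MainFolder already exists; MkDir raises an error otherwise
        MkDir "C:\MainFolder\" & folderName
    End If
End Sub
```

Replacing with "_" instead of "" keeps word boundaries visible in the folder name, at the cost of a slightly longer string.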

Read more about RegEx: http://www.jose.it-berater.org/scripting/regexp/regular_expression_syntax.htm

Ksenia
Peter L.
  • In addition to the Regex link above, here is another [post](http://stackoverflow.com/q/22542834/2521004) with a few examples specific to Excel. – Automate This Mar 25 '14 at 03:45