
There is a third-party tool I would like to use that bulk imports a number of files into SharePoint and associates them with users. It needs a regular expression to convert a file name into an account name.

Our files are named in the format firstname surname or firstname middlename surname. Our account names are in the form firstname followed by the first initial of the surname.

So to illustrate:

foo bar -> foob
foo bar qux -> fooq

On searching I did find some examples of obtaining the first letter of the last word, but they all used functions within code in combination with regex, which isn't available to me in this case. Is it possible to do this with regex alone?

garyh
Graeme Smith
  • On one hand, it depends on how the third-party tool uses the matches. It may take the first one, concatenate all matches, or do something else. Even if it concatenates all matches (which I think is the option you need), I'm not sure it's possible. – Diego Nov 21 '12 at 11:21

3 Answers

You can try this:

^([a-zA-Z]+).*\s([a-zA-Z])[a-zA-Z]+$

See it here on Regexr

You will find the first name in the first capturing group, usually referred to as $1, and the first character of the last name in $2.

This will work as long as the names consist only of the letters a-z or A-Z. To give you a better answer, I would need the specification of your names and the tool you are using.
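
To make the role of the two capturing groups concrete, here is a minimal Python sketch that applies the pattern above and joins group 1 and group 2. Python and the helper name `account_name` are only assumptions for illustration; the import tool itself may combine the groups differently.

```python
import re

# The pattern from this answer: group 1 is the first name,
# group 2 is the first letter of the last word on the line.
PATTERN = re.compile(r"^([a-zA-Z]+).*\s([a-zA-Z])[a-zA-Z]+$")


def account_name(file_name):
    """Hypothetical helper: return firstname + surname initial, or None."""
    match = PATTERN.match(file_name)
    if match is None:
        return None  # e.g. a single word, or a name with non a-z characters
    return match.group(1) + match.group(2)


for name in ("foo bar", "foo bar qux"):
    print(name, "->", account_name(name))
```

Note that the trailing `[a-zA-Z]+` requires at least one letter after the initial, so a one-letter surname would not match.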

stema
  • This matches the entire string. – Diego Nov 21 '12 at 11:22
  • Thanks for the quick reply. I'm not too bothered about special cases with unusual characters as they could be tidied up manually if necessary. As long as the bulk of them work. How would I return the contents of the two capture groups within a single field that accepts a regex? – Graeme Smith Nov 21 '12 at 11:28
  • @Diego, why? The dot does not match newline characters by default. I don't know how he gets that information. I assume the files are read row by row, in which case my solution is fine. If he gets the complete file content, then I need to enable the `multiline` mode `m`, but it still does not match the entire string. – stema Nov 21 '12 at 11:28
  • For reference this is the homepage for the tool: http://spc3.codeplex.com/wikipage?title=ProfileImageUpload&referringTitle=Home – Graeme Smith Nov 21 '12 at 11:30
  • Depending on what the tool uses for replacements it would either be `\1\2` or `$1$2` – garyh Nov 21 '12 at 11:31
  • @GraemeSmith, the information on that site is a bit sparse, but I think chances are good that my solution will work. Maybe you can even omit the anchors `^` and `$`. You will have to try it. – stema Nov 21 '12 at 11:35
  • With the entire string I mean the entire line, not all lines. – Diego Nov 21 '12 at 11:44
  • @Diego, yes, that's what it is expected to do. – stema Nov 21 '12 at 11:48
  • I've copied and pasted your expression exactly but the tool evaluates "Graeme Smith" to "Graeme\S". – Graeme Smith Nov 21 '12 at 11:52
  • Am I still too sleepy or what? Doesn't he want to get `foob` from `foo bar`? – Diego Nov 21 '12 at 11:52
  • @Diego, I am matching the whole row and capturing the parts he wants to retrieve with capturing groups. – stema Nov 21 '12 at 11:54
  • @Diego I think you're missing a trick with the captured groups. – Graeme Smith Nov 21 '12 at 11:56
  • @GraemeSmith, your tool is inserting a backslash between the two captured parts. Maybe you can change this somewhere in the settings? – stema Nov 21 '12 at 11:56
  • That's annoying. Unfortunately there aren't any settings at all. – Graeme Smith Nov 21 '12 at 11:59
  • I've submitted an issue on the project page, will see what happens. Your solution certainly seems to work though. – Graeme Smith Nov 21 '12 at 15:07

Try this (Perl-style):

s/(\w+).*?\s(\w)(\w+)$/$1$2/
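
For anyone who wants to try the substitution outside Perl, a rough Python equivalent might look like the sketch below. The function name `to_account_name` is made up for illustration, and Python writes the backreferences as `\1\2` rather than `$1$2`.

```python
import re


def to_account_name(file_name):
    """Hypothetical helper mirroring the Perl-style s/.../$1$2/ above."""
    # (\w+)   first name
    # .*?\s   anything (lazily) up to the whitespace before the last word
    # (\w)    first letter of the last word
    # (\w+)$  rest of the last word, anchored at the end of the string
    return re.sub(r"(\w+).*?\s(\w)(\w+)$", r"\1\2", file_name)


print(to_account_name("foo bar"))
print(to_account_name("foo bar qux"))
```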
mvp

Match each file name with this pattern:

^(\w+)\s+(\w+\s+)?(\w)\w*$

And replace the matched string with the following pattern to produce the account name:

$1$3
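
As a quick check of the match-and-replace pair above, here is a small Python sketch. The helper name `file_to_account` is hypothetical, and `\1\3` is simply Python's spelling of `$1$3`. Because the middle-name group is optional and appears only once, a name with four or more words does not match, and `re.sub` then returns the input unchanged.

```python
import re

# Group 1: first name, group 2: optional middle name, group 3: surname initial.
PATTERN = re.compile(r"^(\w+)\s+(\w+\s+)?(\w)\w*$")


def file_to_account(file_name):
    """Hypothetical helper: replace a matching name with group 1 + group 3."""
    return PATTERN.sub(r"\1\3", file_name)


print(file_to_account("foo bar"))
print(file_to_account("foo bar qux"))
print(file_to_account("foo bar baz qux"))  # no match, so it comes back as-is
```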
Sina Iravanian