4

I want a regex that replaces special characters, dots (.), and so on in a filename with an underscore (_), while leaving the file extension intact.

Can you help me with a regex?

Eilon
venkat
  • The regex ([!@#$%^&*()]|(?:[.](?![a-z0-9]+$))) does not work when the extension is in upper case, e.g. DOC, TXT, HTML, XML, PDF, GIF, JPEG. Please suggest a solution. – venkat Jan 19 '10 at 12:37

4 Answers

6

Try this:

([!@#$%^&*()]|(?:[.](?![a-z0-9]+$)))

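with the case-insensitive flag "i". Replace each match with '_'.

The first set of characters can be customised, or you could use \W (any non-word character). Note that \W also matches the dot, so a class such as [^\w.] would be needed to keep the dot handling separate.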

This reads as:

replace with '_' any character from this set, or any period that is not followed only by letters or digits up to the end of the string.

Sample C# code:

var newstr = new Regex("([!@#$%^&*()]|(?:[.](?![a-z0-9]+$)))", RegexOptions.IgnoreCase)
    .Replace(myPath, "_");
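
As a quick check of the case-insensitive behaviour, here is a minimal sketch that runs the same pattern over a few sample names. The class name Demo and the sample filenames are made up for illustration; they are not from the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class Demo
{
    // Same pattern as above. IgnoreCase lets [a-z0-9] match upper-case
    // extensions such as DOC or PDF as well.
    static readonly Regex Special = new Regex(
        "([!@#$%^&*()]|(?:[.](?![a-z0-9]+$)))", RegexOptions.IgnoreCase);

    static void Main()
    {
        // Hypothetical sample names, chosen only for illustration.
        string[] names = { "my.report(final).DOC", "a.b.c.txt", "notes" };
        foreach (string name in names)
            Console.WriteLine(Special.Replace(name, "_"));
        // Prints:
        // my_report_final_.DOC
        // a_b_c.txt
        // notes
    }
}
```

Without RegexOptions.IgnoreCase, the dot before DOC would not be protected by the lookahead and would be replaced too, which is probably what the comments below are describing.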
Luke Schafer
  • added c# code - originally didn't realise it was a c# question – Luke Schafer Jan 19 '10 at 07:02
  • The regex ([!@#$%^&*()]|(?:[.](?![a-z0-9]+$))) does not work when the extension is in upper case, e.g. DOC, TXT, HTML, XML, PDF, GIF, JPEG. – venkat Jan 19 '10 at 12:34
  • uh... yes it is. I mention twice, once on the third line, and once in the example code, to make it case insensitive – Luke Schafer Jan 20 '10 at 22:29
2

Since you only care about the extension, forget about the rest of the filename. Write a regex to scrape off the extension, discarding the original filename, and then glue that extension onto the new filename.

This regular expression matches the extension, including the dot: \.[^.]*$
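
One possible reading of this approach, sketched in C#: split off the extension with the regex, clean up the remaining part, and join the two again. The helper name Sanitize and the character class [^A-Za-z0-9_] are assumptions made for this example, so adjust the class to whatever you consider a special character.

```csharp
using System;
using System.Text.RegularExpressions;

static class ExtensionDemo
{
    // Hypothetical helper: the name and the character class are assumptions.
    static string Sanitize(string fileName)
    {
        // Match the last dot and everything after it (the extension).
        Match ext = Regex.Match(fileName, @"\.[^.]*$");

        // Everything before the extension; the whole name if there is none.
        string stem = ext.Success ? fileName.Substring(0, ext.Index) : fileName;

        // Replace anything that is not a letter, digit, or underscore.
        string cleaned = Regex.Replace(stem, "[^A-Za-z0-9_]", "_");

        // ext.Value is an empty string when no extension was found.
        return cleaned + ext.Value;
    }

    static void Main()
    {
        Console.WriteLine(Sanitize("my.file name!.PDF")); // my_file_name_.PDF
    }
}
```

Because the extension is never passed through the replacement, its case does not matter here.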

Wayne Conrad
1

Perhaps just take off the extension first and put it back on after? Something like (but add your own list of special characters):

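using System.IO;
using System.Text.RegularExpressions;
static class Program {
    // Characters to replace; extend this class with your own special characters.
    static readonly Regex removeChars = new Regex("[-. ]", RegexOptions.Compiled);
    static void Main() {
        string path = "ab c.-def.ghi";
        string ext = Path.GetExtension(path); // ".ghi"
        // Strip the extension, sanitise the rest, then put the extension back.
        path = Path.ChangeExtension(
            removeChars.Replace(Path.ChangeExtension(path, null), "_"), ext);
        // path is now "ab_c__def.ghi"
    }
}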
Marc Gravell
1

Once you separate the file extension out from your string would this then get you the rest of the way?

Community
keith