3

I'm trying to delete .doc files in a folder that also contains .docx files.

This is my attempt so far:

string[] files = Directory.GetFiles(Path, "*.doc", SearchOption.AllDirectories);

foreach (string f in files)
{
    File.Delete(f);
}

It deletes Word documents with both the .doc and .docx extensions. I want to delete only the .doc files and keep the .docx files.

Mahmoud Elgindy

6 Answers

3

The MSDN documentation for the Directory.GetFiles Method (String, String, SearchOption) includes this note:

When you use the asterisk wildcard character in a searchPattern such as "*.txt", the number of characters in the specified extension affects the search as follows:

• If the specified extension is exactly three characters long, the method returns files with extensions that begin with the specified extension. For example, "*.xls" returns both "book.xls" and "book.xlsx".

• In all other cases, the method returns files that exactly match the specified extension. For example, "*.ai" returns "file.ai" but not "file.aif".

When you use the question mark wildcard character, this method returns only files that match the specified file extension. For example, given two files, "file1.txt" and "file1.txtother", in a directory, a search pattern of "file?.txt" returns just the first file, whereas a search pattern of "file*.txt" returns both files.

The easiest way to work around Microsoft being "helpful" in this manner is to filter the results of the Directory.GetFiles call:

string[] files = Directory.GetFiles(filesPath, "*.doc", SearchOption.AllDirectories);

foreach (string f in files.Where(f => Path.GetExtension(f) == ".doc"))
{
    File.Delete(f);
}

I renamed your Path variable because it clashes with the System.IO.Path class which holds the static GetExtension method. As a general rule of thumb, giving variables the same name as existing classes is a bad habit.
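
The same filtering idea can be sketched as a small reusable helper. This is only an illustration under stated assumptions: the names ExactExtensionFilter, HasExactExtension, DeleteByExactExtension, and rootFolder are hypothetical and do not come from the question, and the comparison is made case-insensitive on the assumption that files such as "REPORT.DOC" should also be deleted.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ExactExtensionFilter
{
    // Hypothetical helper: true only when the path's extension equals
    // `extension` exactly (ignoring case), so ".docx" never matches ".doc".
    public static bool HasExactExtension(string filePath, string extension)
    {
        return string.Equals(Path.GetExtension(filePath), extension,
                             StringComparison.OrdinalIgnoreCase);
    }

    // Hypothetical helper: deletes every file under rootFolder whose
    // extension matches exactly, and returns the paths that were deleted.
    public static List<string> DeleteByExactExtension(string rootFolder, string extension)
    {
        // The wildcard narrows the search; the Where clause drops any
        // longer extensions (such as ".docx") that the wildcard lets through.
        // ToList() takes a snapshot of the matches before anything is deleted.
        List<string> matches = Directory
            .EnumerateFiles(rootFolder, "*" + extension, SearchOption.AllDirectories)
            .Where(f => HasExactExtension(f, extension))
            .ToList();

        foreach (string file in matches)
        {
            File.Delete(file);
        }

        return matches;
    }
}
```

With this sketch, a call such as ExactExtensionFilter.DeleteByExactExtension(filesPath, ".doc") would remove the .doc files and leave the .docx files in place. Keeping the extension check in its own method makes it possible to verify the matching rule without touching the file system.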

Edmund Schweppe
2

Filter the results for the exact extension you are after.

string[] files = Directory.GetFiles(Path, "*.doc", SearchOption.AllDirectories);
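// f is a string path, so read its extension with System.IO.Path.GetExtension
// (fully qualified because the variable above is also named Path).
foreach (string f in files.Where(f => String.Compare(".doc", System.IO.Path.GetExtension(f), StringComparison.OrdinalIgnoreCase) == 0))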
{
    File.Delete(f);
}
Prabu
2

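You can try DirectoryInfo, whose GetFiles method returns FileInfo objects:

DirectoryInfo dir = new DirectoryInfo(path);

foreach (FileInfo file in dir.GetFiles())
{
    // file.Extension includes the leading dot, e.g. ".doc"
    if (file.Extension == ".doc")
    {
        file.Delete();
    }
}

Each FileInfo exposes the file extension through file.Extension, so you can compare it exactly. I guess that's safer to use.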

Sossenbinder
1

That problem occurs because Windows ignores any part of a file extension that is longer than 3 characters.

Changing your code to this will solve it:

var files = Directory
   .GetFiles(Path, "*.doc", SearchOption.AllDirectories)
   .Where(w => w.ToLowerInvariant().EndsWith(".doc"));
Ulric
  • This answer worked for me – Mahmoud Elgindy Sep 04 '15 at 13:43
  • 3
    The statement "That problem occurs because Windows ignores any part of a file extension that is longer than 3 characters." is **wrong**. The OP has run into a place where Microsoft tried to be "helpful" and ended up being confusing instead. See the [MSDN documentation](https://msdn.microsoft.com/en-us/library/ms143316(v=vs.110).aspx). – Edmund Schweppe Sep 04 '15 at 14:00
  • When you say: "tried to be "helpful"" - do you mean it ignored any part of the file extension that was longer than 3 digits? It seems to me that "Ignores digits" and "being helpful" is a distinction without a difference. :) – Ulric Sep 04 '15 at 14:16
  • 1
    @Ulric: no, by 'tried to be "helpful"' I meant that, "**in the specific case of a three-letter file extension** it returns files with extensions beginning with those three characters". IIRC, Microsoft threw this hack in when Office 2007 came out, so that file dialogs hardcoded with (e.g.) "*.ppt" would show all PowerPoint documents (whether using the .ppt or .pptx extensions). If you pass a wildcard with fewer or more than three characters in the extension, `Directory.GetFiles` works the way you'd expect (returning only those files matching the extension). – Edmund Schweppe Sep 04 '15 at 17:30
  • 1
    _"I meant that, "in the specific case of a three-letter file extension it returns files with extensions beginning with those three characters"."_ ...which involves ignoring any part of the file extension that was longer than 3 digits. As I said: a distinction without a difference. – Ulric Sep 06 '15 at 01:36
1

You can get the extension first, check it, and then call the delete function.

string extension = System.IO.Path.GetExtension(@"c:\yourfile.docx");
if(extension != ".docx")
{
   //DELETE FILE HERE
}
M_Idrees
  • Would this not delete any file that is not equal to .docx passed into GetExtension? Even if the file is .DOCX? it will still be deleted – David B Sep 04 '15 at 13:40
  • There we placed a check: if(extension != ".docx"). So you will only get those files where the extension is NOT EQUAL TO (!=) '.docx' – M_Idrees Sep 04 '15 at 13:42
1
string[] files = Directory.GetFiles(Path, "*.doc", SearchOption.AllDirectories);
foreach (string f in files.Where(f => !f.EndsWith(".docx")))
{
    File.Delete(f);
}

Microsoft provides examples of this problem in their overview of the method DirectoryInfo.GetFiles Method (String, SearchOption) (https://msdn.microsoft.com/en-us/library/ms143327(v=vs.110).aspx). They state:

The following list shows the behavior of different lengths for the searchPattern parameter:

  • "*.abc" returns files having an extension of .abc, .abcd, .abcde, .abcdef, and so on.
  • "*.abcd" returns only files having an extension of .abcd.
  • "*.abcde" returns only files having an extension of .abcde.
  • "*.abcdef" returns only files having an extension of .abcdef.

You need to filter the result set of Directory.GetFiles so that you're only operating on the files that you want.

Will Lokes