Problem: The "del" command won't delete certain files when it is executed from a batch file, but it will delete those files when it is executed from the CMD command line. The problem appears to be related to the fact that the filenames contain an ellipsis character.

I am not new to batch files. I have been writing and using them since 1988, from MS-DOS 3.2 through Windows 7, and I now use Windows 10 Pro 21H1. What follows is a description of the problem.

A directory listing (dir) shows that the folder contains the following five files:

Volume in drive E has no label.
 Volume Serial Number is 3D74-5A3F

Directory of E:\Backups\Magpie\2016-12-14-P\mike\Correspondence\Friends\Berg\In

10/21/2021  03:27 PM    <DIR>          .
10/21/2021  03:27 PM    <DIR>          ..
08/16/2021  03:15 AM            40,888 20160114-Fwd_ [New post] Biologist's comments to save White Mountain & Little Colorado wild horses (Today is the last day YOU can comment)-29324528.eml
08/16/2021  03:15 AM            21,095 20160229-More.-13257179.eml
08/16/2021  03:15 AM            47,526 20160229-Re_More…-13334902.eml
08/16/2021  03:15 AM        11,256,759 20160819-(Duplicate) Whew. Re_ Good thurs am God bless your first day back to work-10.eml
10/21/2021  02:48 PM               182 DeleteDuplicatesDC.bat
10/20/2021  06:19 PM    <DIR>          Photos
10/21/2021  02:00 PM    <DIR>          Temp

               6 File(s)     11,366,450 bytes
               4 Dir(s)  199,814,967,296 bytes free

I want to delete the third file, whose name is 20160229-Re_More…-13334902.eml. Note the three dots following the word More in the filename. It is a single ellipsis character. It is NOT three separate dots.

The batch file was generated by a macro I wrote to assemble instructions to delete hundreds or thousands of individual files that meet certain criteria.

My batch file has the following commands:

Echo   271
CD "E:\Backups\Magpie\2016-12-14-P\mike\Correspondence\Friends\Berg\In\"
del "20160229-Re_More…-13334902.eml"
Echo   271

The Echo commands simply write a number to the screen to show the progress of the batch file. The batch file contains hundreds of similar lines which delete hundreds of files in various directories. When I ran the batch file, it deleted most of the files it was supposed to delete, but it would not delete the file 20160229-Re_More…-13334902.eml.

To check that the commands were valid and correct, I ran them one at a time from a command line. In a CMD shell (command-line interpreter) I copy-pasted the command

CD "E:\Backups\Magpie\2016-12-14-P\mike\Correspondence\Friends\Berg\In\"

from the batch file and executed it. It took me to the correct directory, which was E:\Backups\Magpie\2016-12-14-P\mike\Correspondence\Friends\Berg\In\.

I then copy-pasted the following command from the batch file into the CMD shell to delete the file:

del "20160229-Re_More…-13334902.eml"

and executed it. The del command deleted the file successfully (as it was supposed to do). The file also disappeared from a File Explorer window that I had open to monitor progress.

This test showed that the two commands did what they were supposed to do, which was to move to a certain directory (folder) and delete a certain file.

But the del command did NOT delete the file with the ellipsis character when I executed it from the batch file. Why do the commands cd and del work when I execute them individually from a CMD shell, while the del command does not delete the file when it runs in a batch file?

Does anyone have the answer to this problem? Thank you, Michael

Mike Walsh
  • Too much text, difficult to parse. Can you see if you can [edit] and whittle down about 80% of the text you posted so that you can clearly state the problem you're having, include the relevant portions of your code (properly formatted, so that it's readable), and ask a single, specific question? For some suggestions, see [ask] and [mre]. – Ken White Oct 22 '21 at 20:21
  • The "reproducible" part of the MRE definition is particularly critical. If we'd need to have a file that nobody but you has to test a proposed fix, then nobody can answer with confidence that their proposed solution really does address the issue. – Charles Duffy Oct 22 '21 at 20:54
  • That said -- do you know which encoding your text editor saved the batch file with? An encoding mismatch certainly _smells_ like a likely cause here. – Charles Duffy Oct 22 '21 at 20:55
  • Charles, the batch file was not created with a text editor. Each line of the batch file is created and written by an Excel (2003) macro. The macro tests if each file in the list can be found on a different drive and if so it writes a line to the batch file to delete that file.meet certain criteria to another text file. – Mike Walsh Oct 24 '21 at 10:54

1 Answer

The file deletion does not work because the macro creates the batch file with a character encoding that differs from the one the Windows command processor cmd.exe uses when it processes the batch file.

Windows in general uses two kinds of character encodings that store each character in a single byte and can therefore encode only 256 characters. Such a single-byte encoding is defined by a code page, which specifies which code value (binary byte value) represents which character. Which code page Windows uses by default for a single-byte encoding depends on:

  1. The country, region and language set for the account in use. It makes a difference whether Germany, Russia, Brazil or China is configured for an account.
  2. The execution environment in which the byte stream representing a text is interpreted. Windows GUI applications such as text editors use by default the so-called ANSI code page for the configured country, while the Windows command processor cmd.exe uses the OEM code page for the configured country.

The ANSI code page is Windows-1252 for North American and Western European countries. In this code page the ellipsis character is encoded with decimal code value 133 (hexadecimal 85).

The OEM code page is 437 for North American countries and 850 for Western European countries. Neither of these two code pages contains the ellipsis character at all. In both of them the decimal code value 133 (hexadecimal 85) represents the character à.

So if a batch file is created in a text editor (or by a macro) that uses a single-byte encoding with the code page Windows-1252, the command line del "20160229-Re_More…-13334902.eml" results in the byte stream:

64 65 6C 20 22 32 30 31 36 30 32 32 39 2D 52 65
5F 4D 6F 72 65 85 2D 31 33 33 33 34 39 30 32 2E
65 6D 6C 22

This byte stream is interpreted using code page 437 or 850 as:

del "20160229-Re_Moreà-13334902.eml"
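The mismatch can be reproduced outside of cmd.exe. What follows is only a minimal Python sketch, not something taken from the batch file or the macro. It assumes that Python's cp1252, cp437 and cp850 codecs correspond to the Windows code pages with the same numbers, and the variable names are made up for illustration.

```python
# The command line as intended, with U+2026 (horizontal ellipsis) in the name.
command = 'del "20160229-Re_More\u2026-13334902.eml"'

# Bytes as written by a tool that saves the batch file with Windows-1252.
raw = command.encode("cp1252")
hex_dump = " ".join(f"{byte:02X}" for byte in raw)

# The very same bytes as read with the OEM code pages 437 and 850.
seen_with_437 = raw.decode("cp437")
seen_with_850 = raw.decode("cp850")

print(hex_dump)
# ascii() keeps the output printable on any console; \xe0 is the character à.
print(ascii(seen_with_437))
print(ascii(seen_with_850))
```

The byte 85 at the position of the ellipsis comes back as à with both OEM code pages, so the name that del looks for is not the name of the file on disk.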

The Unicode encoding UTF-16 Little Endian uses two bytes per character for characters of the Basic Multilingual Plane (and four bytes for characters of Supplementary Planes). UTF-16 LE with byte order mark (BOM) is used by WMIC for every output.

The command line del "20160229-Re_More…-13334902.eml" as a byte stream with UTF-16 LE encoding with BOM (FF FE as first two bytes) would be:

FF FE 64 00 65 00 6C 00 20 00 22 00 32 00 30 00
31 00 36 00 30 00 32 00 32 00 39 00 2D 00 52 00
65 00 5F 00 4D 00 6F 00 72 00 65 00 26 20 2D 00
31 00 33 00 33 00 33 00 34 00 39 00 30 00 32 00
2E 00 65 00 6D 00 6C 00 22 00

The ellipsis character is encoded in this case with two bytes with the hexadecimal values 26 20 and all other characters also with two bytes whereby the second byte has the value 00.
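The UTF-16 LE byte stream can be rebuilt and inspected with a few lines of Python. This is just an illustrative sketch with hypothetical variable names. It relies on the fact that the utf-16-le codec writes no BOM by itself, so the BOM is prepended explicitly.

```python
import codecs

command = 'del "20160229-Re_More\u2026-13334902.eml"'

# BOM (FF FE) followed by two bytes per character in little-endian order.
stream = codecs.BOM_UTF16_LE + command.encode("utf-16-le")

# Split everything after the BOM into two-byte code units.
units = [stream[i:i + 2] for i in range(2, len(stream), 2)]

# Code units whose second byte is not 00, which here is only the ellipsis.
non_zero_high = [unit for unit in units if unit[1] != 0x00]

print(" ".join(f"{byte:02X}" for byte in stream[:8]))
print([unit.hex().upper() for unit in non_zero_high])
```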

But the Windows command processor cmd.exe does not support UTF-16 LE when processing a batch file. So it is of no help to save the batch file with this Unicode encoding.

Another Unicode encoding is UTF-8 which uses a variable number of bytes per character depending on the character. The command line del "20160229-Re_More…-13334902.eml" is encoded with UTF-8 without BOM with the byte stream:

64 65 6C 20 22 32 30 31 36 30 32 32 39 2D 52 65
5F 4D 6F 72 65 E2 80 A6 2D 31 33 33 33 34 39 30
32 2E 65 6D 6C 22

The ellipsis character is encoded in this case with three bytes with the hexadecimal values E2 80 A6 while all other characters are encoded with just one byte.
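The variable length of UTF-8 can be checked the same way. The following Python sketch (hypothetical names, shown only for illustration) collects every character of the command line that needs more than one byte.

```python
command = 'del "20160229-Re_More\u2026-13334902.eml"'

# str.encode("utf-8") writes no BOM.
encoded = command.encode("utf-8")

# Map each character that needs more than one byte to its UTF-8 bytes.
multi_byte = {
    char: char.encode("utf-8")
    for char in command
    if len(char.encode("utf-8")) > 1
}

print(len(command), "characters,", len(encoded), "bytes")
print({ascii(char): data.hex(" ").upper() for char, data in multi_byte.items()})
```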

So what can be done to delete the file named 20160229-Re_More…-13334902.eml using a batch file?

Open a command prompt window and run the command chcp to display the code page that cmd.exe uses for the country configured for the account in use. The batch file should be written using the same code page. But that is of no help if the OEM code page is 437 or 850, because … is not available at all in those code pages.

A working solution is to encode the batch file with Windows-1252 and use the following two command lines:
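Whether a file name can be written at all in a given code page can be tested before the batch file is generated. Here is a small Python sketch of such a check. The helper name representable is hypothetical, and it assumes that Python's codec names cp437, cp850 and cp1252 stand for the Windows code pages with these numbers.

```python
def representable(name: str, code_page: str) -> bool:
    """Return True if every character of name exists in code_page."""
    try:
        name.encode(code_page)
    except UnicodeEncodeError:
        return False
    return True


file_name = "20160229-Re_More\u2026-13334902.eml"

for code_page in ("cp437", "cp850", "cp1252", "utf-8"):
    print(code_page, representable(file_name, code_page))
```

A generator (such as the macro) could use a check like this to decide which lines need a different code page or encoding.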

%SystemRoot%\System32\chcp.com 1252
del "20160229-Re_More…-13334902.eml"

The first line changes the code page to Windows-1252. The Windows command processor therefore interprets the next command line with this code page, and the file deletion works, although the console window nevertheless displays del "20160229-Re_Moreà-13334902.eml".

Another solution is to encode the batch file with UTF-8 and use the following command lines:

%SystemRoot%\System32\chcp.com 65001
del "20160229-Re_More…-13334902.eml"

The value 65001 is the code page number which Microsoft defined for UTF-8 encoding. It is not really a code page number as UTF-8 is a Unicode encoding and not a code page.
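Both variants come down to the same rule: the bytes in the batch file and the code page set by chcp must agree. The Python sketch below shows how a generator could keep the two in sync. The function build_batch and its parameters are hypothetical and are not how the Excel macro works; it only illustrates the pairing.

```python
def build_batch(commands, encoding, code_page_number):
    """Return the bytes of a batch file whose first line selects the
    code page that matches the encoding of the following lines."""
    lines = [r"%SystemRoot%\System32\chcp.com " + str(code_page_number)]
    lines.extend(commands)
    # Batch files conventionally use CR LF line endings.
    text = "\r\n".join(lines) + "\r\n"
    # "cp1252" and "utf-8" both encode without a byte order mark.
    return text.encode(encoding)


delete_line = 'del "20160229-Re_More\u2026-13334902.eml"'

ansi_variant = build_batch([delete_line], "cp1252", 1252)
utf8_variant = build_batch([delete_line], "utf-8", 65001)

print(ansi_variant)
print(utf8_variant)
```

Writing the returned bytes with open(path, "wb") avoids any further translation of line endings or characters by Python.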

The deletion of the file works, although the following may be displayed on execution:

delThe system cannot write to the specified device.

The reason for the strange error message output after the command del, in place of the space and the file name in double quotes, is described in my answer on "Using another language (code page) in a batch file made for others" and in the comments below that answer written by Eryk Sun.

One more solution is to use the wildcard character ? in place of the non-ASCII character in the file name.

del "20160229-Re_More?-13334902.eml"

The disadvantage of this solution is that other files matched by the wildcard pattern, such as 20160229-Re_More1-13334902.eml, could be deleted by chance too, and not only the file with the ellipsis character in its name.
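The risk can be estimated before the batch file is run by checking how many names a pattern would match. The Python sketch below uses fnmatch for that, which is only an approximation: in fnmatch the ? matches exactly one character, while the wildcard matching of cmd.exe has additional legacy rules. The function name and the second file name in the list are hypothetical.

```python
from fnmatch import fnmatchcase


def wildcard_matches(pattern, names):
    """Return the names matched by pattern, ignoring case like Windows does."""
    return [
        name for name in names
        if fnmatchcase(name.lower(), pattern.lower())
    ]


names = [
    "20160229-More.-13257179.eml",
    "20160229-Re_More\u2026-13334902.eml",
    "20160229-Re_More1-13334902.eml",  # hypothetical second file
]

matches = wildcard_matches("20160229-Re_More?-13334902.eml", names)
print(len(matches), "file(s) would be deleted")
```

If more than one name matches, the generated del line is not safe for that directory and one of the code page solutions is the better choice.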

Mofi