
I need the CMD console to recognize a special Unicode character. The character is this: – (it looks like a normal dash, but it is not; according to Wikipedia, it is the U+2013 character).

PS: You can find this character in the filenames of files you download to your PC. Not every downloaded filename has it, but many servers replace the normal dash with this dash in the filename of the downloaded file, and I don't know why.

If I have an MP3 file with that character in the filename and I run the "dir /B" command in CMD, I get this:

C:\>dir /B
this is a - test.mp3

Up to this point everything is fine. But what happens if I copy the filename to the clipboard (directly from the "dir" output) and try to rename the file?

C:\>REN "this is a - test.mp3" "badchar.mp3"
El sistema no puede encontrar el archivo especificado.

PS: The error in English says "The system cannot find the specified file."

So how can I get CMD to recognize that character?

I have tried changing the code page with CHCP, starting a Unicode CMD (CMD /U), the "copy con" trick, and string character replacement, but nothing worked.

I don't want to use third-party apps such as mass renamers, because I strictly need the character to be recognized natively: I need to process some files with the "dir" command.

The problem is that the "dir" command recognizes the character, but after that, how can I use the file with any other command? If only "dir" recognizes it and I need to open that file in an external app... :(

Thanks for reading.

DETAILS:

A sample filename:

Ambassador Inc. – It's All Confusion.mp3

A detailed sample script of the problem:

@echo off
FOR /F "tokens=*" %%# in ('DIR /B *.mp3') do (
    Echo "%%#"
    RENAME "%%#" "fucking unicode dash that i can't delete it.."
    REM any_program.exe "%%#"
)
pause&exit
ElektroStudios
  • I can't reproduce the problem. I can rename the file using any of the fonts on my system (Consolas, Raster, Lucida). My console code page is 437. I just used tab completion to select the filename: `ren this`. Try that. What is your code page? – Mark Tolonen Oct 17 '12 at 15:30
  • Note that this is precisely one reason why you should never use `for /f` on `dir` output. Just use `for %%# in (*.mp3)` instead. – Joey Feb 18 '14 at 13:12
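
A minimal sketch of the plain `for` loop suggested in the comment above, assuming the .mp3 files sit in the current directory. The variable name `%%F` is an arbitrary choice, and `any_program.exe` is the placeholder from the question's script, not a real tool:

```batch
@echo off
REM Sketch: let FOR enumerate the files itself instead of parsing DIR output.
REM The names do not travel through the console code page as text, so a
REM character such as U+2013 should not be swapped for a look-alike hyphen.
for %%F in (*.mp3) do (
    echo "%%F"
    REM any_program.exe "%%F"
)
pause
```

With a raster font, `echo` may still display the dash as a plain hyphen, but the name held in `%%F` should be the real one, so commands that receive it can find the file.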

1 Answer


This looks like a common Unicode problem. When a raster font is selected in the console, Unicode is not available. There are two ways to fix this:

  1. Open the console window properties, go to the "Fonts" page, choose "Lucida Console" or "Consolas", and press "OK".

  2. Use an alternative console, ConEmu for example (I'm the author). This type of software solves many problems of the plain console window.

Now you may dir

Maximus
  • AWESOME knowledge about the internal secrets of CMD!!! I was impressed, thank you. I chose the first way, but I'll download ConEmu to test it – ElektroStudios Oct 17 '12 at 03:16
  • Hi again. The first trick only takes effect directly in the console; if I run a script, the character still isn't recognized (I have the Lucida Console font enabled). Do you know a way to make it work when running a script too? I tried running a script with ConEmu but still got nothing. Sorry for my English, and thank you again – ElektroStudios Oct 17 '12 at 03:46
  • What "script" are you talking about? – Maximus Oct 17 '12 at 05:48
  • Hi, I've discovered this is a problem with FOR /F. I've added a sample script using FOR /F to my question. Thank you if you can help me more. (With FOR /R the Unicode dash is recognized, but I need to use /F...) – ElektroStudios Oct 18 '12 at 03:16
  • I think you need to run `chcp 65001` before calling the batch script. And of course the script, if it contains Unicode characters, must be saved in code page 65001 (UTF-8). – Maximus Oct 18 '12 at 07:26
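
The suggestion in the last comment can be sketched as follows. This is a sketch under assumptions rather than a verified fix: it switches the code page inside the script instead of before calling it, and how well `for /f` behaves under code page 65001 depends on the Windows version. `any_program.exe` is the placeholder from the question's script.

```batch
@echo off
REM Sketch: switch the console to UTF-8 so the DIR output read by FOR /F
REM can carry U+2013. Behaviour under 65001 varies between Windows versions.
chcp 65001 >nul
for /f "delims=" %%F in ('dir /b *.mp3') do (
    echo "%%F"
    REM any_program.exe "%%F"
)
pause
```

The script does not restore the previous code page; run `chcp` with the original number afterwards if that matters.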