How can I convert a Unicode path name (LPWSTR) to its ASCII equivalent? The library I am calling only understands C strings.

Edit: Okay, I combined the GetShortPathName and WideCharToMultiByte suggestions into the code below. I tested it with some folders containing Unicode characters in the path and it worked flawlessly:

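// cpy is the long input path (LPCWSTR)
DWORD wlength = GetShortPathNameW(cpy, NULL, 0); // required size in WCHARs, including the terminating null
LPWSTR shortp = (LPWSTR)calloc(wlength, sizeof(WCHAR));
GetShortPathNameW(cpy, shortp, wlength);
// wlength includes the terminating null, so the converted string is null-terminated as well
int clength = WideCharToMultiByte(CP_OEMCP, WC_NO_BEST_FIT_CHARS, shortp, wlength, NULL, 0, NULL, NULL);
LPSTR cpath = (LPSTR)calloc(clength, sizeof(CHAR));
WideCharToMultiByte(CP_OEMCP, WC_NO_BEST_FIT_CHARS, shortp, wlength, cpath, clength, NULL, NULL);
free(shortp); // the wide copy is no longer needed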
metafex

1 Answer

GetShortPathName() Function

http://msdn.microsoft.com/en-us/library/aa364989%28VS.85%29.aspx

It will give you an equivalent 8.3 file name, pointing to the same file, for use with legacy code.

[EDIT] This is probably the best you can do, although in theory 8.3 file names may contain non-ASCII characters, depending on a registry setting. In that case there is no easy way to get a proper char*, and GetShortPathNameA() will not give you one either if the code page in effect when the file was created does not match the current one.

See http://technet.microsoft.com/en-us/library/cc781607%28WS.10%29.aspx for details on the setting. There is a consensus here (see the comments below) that this case is reasonable to neglect.
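To make the caveat above concrete, here is a minimal, portable sketch of the "trivial" wide-to-narrow conversion, guarded so that it refuses any name that is not pure 7-bit ASCII. The helper name `narrow_if_ascii` is hypothetical (it is not a Windows API), and the sketch assumes the caller already holds the short path as a wide string.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not a Windows API): copies src into dst one
 * character at a time, but only if every character is 7-bit ASCII.
 * dst_size counts the terminating null.
 * Returns 0 on success, -1 if a non-ASCII character is found or if
 * dst is too small. On failure the contents of dst are unspecified. */
int narrow_if_ascii(const wchar_t *src, char *dst, size_t dst_size)
{
    size_t len = wcslen(src);
    size_t i;

    if (len + 1 > dst_size)
        return -1;

    for (i = 0; i < len; i++) {
        /* The cast makes the check work whether wchar_t is signed or not. */
        if ((unsigned long)src[i] > 0x7F)
            return -1;
        dst[i] = (char)src[i];
    }
    dst[len] = '\0';
    return 0;
}
```

If this returns -1 for a short path, the name falls into the rare non-ASCII case discussed above and should not be handed to the legacy library as-is.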

Thanks to Moron and everyone else for contributing to this post.

Pavel Radzivilovsky
  • But isn't the short path LPWSTR too? Perhaps OP is looking for something like WideCharToMultiByte? –  Jun 01 '10 at 17:22
  • I'm actually looking for the combination of those both. – metafex Jun 01 '10 at 17:30
  • @metafex: Perhaps you should edit your question then. This being the accepted answer does not seem to make sense, with the question being what it is currently. –  Jun 01 '10 at 17:34
  • @moron No, 8.3 path is guaranteed to be ASCII only, 7 bits per byte – Pavel Radzivilovsky Jun 01 '10 at 17:42
  • @metafex Actually, you should be able to call the ASCII version. Doesn't the LPCTSTR type resolve to either "const char *" or "const wchar_t *" depending on the UNICODE (or is it _UNICODE?) macro? Whenever that is the case, the function typically has an ASCII and a WIDE version, in this case GetShortPathNameA and GetShortPathNameW. You'll find that GetShortPathName is just a macro defined to one of these depending on the UNICODE macro. If you want the ASCII version even though UNICODE is defined (which it is by default), call GetShortPathNameA with an ASCII string. – torak Jun 01 '10 at 17:46
  • If all the OP wants is to have an ASCII string they can safely pass to a lib that doesn't support Unicode, then this is a reasonable solution. – Steven Sudit Jun 01 '10 at 17:50
  • @Pavel: while the 8.3 name might contain character codes that are in the ASCII range, if the UNICODE version of the function is called (ie., the input `LPCTSTR` long path parameter is a pointer to wide characters), then the 8.3 short name returned will also be wide characters which will be unsuitable for passing to a legacy function expecting a regular, old `char*`. Some sort of conversion from wide chars to chars will still need to take place (even if that conversion is trivial) or the legacy function will only see the first characters of the 8.3 path. – Michael Burr Jun 01 '10 at 17:59
  • @Pavel: Where does it ever say that GetShortPathName returns ASCII encoded strings? Please point me to a reference which promises that. –  Jun 01 '10 at 18:05
  • @Moron trust me on that part... though I'd be happy to see a reference myself. I guess Google might help; I'll add it as an edit if you find one. BTW, although the *A functions are crap, using one is actually justified in this case, because it gives exactly the result you want. – Pavel Radzivilovsky Jun 01 '10 at 20:53
  • @Pavel: Sorry, I cannot trust you on this without a reference. Also, could you please show us complete code, which given a path in LPWSTR (note the W), gives us back a path in LPSTR (note, no W) which is ASCII encoded? –  Jun 01 '10 at 22:24
  • @Moron: I tried to search and cannot find a documentation reference. However, I have never seen a non-ASCII 8.3 path in Windows. – Pavel Radzivilovsky Jun 06 '10 at 12:53
  • @Pavel: You cannot find it because it is ANSI, not ASCII. The conversion that happens depends on the codepage. –  Jun 06 '10 at 14:08
  • @moron, I think this can't be true, because this string is stored on disk for every file and the default codepage setting in the OS can change. – Pavel Radzivilovsky Jun 06 '10 at 16:44
  • @Pavel: You are right about the codepage: in fact I guess being independent of the code page is a feature of short path names. Anyway, it is possible for shortnames to have non-ascii characters: http://technet.microsoft.com/en-us/library/cc781607(WS.10).aspx. A registry key setting which probably no one ever touches and is probably 0 by default (have to check a different language machine). So GetShortPathName will likely give you ASCII encoded strings, but it is not 100% guaranteed as you seemed to have claimed. Anyway, GetShortPathName seems to be the _only_ solution out there! –  Jun 06 '10 at 17:32
  • There's no guarantee a short path name will exist for an arbitrary file on an arbitrary disc. Entire volumes may have short paths disabled. See https://msdn.microsoft.com/en-nz/library/windows/desktop/aa365247(v=vs.85).aspx - "Note Not all file systems follow the tilde substitution convention, and systems can be configured to disable 8.3 alias generation even if they normally support it. Therefore, do not make the assumption that the 8.3 alias already exists on-disk." – fragorl May 04 '15 at 02:57
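Pulling the comments together, a more defensive version might look like the sketch below. It is only an illustration under a few assumptions: the helper names `is_7bit_ascii` and `short_ascii_path` are hypothetical, the code page is the one used in the question (it does not matter when the result is pure ASCII), and the Windows-specific part is guarded by `_WIN32`. Returning NULL also covers volumes with 8.3 aliases disabled, where the "short" name may simply still contain the original non-ASCII characters.

```c
#include <stddef.h>
#include <stdlib.h>

/* Returns 1 if every byte of s is 7-bit ASCII, 0 otherwise. */
int is_7bit_ascii(const char *s)
{
    for (; *s != '\0'; s++)
        if ((unsigned char)*s > 0x7F)
            return 0;
    return 1;
}

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: returns a heap-allocated, pure-ASCII short path
 * for long_path, or NULL if none can be produced. The caller frees it. */
char *short_ascii_path(const wchar_t *long_path)
{
    DWORD wlen, got;
    wchar_t *shortp;
    char *cpath;
    int clen;
    BOOL lossy = FALSE;

    /* First call: required buffer size in WCHARs, including the null. */
    wlen = GetShortPathNameW(long_path, NULL, 0);
    if (wlen == 0)
        return NULL;                     /* e.g. the path does not exist */

    shortp = (wchar_t *)calloc(wlen, sizeof(wchar_t));
    if (shortp == NULL)
        return NULL;

    got = GetShortPathNameW(long_path, shortp, wlen);
    if (got == 0 || got >= wlen) {
        free(shortp);
        return NULL;
    }

    /* -1 tells the API to convert up to and including the null. */
    clen = WideCharToMultiByte(CP_OEMCP, WC_NO_BEST_FIT_CHARS,
                               shortp, -1, NULL, 0, NULL, NULL);
    cpath = clen > 0 ? (char *)calloc((size_t)clen, 1) : NULL;
    if (cpath != NULL)
        WideCharToMultiByte(CP_OEMCP, WC_NO_BEST_FIT_CHARS,
                            shortp, -1, cpath, clen, NULL, &lossy);
    free(shortp);

    /* Reject lossy conversions and anything outside 7-bit ASCII. */
    if (cpath == NULL || lossy || !is_7bit_ascii(cpath)) {
        free(cpath);
        return NULL;
    }
    return cpath;
}
#endif
```

A caller would pass the LPWSTR to `short_ascii_path`, hand the result to the legacy library, and `free` it afterwards. A NULL result means there is no safe ASCII form of that path.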