
I have a VS2013 project with a custom build command. In the command script I set an environment variable, and read it out again in the same script. I can confirm by calling set that setting the variable works. However, depending on the variable name, I can't read it out again.

The following works as expected when run as a batch script:

set AVAR=xxx
set ABLAH=xxx
set BBLAH=xxx
set DEV=xxx
set @ABLAH=xxx

echo %AVAR%
echo %ABLAH%
echo %BBLAH%
echo %DEV%
echo %@ABLAH%

But produces the following output in the project:

1>  xxx
1>  «LAH
1>  »LAH
1>  ÞV
1>  xxx

In this case, the name AVAR works, but many others don't. Also, variables starting with @ seem to work. Any idea what is going on?

jdm
  • I cannot reproduce the problem. The batch script works as expected on my machine. – seva titov Jan 16 '15 at 21:18
  • One more thing. The non-ASCII characters printed in output appear to be a hex representation of %HH, where HH are two hexadecimal digits from the identifier. E.g. `«` = 0xAB, `»` = 0xBB, etc. Non-hexadecimal apparently are not interpreted. I am not aware of this specific encoding, but it might give you a hint. – seva titov Jan 16 '15 at 21:22

2 Answers


I've found the solution. Visual Studio (msbuild) converts %XX escape sequences like in URLs. I only expected it to do so in URLs, like browsers do. However, it seems to replace them everywhere.

So when it encounters %ABCDE%, it recognizes %AB and inserts the character « = 0xAB, giving «CDE% to the batch interpreter. But if the code is not a valid hexadecimal number, it silently ignores it, and the interpreter sees the right characters. That's why variable names with @ at the beginning always worked.
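The substitution described above can be mimicked outside of Visual Studio. The following Python sketch is not msbuild's actual code: `msbuild_unescape` is a hypothetical helper that assumes the rule is simply "a percent sign followed by two hex digits becomes the character with that code, anything else is left alone".

```python
import re

# Assumption: "%XX" with two hex digits is replaced by the character
# with that code; every other "%" is passed through unchanged.
_HEX_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def msbuild_unescape(text: str) -> str:
    """Hypothetical stand-in for the unescaping step described above."""
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# The echo lines from the question, as written in the project file.
for line in ("echo %AVAR%", "echo %ABLAH%", "echo %BBLAH%",
             "echo %DEV%", "echo %@ABLAH%"):
    print(f"{line!r} -> {msbuild_unescape(line)!r}")
```

Under that assumption, only the lines whose variable name starts with two hex digits (AB, BB, DE) are altered, which is consistent with the output shown in the question.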

So the solution is to escape at least every % in front of a valid hex code 00-FF, or better yet all of them, with %25.
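As a rough illustration of that fix, the Python sketch below escapes every percent sign in a command script and then checks that a simulated unescaping step gives the original text back. Both function names are hypothetical, and `simulate_unescape` only assumes that `%XX` with two hex digits is turned into the character with that code; it is not msbuild itself.

```python
import re


def escape_percent(command: str) -> str:
    """Replace every literal percent sign with %25."""
    return command.replace("%", "%25")


def simulate_unescape(text: str) -> str:
    """Assumed behavior: "%XX" (two hex digits) becomes chr(0xXX)."""
    return re.sub(r"%([0-9A-Fa-f]{2})",
                  lambda m: chr(int(m.group(1), 16)), text)


script = "set ABLAH=xxx\necho %ABLAH%"
escaped = escape_percent(script)

print(escaped)                                # text to put in the project file
print(simulate_unescape(escaped) == script)   # round trip restores the script
```

Escaping all percent signs rather than only the risky ones keeps the rule simple and avoids having to check each variable name for leading hex digits.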

An easy solution would be to just edit the corresponding commands in the GUI (via project properties), and not directly in the .vcxproj or .props file. This way, VS inserts the correct escape codes. In my case this was not possible since the commands were defined as user macros (Property Pages: Common Properties/User Macros). My commands span multiple lines, but the user macro editor only supports single lines.

Another thing to watch out for is that percent signs are not the only thing it replaces. Other symbols have special meaning and have to be escaped, too. (This goes beyond XML entities, like & -> &amp;.) Here is a list of special characters from MSDN. The characters are: % $ @ ' ; ? *. It doesn't seem to be necessary to replace all of them all the time, but if you notice funky behavior then this is a thing to look at. You can try to enter these characters through the GUI and see how and if VS escapes them in the project file.
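If the GUI is not an option, the escape codes can be generated by hand. The Python sketch below is a minimal helper, assuming each of the characters listed above is written as `%` followed by its two-digit hex code; `escape_special` and `SPECIAL_CHARS` are made-up names, not part of any msbuild tooling.

```python
# Characters taken from the list above; each is written as %XX,
# where XX is the character's code in hexadecimal.
SPECIAL_CHARS = "%$@';?*"


def escape_special(value: str) -> str:
    """Hypothetical helper: hex-escape the special characters in a value."""
    return "".join(
        f"%{ord(ch):02X}" if ch in SPECIAL_CHARS else ch
        for ch in value
    )


print(escape_special("DirA;DirB"))    # DirA%3BDirB
print(escape_special("echo %DEV%"))   # echo %25DEV%25
```

Comparing the result with what VS writes to the project file for the same input is a quick way to check whether a given character really needs escaping in your case.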

One other character to note especially is the semicolon. If you define a property with unescaped semicolons, like <MyPaths>DirA;DirB</MyPaths>, msbuild/VS will internally convert them to newlines (well, or it splits the property into a list or something). But it will still show the paths as separated with semicolons in the property pages! Except when you click the dropdown button next to a property and select <Edit...>, then it will show the paths as a list or separated by newlines! This is completely invisible most of the time, except when you set a property not in XML or the GUI, but by reading the output of a command into a property. In this case the command must output newlines if you want the effect of a semicolon. Otherwise you don't get multiple paths, but one long path with semicolons in it.

jdm

In North American and Western European countries, batch files are usually "ASCII" files that use an OEM code page such as code page 850 (OEM multilingual Latin I) or code page 437 (OEM US), and not code page Windows-1252, which is usually used for single-byte encoded text files. The code page to use for a batch file depends on the local settings for non-Unicode files in the console. The code page does not matter if only characters with a code value smaller than 128 are used in the batch file, i.e. if the batch file is a real ASCII file.

Therefore make sure that you edit and save the batch file as an ASCII file using the right code page, and not as a Unicode file using UTF-8, UTF-16 Little Endian or UTF-16 Big Endian. The Visual Studio editor uses UTF-8 encoding for files by default. This is the wrong encoding for batch files.

In the table of code page 850, character « has the code value 174 decimal (0xAE). In the table of code page 1252, code value 174 is the character ®, which is an indication that the batch file outputs characters encoded in Windows-1252 or in UTF-8 (® is also Unicode code point 174).
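The code page difference can be checked with a few lines of Python, used here only because its standard library ships the codecs `cp1252`, `cp850` and `cp437`. This is a small sketch that decodes the single byte value 174 under each code page.

```python
# One byte with code value 174, interpreted under different code pages.
raw = bytes([174])

decoded = {codec: raw.decode(codec) for codec in ("cp1252", "cp850", "cp437")}

for codec, char in decoded.items():
    print(f"{codec}: {char}")
# cp1252: ®
# cp850: «
# cp437: «
```

The same byte is the registered trademark symbol in Windows-1252 and the left guillemet in both OEM code pages.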

A simple batch file for demonstration, stored as an ANSI file with code page Windows-1252:

@echo off
cls
echo This batch file was saved as ANSI file using code page Windows-1252.
echo.
echo Registered trademark symbol ® has code value 174 in Windows-1252.
echo.
echo But active code page is not Windows 1252 in console window.
echo.
chcp
echo.
echo Therefore the left guillemet character is output instead of registered
echo trademark symbol as this character has in code page 850 code value 174.
echo.
echo Press any key to continue ...
pause>nul

Batch files are for DOS/Windows and should therefore use carriage return + line-feed as the line terminator instead of just line-feed (UNIX) or just carriage return (old Mac).

Some text editors display the line terminator type and the encoding or code page of the active file somewhere in the status bar at the bottom of the main application window.

Mofi