I'm trying to parse some CSS files using a CodeProject parser found here. Basically, I need to take a CSS file from FTP, turn it into a string, and parse it so I can enumerate the CSS classes. With the default .NET FTP client, the returned string works correctly with the parser. With the FTP client we use in our project (CuteFTP), it fails.

This is where things get strange. As far as I can tell, the two strings are identical, and they come from the same CSS file. So why would one work and the other fail? Is there some hidden formatting? I've confirmed that both FTP clients are using UTF-8 encoding. Here are the two CSS classes returned as strings. I've uploaded a VS2010 project showing the problem here. Any help would be greatly appreciated; this one has me scratching my head. Thanks

string cssThatWorks = "\r\n.uploadfiles_button{\r\n    color:#529214; \r\nborder:1px solid #C6D880;\r\ndisplay:inline-block;\r\n    margin:0 7px 0 0;\r\n    font-family:\"Lucida Grande\", Tahoma, Arial, Verdana, sans-serif;\r\n    font-size:12px;\r\n    line-height:130%;\r\n    text-decoration:none;\r\n    font-weight:bold;\r\n    cursor:pointer;\r\n    padding:5px 10px 6px 7px; \r\n}\r\n\r\n\r\n\r\n";

string cssThatFails = "\r\n.uploadfiles_button{\r\n    color:#529214; \r\nborder:1px solid #C6D880;\r\ndisplay:inline-block;\r\n    margin:0 7px 0 0;\r\n    font-family:\"Lucida Grande\", Tahoma, Arial, Verdana, sans-serif;\r\n    font-size:12px;\r\n    line-height:130%;\r\n    text-decoration:none;\r\n    font-weight:bold;\r\n    cursor:pointer;\r\n    padding:5px 10px 6px 7px; \r\n}\r\n\r\n\r\n\r\n";

Update

It looks like there is a UTF-8 identifier (byte order mark) at the beginning of the string, so I added the following code, which should remove it. I expected the true passed to the constructor to make it skip the identifier, but it doesn't. Any ideas?

  UTF8Encoding utf8 = new UTF8Encoding(true);
  Byte[] encodedBytes = utf8.GetBytes(cssThatFails);
  string cssWithoutUTF8Identifier = utf8.GetString(encodedBytes);
NullReference

1 Answer

I opened the Default.aspx.cs file from your sample project in TextPad and saw a '?' character at the beginning of the "cssThatFails" string, indicating an unknown character. So I opened the same file in VS's binary editor and noticed that the "cssThatFails" string has the UTF-8 byte order mark at the beginning (bytes 0xEF 0xBB 0xBF). That is the likely culprit, and it is invisible in most editors, which is why the two strings look identical.
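
The same check can be done in code rather than in a binary editor. The following is a minimal sketch, not part of the original project: the class and method names (`BomInspector`, `StartsWithBom`, `DumpLeadingCodeUnits`) are made up for illustration. It relies on the fact that the UTF-8 bytes EF BB BF decode to the single character U+FEFF in a .NET string.

```csharp
using System;
using System.Text;

// Hypothetical helper, named for illustration only.
static class BomInspector
{
    // True when the string starts with U+FEFF, the character that the
    // UTF-8 byte sequence EF BB BF decodes to.
    public static bool StartsWithBom(string text)
    {
        return !string.IsNullOrEmpty(text) && text[0] == '\uFEFF';
    }

    // Formats the first few UTF-16 code units as hex so that invisible
    // characters become visible when logged or printed.
    public static string DumpLeadingCodeUnits(string text, int count)
    {
        var sb = new StringBuilder();
        int n = Math.Min(count, text.Length);
        for (int i = 0; i < n; i++)
        {
            if (i > 0) sb.Append(' ');
            sb.Append(((int)text[i]).ToString("X4"));
        }
        return sb.ToString();
    }
}
```

Calling `BomInspector.DumpLeadingCodeUnits(cssThatFails, 4)` on both strings should make the difference visible: the failing one would begin with `FEFF`, while the working one begins directly with `000D 000A`.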

bbogovich
  • Thanks for taking a look. How would I account for that other than just removing the first character? – NullReference Aug 12 '11 at 16:21
  • All strings in C# are represented as UTF16 in memory, so trimming off the equivalent UTF16 BOMs should work. `string cssWithoutUTF8Identifier = cssThatFails.TrimStart(new char[] { '\xFEFF','\xFFFE' });` – bbogovich Aug 12 '11 at 17:04
  • found the solution here http://stackoverflow.com/questions/1317700/strip-byte-order-mark-from-string-in-c – NullReference Aug 12 '11 at 17:05
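
For reference, here is a rough sketch of two ways to get rid of the mark, assuming the data arrives either as an already-decoded string or as raw bytes. The names `BomStripper`, `StripBom`, and `DecodeUtf8WithoutBom` are hypothetical. It also shows why the round trip in the question's update changes nothing: the `true` passed to `UTF8Encoding` only affects what `GetPreamble()` returns, while `GetBytes` and `GetString` neither add nor remove the mark, so the leading U+FEFF is encoded to EF BB BF and decoded straight back.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, named for illustration only.
static class BomStripper
{
    // Removes a single leading U+FEFF from an already-decoded string.
    public static string StripBom(string text)
    {
        if (!string.IsNullOrEmpty(text) && text[0] == '\uFEFF')
        {
            return text.Substring(1);
        }
        return text;
    }

    // Decodes raw UTF-8 bytes through a StreamReader, which skips the
    // preamble (EF BB BF) if it is present at the start of the stream.
    public static string DecodeUtf8WithoutBom(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    // Reproduces the round trip from the question: the BOM character
    // survives GetBytes followed by GetString.
    public static string RoundTrip(string text)
    {
        var utf8 = new UTF8Encoding(true);
        return utf8.GetString(utf8.GetBytes(text));
    }
}
```

If the FTP client hands back bytes, decoding them through a `StreamReader` avoids the problem at the source; if it only hands back a string, `StripBom(cssThatFails)` before parsing is the simpler option.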