Compilation error: stray ‘\302’ in program, etc

Question

I have a problem compiling the following exploit code:

http://downloads.securityfocus.com/vulnerabilities/exploits/59846-1.c

I am using "gcc file.c" and "gcc -O2 file.c", but both result in the following errors:

sorbolinux-exec.c: In function ‘sc’:
sorbolinux-exec.c:76: error: stray ‘\302’ in program
sorbolinux-exec.c:76: error: stray ‘\244’ in program
sorbolinux-exec.c:76: error: ‘t’ undeclared (first use in this function)
sorbolinux-exec.c:76: error: (Each undeclared identifier is reported only  once
sorbolinux-exec.c:76: error: for each function it appears in.)

I tried compiling them on both Kali Linux and Ubuntu 10.04 (Lucid Lynx) and got the same result.

Sounds to me like your files contain "national" characters that are not legal in identifiers or some such. But you really should include in your question the lines that get these errors. – Hot Licks, Oct 05 '13 at 13:29
`\302\244` is the octal representation of the UTF-8 sequence 0xC2 0xA4, which is the currency sign: ¤. — Codo, Oct 05 '13 at 13:45
This question is the ***canonical*** question for the stray character problems often encountered when copy pasting code from webpages, PDF documents, or through chat (e.g., Skype Chat or [Facebook Messenger](https://en.wikipedia.org/wiki/Facebook_Messenger)). Thus, it deserves comprehensive answers. Currently, *only* [twitchdotcom slash KANJICODER's answer](https://stackoverflow.com/questions/19198332/compilation-error-stray-302-in-program-etc/54352836#54352836) fits that bill. — Peter Mortensen, Mar 05 '21 at 05:10
A common one is stray ‘\342’ ‘\200’ ‘\213’ (octal numbers - UTF-8 byte sequence 0xE2 0x80 0x8B, Unicode code point U+200B ([ZERO WIDTH SPACE](https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8192&number=128))). A search/replace in regular expression mode in [Geany](https://en.wikipedia.org/wiki/Geany) for `\x{200B}` worked. — Peter Mortensen, Mar 05 '21 at 10:36
Some compilers (not the one here) [output the error numbers in decimal, not octal](https://stackoverflow.com/questions/12111606). — Peter Mortensen, Mar 06 '21 at 14:28
Related: Also [encountered on the command line](https://stackoverflow.com/questions/61725948/) (with very ***misleading*** messages from the shell about the real nature of the problem) — Peter Mortensen, May 08 '22 at 13:15
[A query to find new duplicates](https://stackoverflow.com/search?tab=newest&q=error%20stray%20in%20program&searchOn=3). — Peter Mortensen, Apr 18 '23 at 16:48
A similar symptom, but not ***really*** a stray character, is the requirement for alphanumeric ASCII for identifiers (e.g., variable names) in C, C++ and others. I think it should be a ***separate*** canonical question. Here is [a blatant duplicate](https://stackoverflow.com/questions/73749860/what-is-this-stray-problem-i-am-new-to-coding). But where is the canonical question for that one? There must be one from 2008 or 2009. – Peter Mortensen, Apr 18 '23 at 18:01
Queries to find new duplicates with fewer false positives (but probably false negatives as well): A [query for signature \302 (octal)](https://stackoverflow.com/search?tab=newest&q=error%20stray%20302%20in%20program&searchOn=3) (the same as this question) and a [query for signature \342 (octal)](https://stackoverflow.com/search?tab=newest&q=error%20stray%20342%20in%20program&searchOn=3) — Peter Mortensen, Apr 25 '23 at 18:27
Note: The notation is different in Visual Studio Code (and probably others): `\u200B` (instead of `\x{200B}`) — Peter Mortensen, Apr 27 '23 at 13:10
There is also [CE/CP-1250](https://en.wikipedia.org/wiki/Windows-1250) (or similar) encoding (it is not always UTF-8 sequences), though they ***have the same cause*** (copying code from web pages, PDF documents, through chat (e.g. Skype Chat or Facebook Messenger), etc.). [Sample](https://stackoverflow.com/questions/13065790/android-ndk-build-error), but there are more. Should there be separate canonicals, or can the canonical answer also cover CE/CP-1250 encoding? — Peter Mortensen, Apr 27 '23 at 16:05
[An example with UTF-16](https://stackoverflow.com/questions/10345802/how-should-i-use-gs-finput-charset-compiler-option-correctly-in-order-to-com). — Peter Mortensen, Apr 27 '23 at 17:51
Another class is accidentally compiling a binary file. [A sample](https://stackoverflow.com/questions/28759855/daemon-on-embedded-linux-device-using-busybox-be-written-in-c-or-as-a-script). But this canonical question should probably be reserved for UTF-8 sequences, not all the different reasons for stray errors. The other types could have their own canonical question. — Peter Mortensen, Apr 28 '23 at 18:56
There could also be a separate canonical question for when it is actually caused by the text editor (this is much less common than copying from web pages, etc.). [An example](https://stackoverflow.com/questions/34058372/compile-error-stray-302-in-function-int86/34059270#comment134288568_34058372): Likely `&reg` in the C code autocompleted to HTML [`&reg;`](https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Character_entity_references_in_HTML) and then auto converted to [`®`](https://en.wikipedia.org/wiki/Registered_trademark_symbol)... – Peter Mortensen, May 01 '23 at 14:04
Another signature octal (starting) number is "342" (0xE2). For [Windows-1252](https://en.wikipedia.org/wiki/Windows-1252), some common ones are (octal numbers, corresponding Unicode characters): 221 (LEFT SINGLE QUOTATION MARK), 222 (RIGHT SINGLE QUOTATION MARK), 223 (LEFT DOUBLE QUOTATION MARK), 224 (RIGHT DOUBLE QUOTATION MARK), 226 (EN DASH), 240 (NO-BREAK SPACE), and 344 (LATIN SMALL LETTER A WITH DIAERESIS). — Peter Mortensen, May 05 '23 at 00:29
There are also [the BOM induced ones](https://stackoverflow.com/questions/11025320/stray-377-in-xcode) (signature "\377" (octal)). They should probably have their own canonical (as it is probably not caused by copying characters from strange places, but perhaps by misconfigured text editors/IDEs(?)). – Peter Mortensen, May 06 '23 at 21:48
177 is a signature ("stray ‘\177’") for the [compiler chewing on binary files on Linux](https://stackoverflow.com/questions/48547750/cannot-use-sourcecpp-from-a-file#comment134363065_48547750) (often due to some problem with the build system/make). — Peter Mortensen, May 07 '23 at 00:25
[Here is](https://stackoverflow.com/questions/44626169/strange-error-in-a-code) an example of a UTF-8 BOM-induced one (and stray errors in decimal, not octal). Hexadecimal 0xEF 0xBB 0xBF. Decimal 239 187 191. Octal 357 273 277. — Peter Mortensen, May 07 '23 at 00:46
There is [one from 2009](https://stackoverflow.com/questions/1933100/iphone-coding-compiler-error-stray-342-stray-200-stray-223-why) (the search engines are reluctant to return older Stack Overflow questions (for whatever reason)), but the answers are less convincing. — Peter Mortensen, May 23 '23 at 16:08
A better example of a [Windows-1252](https://en.wikipedia.org/wiki/Windows-1252) one: *[Stray characters in program ("error: stray ’\223’")](https://stackoverflow.com/questions/13061457/stray-characters-in-program-error-stray-223)* — Peter Mortensen, Jun 16 '23 at 13:03
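
The octal values quoted in the comments above can be decoded mechanically. The following is a minimal sketch in Python (it is not part of the original thread); the function name `decode_stray` and its `base` parameter are made up for illustration. It assumes the reported bytes form a valid UTF-8 sequence, which will not hold for Windows-1252 text or for binary input.

```python
import unicodedata

def decode_stray(values, base=8):
    """Decode byte values from 'stray' errors into (char, code point, name) tuples.

    values: the numbers from the error messages, as strings (e.g. ["302", "244"]).
    base:   8 for compilers that report octal, 10 for those that report decimal.
    """
    raw = bytes(int(value, base) for value in values)
    text = raw.decode("utf-8")  # raises UnicodeDecodeError if the bytes are not UTF-8
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

print(decode_stray(["302", "244"]))                    # the sequence from this question
print(decode_stray(["342", "200", "213"]))             # zero width space
print(decode_stray(["239", "187", "191"], base=10))    # UTF-8 BOM, reported in decimal
```

For the sequence in this question, the result names the character as CURRENCY SIGN (U+00A4), which matches Codo's comment.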

score 27 · Accepted Answer · answered Oct 05 '13 at 13:31

You have an invalid character on that line. This is what I saw:

[Screenshot of the offending line; image not preserved]

answered Oct 05 '13 at 13:31

Yu Hao

thanks, but this removes only 2 lines of errors and still these errors exist raw.c: In function ‘sc’: raw.c:76: error: ‘t’ undeclared (first use in this function) raw.c:76: error: (Each undeclared identifier is reported only once raw.c:76: error: for each function it appears in.) – Ahmed Taher Oct 05 '13 at 13:45
@AhmedTaher: The fix certainly removes the error messages in your question. If other errors remain, please add them to your question. – Codo Oct 05 '13 at 13:47
Most likely the line `uint64_t *p = (void *) ¤t[i];` needs to be changed to `uint64_t *p = (void *) &current[i];`. (`&curren;` is the HTML entity for the currency sign, `¤`.) – Codo Oct 05 '13 at 13:50
If you remove the currency sign from your code, these error messages can no longer be produced. It's simply impossible. – Codo Oct 05 '13 at 13:52
That was weird, is it possible (given that the code is for an exploit) that the unicode char was intended to be a filter of script kiddies? – Evan Teran Nov 13 '13 at 12:42
The real explanation is that browsers used to compete a lot on their ability to render really bad HTML code with lots of mistakes in it. OP's browser, when displaying the code example, saw a sequence of characters that began with an ampersand and ended with a semicolon, that didn't exactly match an HTML entity, but was close, and it decided to do the replacement, but also show the extra text...this was an HTML parsing error on the part of the browser, because it was trying to be too helpful. – Theodore Murdock May 11 '18 at 18:03
It doesn't fit the context, but `¤` is the result of typing Shift + 4 using some European keyboard layouts. With a US keyboard layout Shift + 4 results in `$` (it is not unusual to have two keyboard layouts and that can be switched between by a keyboard combination such as Shift + Alt or Shift + Win - easy to accidentally hit). – Peter Mortensen Mar 09 '21 at 18:17
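
The lenient entity parsing described in Theodore Murdock's comment can be reproduced with Python's standard `html.unescape`, which follows the HTML5 rules and accepts some legacy entity names such as `&curren` without a closing semicolon. This is only a sketch of the mechanism (the helper name `mangle_like_lenient_parser` is hypothetical), not a claim about which browser or server actually mangled the file.

```python
import html

def mangle_like_lenient_parser(source_line):
    """Decode HTML character references the way a lenient HTML5 parser would."""
    return html.unescape(source_line)

intended = "uint64_t *p = (void *) &current[i];"
mangled = mangle_like_lenient_parser(intended)

# Octal values of the non-ASCII bytes that end up in the saved file (as UTF-8).
stray = [f"{b:03o}" for b in mangled.encode("utf-8") if b >= 0x80]

print(mangled)
print(stray)
```

The `&curren` prefix of `&current` is consumed as the currency sign, leaving `t[i]` behind. That accounts for both the stray `\302` and `\244` bytes and the follow-up "‘t’ undeclared" error.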

score 18 · Answer 2 · edited Mar 05 '21 at 04:18

You have invalid characters in your source. If you don't have any valid non-ASCII characters in your source, maybe in a double quoted string literal, you can simply convert your file back to ASCII with:

tr -cd '\11\12\15\40-\176' < old.c > new.c

The iconv approach stops at the first invalid character, which does not help here. The command line above works with the example file.
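
To see what such a filter keeps and what it throws away, here is a rough Python sketch of the same byte set as the `tr` command (tab, line feed, carriage return, and octal 40 to 176). The names `KEEP` and `strip_like_tr` are invented for this illustration; the sample line is the mangled one from the question.

```python
# Same set as tr -cd '\11\12\15\40-\176': everything else is deleted.
KEEP = frozenset([0o11, 0o12, 0o15]) | frozenset(range(0o40, 0o176 + 1))

def strip_like_tr(data):
    """Return (kept_bytes, dropped_bytes) for the given file content."""
    kept = bytes(b for b in data if b in KEEP)
    dropped = bytes(b for b in data if b not in KEEP)
    return kept, dropped

line = "uint64_t *p = (void *) \u00a4t[i];\n".encode("utf-8")
kept, dropped = strip_like_tr(line)

print(kept.decode("ascii"), end="")
print(dropped)
```

The stray bytes disappear, but the line that remains still refers to `t[i]`, so stripping alone does not restore the intended `&current[i]`.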

edited Mar 05 '21 at 04:18

Peter Mortensen

answered Oct 05 '13 at 13:45

Klaus

score 5 · Answer 3 · edited Mar 06 '21 at 13:58

Sure, convert the file to ASCII and blast all Unicode characters away. It will probably work... But...

You won't know what you fixed.
It will also destroy any Unicode comments. Example: //: A²+B²=C²
It could potentially damage obvious logic and the code will still be broken, but the solution less obvious. For example: A string with "Smart-Quotes" (“ & ”) or a pointer with a full-width asterisk (＊). Now “SOME_THING” looks like a #define (SOME_THING) and ＊SomeType is the wrong type (SomeType).

Two more surgical approaches to fixing the problem:

Switch fonts to see the character. (It might be invisible in your current font)
Use a regular expression to search for all characters outside non-extended (7-bit) ASCII.

In Notepad++ I can search up to FFFF, which hasn't failed me yet.

[\x{80}-\x{FFFF}]

80 is hex for 128, the first extended ASCII character.

After hitting "find next" and highlighting what appears to be empty space, you can close your search dialog and press Ctrl + C to copy to clipboard.

Then paste the character into a Unicode search tool. I usually use an online one. http://unicode.scarfboy.com/

Example:

I had a bullet point (•) in my code somehow. Its Unicode code point is U+2022 (hex), but the compiler sees the three bytes of its UTF-8 encoding and reports them in octal: \342 \200 \242. It's not as simple as converting each octal value to hex and smashing them together: "E2 80 A2" is the UTF-8 byte sequence, not the hexadecimal Unicode code point (2022).
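
For readers without an editor that supports such searches, the same idea can be sketched in a few lines of Python. This is an illustration only: `find_non_ascii` is a hypothetical helper, and it assumes the source file has already been read as UTF-8 text.

```python
import unicodedata

def find_non_ascii(text):
    """List (line, column, code point, UTF-8 bytes in octal, name) for each non-ASCII character."""
    hits = []
    # Split on "\n" only, so line numbers match what the compiler reports.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for column, ch in enumerate(line, start=1):
            if ord(ch) >= 0x80:
                octal = " ".join(f"{b:03o}" for b in ch.encode("utf-8"))
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((line_no, column, f"U+{ord(ch):04X}", octal, name))
    return hits

sample = "int x = 1;\n\u2022 int y = 2;\n"
for hit in find_non_ascii(sample):
    print(hit)
```

For the bullet point example, this reports line 2, column 1, U+2022, and the octal bytes 342 200 242, so both the code point and the numbers from the compiler message are visible side by side. To check a real file, pass in its contents, for example `open("file.c", encoding="utf-8").read()` (the file name is a placeholder).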

edited Mar 06 '21 at 13:58

Peter Mortensen

answered Jan 24 '19 at 18:01

KANJICODER

Yes, this is the kind of *comprehensive* answer that ought to be the highest voted and the accepted answer. – Peter Mortensen Mar 05 '21 at 05:01
Alternatively, only search/replace the offending character. E.g., using `\x{200B}` (error stray ‘\342’ ‘\200’ ‘\213’). That worked for me, after having copied code from [a web page](https://beta.docs.qmk.fm/using-qmk/advanced-keycodes/feature_macros). – Peter Mortensen Mar 05 '21 at 10:17
A [similar comprehensive answer to a similar question](https://stackoverflow.com/questions/7663565/error-stray-xxx-in-program-why/18971009#18971009) (for Linux). – Peter Mortensen Mar 06 '21 at 02:16
Lookup such octal UTF-8 codes to see what characters they actually correspond to, here: http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=%C2%B0&mode=char – markling Apr 23 '21 at 09:57
(The sample QMK web page broke. [Alternative location](https://docs.qmk.fm/#/feature_macros). But I can't reproduce the problem using the new location. Perhaps they got too many complaints and fixed it?) – Peter Mortensen Mar 08 '22 at 00:15

score 4 · Answer 4 · edited Jul 26 '21 at 18:21

I got the same with a character that visibly appeared as an asterisk, but it was a UTF-8 sequence instead:

Encoder * st;

When compiled, it returned:

g.c:2:1: error: stray ‘\342’ in program
g.c:2:1: error: stray ‘\210’ in program
g.c:2:1: error: stray ‘\227’ in program

342 210 227 turns out to be UTF-8 for ASTERISK OPERATOR (Unicode code point U+2217).

Deleting the '*' and typing it again fixed the problem.

edited Jul 26 '21 at 18:21

Peter Mortensen

answered Jul 21 '16 at 09:51

TheMagicCow

A slightly more direct analysis is 226 136 151 (octal) → 0xE2 0x88 0x97 (hexadecimal) → UTF-8 sequence for Unicode code point U+2217 ([ASTERISK OPERATOR](https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8704&number=128)). – Peter Mortensen Jul 26 '21 at 14:44
Alternatively, search/replace for `\x{2217}` in a text editor that supports regular expressions and Unicode (for example, [Geany](https://en.wikipedia.org/wiki/Geany), [Notepad++](https://en.wikipedia.org/wiki/Notepad%2B%2B), or [UltraEdit](https://en.wikipedia.org/wiki/UltraEdit)) – Peter Mortensen Jul 26 '21 at 18:26
\*That should have been: *"...342 210 227 (octal) → 0xE2 0x88 0x97 (hexadecimal)..."* (the ***decimal*** numbers were correct, but they did not match the error message directly (numbers in octal)) – Peter Mortensen May 16 '23 at 01:18

score 2 · Answer 5 · edited Mar 05 '21 at 04:34

Whenever the compiler finds a special character, it gives these kinds of compile errors. The errors I got were as follows:

error: stray '\302' in program and error: stray '\240' in program

....

It was a piece of code I had copied from a chat messenger. In Facebook Messenger, it was a special character. After copying it into the Vim editor, it looked like the correct character, but the compiler still gave the above error. Once I retyped that statement manually, the error was resolved. :)

score 2 · Answer 6 · edited Mar 05 '21 at 04:36

It's perhaps because you copied code from the Internet (from a site whose pages are UTF-8 encoded rather than ASCII encoded). You can convert the code to ASCII using this site:

"http://www.percederberg.net/tools/text_converter.html"

There you can either detect errors manually by converting it back to UTF-8, or you can automatically convert it to ASCII and remove all the stray characters.

edited Mar 05 '21 at 04:36

Peter Mortensen

answered Feb 03 '16 at 11:14

Agrim Gupta

Yes, that is a very common occurrence. Common ones from code on web pages are [EN DASH](https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8192&number=128), [EM DASH](https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8192&number=128), and [MINUS SIGN](https://www.utf8-chartable.de/unicode-utf8-table.pl?start=8704&number=128) (not the same as the ASCII one - UTF-8 sequence 0xE2 0x88 0x92). They can be searched/replaced for in text editors that support regular expression by `\x{2013}`, `\x{2014}`, and `\x{2212}`, respectively. – Peter Mortensen Jul 26 '21 at 14:16

score 1 · Answer 7 · edited Mar 05 '21 at 04:27

This problem occurs when you have copied some text from an HTML page, or when you have modified the file in a Windows environment and are trying to compile it in a Unix/Solaris environment.

Run "dos2unix" to remove the special characters from the file:

dos2unix fileName.ext fileName.ext

edited Mar 05 '21 at 04:27

Peter Mortensen

answered Aug 05 '15 at 13:05

Saifu Khan

score 1 · Answer 8 · edited Mar 06 '21 at 15:19

Invalid character in your code.

It is a common copy-paste error, especially when code is copied from Microsoft Word documents or PDF files.

edited Mar 06 '21 at 15:19

Peter Mortensen

answered Apr 20 '19 at 10:25

Victor Mwenda

score 0 · Answer 9 · edited Mar 05 '21 at 04:21

Codo was exactly right on Oct. 5 that &current[i] is the intended text, with the currency symbol inadvertently introduced when the source was put into HTML (see original):

http://downloads.securityfocus.com/vulnerabilities/exploits/59846-1.c

Codo's change makes this exploit code compile without error. I did that and was able to use the exploit on Ubuntu 12.04 (Precise Pangolin) to escalate to root privilege.

score 0 · Answer 10 · edited Mar 05 '21 at 04:22

The explanations given here are correct. I just wanted to add that this problem might be because you copied the code from somewhere, such as a website or a PDF file, which left some invalid characters in the code.

Try to find those invalid characters, or just retype the code if you can't. It will definitely compile then.

Source: stray error reason

edited Mar 05 '21 at 04:22

Peter Mortensen

answered Apr 27 '14 at 05:02

ravi

score 0 · Answer 11 · edited Mar 05 '21 at 04:41

With me, this error occurred when I copied and pasted code in text format to my editor (gedit).

The code was in a text document (.odt). I copied it and pasted it into gedit.

If you did the same, you have to manually rewrite the code.

edited Mar 05 '21 at 04:41

Peter Mortensen

answered Jul 08 '14 at 15:54

Lukasavicus

Rewriting the code is not necessary. For instance, in [Notepad++](https://en.wikipedia.org/wiki/Notepad%2B%2B), you can search and replace for Unicode codepoints. E.g. \x{00A0} (identified by using a binary/hex view for the file) for a problem encountered by copying through Skype Chat. – Peter Mortensen Mar 05 '21 at 04:48

score 0 · Answer 12 · answered Nov 13 '14 at 13:19

I noticed an issue in using the above tr command. The tr command COMPLETELY removes the "smart quotes". It would be better to replace the "smart quotes" with plain ASCII double quotes, using something like this.

This will give you a quick preview of what will be replaced.

sed 's/[”“]/"/g' File.txt

This will do the replacements and put the result in a new file called WithoutSmartQuotes.txt.

sed 's/[”“]/"/g' File.txt > WithoutSmartQuotes.txt

This will edit the original file in place, keeping a backup copy named File.txt.bk.

sed -i.bk 's/[”“]/"/g' File.txt

http://developmentality.wordpress.com/2010/10/11/how-to-remove-smart-quotes-from-a-text-file/

But don't the tools need to be Unicode aware for this to work (at least in the general case)? Are they? — Peter Mortensen, Mar 05 '21 at 04:50
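
On the Unicode-awareness question in the comment above: one way to sidestep locale and tool differences is to do the replacement on decoded text rather than on bytes. The following Python sketch is an assumption-laden illustration, not a tested replacement for the sed commands. The `LOOKALIKES` table and the `replace_lookalikes` name are invented here, and the table covers only a handful of characters mentioned elsewhere in this thread.

```python
# Characters that look like ASCII punctuation, mapped to the ASCII character
# that was most likely intended. The selection is illustrative, not exhaustive.
LOOKALIKES = {
    "\u201c": '"',  # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',  # RIGHT DOUBLE QUOTATION MARK
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
    "\u2212": "-",  # MINUS SIGN
    "\u2217": "*",  # ASTERISK OPERATOR
    "\u00a0": " ",  # NO-BREAK SPACE
}
TABLE = str.maketrans(LOOKALIKES)

def replace_lookalikes(text):
    """Replace known lookalike characters; leave every other character untouched."""
    return text.translate(TABLE)

print(replace_lookalikes("printf(\u201chello\u201d);"))
print(replace_lookalikes("Encoder \u2217 st;"))
```

Because the text is decoded first, each multi-byte character is handled as a single unit. Anything not in the table (for example, legitimate non-ASCII text in comments) passes through unchanged, so it is still worth reviewing the result before compiling.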
