I am trying to call a C++ DLL from Java. Its C++ header file contains the following lines:

    #define a '102001'
    #define b '102002'
    #define c '202001'
    #define d '202002'

What data type do a, b, c, and d have? Are they char or char arrays? And what are the corresponding data types in Java that I should convert them to?

  • Here's the answer to your first question: http://stackoverflow.com/questions/7459939/what-do-single-quotes-do-in-c-when-used-on-multiple-characters – Mysticial Jul 20 '12 at 01:54
  • That's not syntactically valid C++. Single quotes are for individual character literals, not strings. Are you sure they're not really double quotes in your header file? – Wyzard Jul 20 '12 at 01:56
  • As stated in the question Mysticial linked to, those are multicharacter literals, their type is `int` and their value is implementation-defined. – Daniel Fischer Jul 20 '12 at 01:58
  • There *is* such a thing as "multicharacter literals" in C++, and that *is* how these #define's would be interpreted by a C++ program. Whether that's the *intent* of these #define's (and whether that's actually what you need to interface to Java) ... that's a different question. For whatever it's worth, this link might be useful: http://www.fileformat.info/info/unicode/char/102001/index.htm – paulsm4 Jul 20 '12 at 02:11

2 Answers

As Mysticial pointed out, these are multicharacter literals. In C++ their type is `int` and their value is implementation-defined. Six 8-bit characters span 48 bits, so a Java `long` is the type that can hold all of them without loss.

In Java, you need to convert them to long manually:

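    // Packs each character of s into one byte of a long, first character in the highest byte.
    static long toMulticharConst(String s) {
        long res = 0;
        for (char c : s.toCharArray()) {
            res <<= 8;                 // make room for the next byte
            res |= ((long)c) & 0xFF;   // keep only the low 8 bits of the char
        }
        return res;
    }

    final long a = toMulticharConst("102001");
    final long b = toMulticharConst("102002");
    final long c = toMulticharConst("202001");
    final long d = toMulticharConst("202002");
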
Sergey Kalinichenko
  • They use only 32 bits here (g++-4.5.1). They're truncated, since `int` is 32 bits. I get a warning "Warnung: Zeichenkonstante zu lang für ihren Typ" about that. – Daniel Fischer Jul 20 '12 at 02:08
  • @DanielFischer: oh nice, I did not even know gcc had translated its warnings :) – Matthieu M. Jul 20 '12 at 06:45
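
The truncation reported in the comment above can be mimicked in Java. The following is only a sketch, assuming 8-bit characters, a 32-bit `int`, and a compiler that builds the value one character at a time by shifting and or-ing (which is how GCC documents its behavior, as far as I know). The class `MulticharInt` and the helper `toGccInt` are hypothetical names, not part of the DLL or its header. Other compilers may produce different values or reject six-character literals entirely.

```java
// Sketch: emulate shift-and-or evaluation of a multicharacter literal in a 32-bit int.
// Assumptions: 8-bit chars, 32-bit int, GCC-style evaluation order.
final class MulticharInt {
    // Hypothetical helper: packs the characters of s into an int,
    // letting the oldest characters fall off the top once 32 bits are full.
    static int toGccInt(String s) {
        int res = 0;
        for (char ch : s.toCharArray()) {
            // A left shift on a Java int discards bits above bit 31,
            // which mirrors the truncation to the last four characters.
            res = (res << 8) | (ch & 0xFF);
        }
        return res;
    }

    public static void main(String[] args) {
        System.out.printf("a = 0x%08X%n", toGccInt("102001")); // a = 0x32303031
        System.out.printf("c = 0x%08X%n", toGccInt("202001")); // c = 0x32303031
        System.out.println(toGccInt("102001") == toGccInt("202001")); // true
    }
}
```

Under these assumptions only the last four characters survive, so `a` and `c` (and likewise `b` and `d`) end up with the same value. It is worth checking what the compiler used to build the DLL actually produces before relying on either the `int` or the `long` conversion.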
I will try to answer the first two questions. Not being familiar with Java, I have to leave the last question to others.

Single and double quotes mean very different things in C. A character enclosed in single quotes is just another way of writing the integer that represents it in the collating sequence (e.g. in an ASCII implementation, 'a' means exactly the same as 97).

However, a string enclosed in double quotes is a short-hand way of writing a pointer to the initial character of a nameless array that has been initialized with the characters between the quotes and an extra character whose binary value is 0.

Because an integer is always large enough to hold several characters, some C compilers allow multiple characters in a character constant as well as in a string constant, which means that writing 'abc' instead of "abc" may well go undetected. Yet "abc" means a pointer to an array containing 4 characters (a, b, c, and \0), while the meaning of 'abc' is platform-dependent. Many C compilers take it to mean "an integer that is composed somehow of the values of the characters a, b, and c".
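
As an illustration only, the three cases above can be modelled in Java. This is a sketch under assumptions: ASCII encoding, and one common way of composing 'abc' (first character in the highest byte), which is not guaranteed by the C standard. The class `QuoteDemo` and the helpers `packChars` and `cStringBytes` are made-up names for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class QuoteDemo {
    // Hypothetical helper: one possible value for a literal such as 'abc',
    // built by shifting in one 8-bit character at a time.
    static int packChars(String s) {
        int value = 0;
        for (char ch : s.toCharArray()) {
            value = (value << 8) | (ch & 0xFF);
        }
        return value;
    }

    // Bytes a C string literal such as "abc" occupies:
    // the characters followed by a terminating 0 byte.
    static byte[] cStringBytes(String s) {
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        return Arrays.copyOf(ascii, ascii.length + 1); // copyOf pads with 0
    }

    public static void main(String[] args) {
        System.out.println((int) 'a');                              // 97
        System.out.println(Integer.toHexString(packChars("abc")));  // 616263
        System.out.println(Arrays.toString(cStringBytes("abc")));   // [97, 98, 99, 0]
    }
}
```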

For more information, see section 1.4 of the book "C Traps and Pitfalls".

House.Lee
  • Multi character constants are a feature of the language standard, so all conforming compilers must allow them. – Daniel Fischer Jul 20 '12 at 03:13
  • In which language? Java, C, or C++? – House.Lee Jul 20 '12 at 03:17
  • C and C++. Java doesn't allow them. – Daniel Fischer Jul 20 '12 at 03:18
  • The C standard (ISO/IEC 9899:1999, page 61) states the following: The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined. – House.Lee Jul 20 '12 at 03:32
  • Yep. Since the value is ID, they're not portable, and their use is heavily discouraged. But they're part of the standard. – Daniel Fischer Jul 20 '12 at 03:41