
So I'm trying to write a buffering library for the 64th time and I'm starting to get into some pretty advanced stuff. Thought I'd ask for some professional input on this.

In my first header file I have this:

typedef struct StdBuffer { void* address; } StdBuffer;
extern void StdBufferClear(StdBuffer);

In another header file that #includes the first header file I have this:

typedef struct CharBuffer { char* address; } CharBuffer;
void (*CharBufferClear)(CharBuffer) = (void*) StdBufferClear;

Will declaring this function pointer void interfere with the call? They have matching by-value signatures. I have never seen a function pointer declared void before, but it's the only way to get it to compile cleanly.

Stackwise it should not make any difference at all from what I learned in assembler coding.

Irrelevant: OMG! I just said Stackwise on StackOverflow!

Hmm... Looks like I've assumed too much here. Allow me to clarify, if I may. I don't care what 'type' of data is stored at the address. All that I am concerned with is the size of a 'unit' and how many units are at the address. Take a look at the interface contract for the API, if you will:

typedef struct StdBuffer {

    size_t width;        ///< The number of bytes that complete a data unit.
    size_t limit;        ///< The maximum number of data units that can be allocated for this buffer.
    void * address;      ///< The memory address for this buffer.
    size_t index;        ///< The current unit position indicator.
    size_t allocated;    ///< The current number of allocated addressable units.
    StdBufferFlags flags;///< The API contract for this buffer.

} StdBuffer;

You see, memcpy, memmove and the like don't really care what's at an address; all they want is the specifics, which I'm clearly keeping track of here.
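A minimal sketch of that idea, assuming a trimmed-down buffer that keeps only a unit width, a unit count, and an address. `UnitBuffer`, `UnitBufferPut`, and `UnitBufferGet` are hypothetical names used for illustration and are not part of the real API:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down buffer: only the fields this sketch needs. */
typedef struct UnitBuffer {
    size_t width;      /* bytes per data unit */
    size_t allocated;  /* number of addressable units */
    void  *address;    /* start of the storage */
} UnitBuffer;

/* Copy one unit from src into slot i.
   Returns 0 on success, -1 if i is out of range. */
int UnitBufferPut(UnitBuffer *b, size_t i, const void *src)
{
    if (i >= b->allocated) return -1;
    memcpy((unsigned char *)b->address + i * b->width, src, b->width);
    return 0;
}

/* Copy one unit from slot i into dst.
   Returns 0 on success, -1 if i is out of range. */
int UnitBufferGet(const UnitBuffer *b, size_t i, void *dst)
{
    if (i >= b->allocated) return -1;
    memcpy(dst, (const unsigned char *)b->address + i * b->width, b->width);
    return 0;
}
```

Because the copies go through memcpy and an unsigned char pointer, the functions never need to know the element type; the caller supplies typed storage on both ends.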

Have a look now at the first prototype to follow this contract:

typedef struct CharBuffer {

    size_t width;        ///< The number of bytes that complete a data unit.
    size_t limit;        ///< The maximum number of data units that can be allocated for this buffer.
    char * address;      ///< The memory address for this buffer.
    size_t index;        ///< The current unit position indicator.
    size_t allocated;    ///< The current number of allocated addressable units.
    CharBufferFlags flags;///< The API contract for this buffer.

} CharBuffer;

As you can clearly see, the data type is irrelevant in this context. You can say that C handles it differently depending on the case, but at the end of the day, an address is an address, a byte is a byte and a long is a long for as long as we are dealing with memory on the same machine.

The purpose of this system, when brought together, is to remove all of the type-based juggling C seems to be so proud of (and rightfully so...). It's just pointless for what I would like to do, which is to create a contract-abiding prototype for any standard size of data (1, 2, 4, 8, sizeof(RandomStruct)) located at any address.

I want the ability to perform my own casting in code and to manipulate that data with API functions that operate on blocks of memory of a specific length, made up of units of a specific length. However, the prototype must contain the official data pointer type, because it just doesn't make sense for the end user to have to recast their data every time they would like to do something with that address pointer. It would not make sense to call it a CharBuffer if the pointer was void.

The StdBuffer is a generic type that is never EVER used except within the api itself, to manage all contract abiding data types.

The API that this system will incorporate is from my latest edition of buffering, which is quite clearly documented here @Google Code. I am aware that some things will need to change to bring this all together; namely, I won't have the ability to manipulate data directly from within the API safely without lots of proper research and opinion gathering.

Which just brought to my attention that I also need a Signed/Unsigned bit flag in the StdBufferFlags Members.

Perhaps the final piece to this puzzle is also in order for your perusal.

/** \def BIT(I)
    \brief A macro for setting a single constant bit.
 *
 *  This macro sets the bit indicated by I to enabled.
 *  \param I the (1-based) index of the desired bit to set.
 */
#define BIT(I) (1UL << ((I) - 1))

/** \enum StdBufferFlags
    \brief Flags that may be applied to all StdBuffer structures.

 *  These flags determine the contract of operations between the caller
 *  and the StdBuffer API for working with data. Bits 1-4 are for the
 *  API control functions. All other bits are undefined/don't care bits.
 *
 *  If your application would like to use the don't care bits, it would
 *  be smart not to use bits 5-8, as these may become used by the API
 *  in future revisions of the software.

*/
typedef enum StdBufferFlags {

    BUFFER_MALLOCD = BIT(1),    ///< The memory address specified by this buffer was allocated by an API
    BUFFER_WRITEABLE = BIT(2),  ///< Permission to modify buffer contents using the API
    BUFFER_READABLE = BIT(3),   ///< Permission to retrieve buffer contents using the API
    BUFFER_MOVABLE = BIT(4)     ///< Permission to resize or otherwise relocate buffer contents using the API

} StdBufferFlags;
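A short sketch of how these flags might be combined and tested. `FlagsAllow` is a hypothetical helper, not part of the API, and the macro argument is parenthesized here as an added precaution so that an argument such as `a | b` is evaluated before the subtraction:

```c
/* 1-based bit constant; the argument is parenthesized (an addition in this sketch). */
#define BIT(I) (1UL << ((I) - 1))

typedef enum StdBufferFlags {
    BUFFER_MALLOCD   = BIT(1),
    BUFFER_WRITEABLE = BIT(2),
    BUFFER_READABLE  = BIT(3),
    BUFFER_MOVABLE   = BIT(4)
} StdBufferFlags;

/* Hypothetical helper: nonzero when every bit in `required` is set in `flags`. */
int FlagsAllow(unsigned long flags, unsigned long required)
{
    return (flags & required) == required;
}
```

A caller could build a contract with `BUFFER_READABLE | BUFFER_WRITEABLE`, check it with `FlagsAllow(flags, BUFFER_WRITEABLE)` before modifying the buffer, and revoke a permission with `flags &= ~(unsigned long)BUFFER_WRITEABLE`.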

4 Answers


This code requires a diagnostic:

void (*CharBufferClear)(CharBuffer) = (void*) StdBufferClear;

You're converting a void * pointer to a function pointer without a cast. In C, a void * pointer can convert to pointers to object types without a cast, but not to function pointer types. (In C++, a cast is needed to convert void * to object types also, for added safety.)

What you want here is just to cast between function pointer types, i.e.:

void (*CharBufferClear)(CharBuffer) = (void (*)(CharBuffer)) StdBufferClear;

Even then, you are still doing the same type punning, because the functions have different types. You are trying to call a function which takes a StdBuffer using a pointer to a function which takes a CharBuffer.

This type of code is not well-defined C. Having defeated the type system, you're on your own, relying on testing, examining the object code, or obtaining some assurances from the compiler writers that this sort of thing works with that compiler.

What you learned in assembler coding doesn't apply because assembly languages have only a small number of rudimentary data types such as "machine address" or "32 bit word". The concept that two data structures with an identical layout and low-level representation might be incompatible types does not occur in assembly language.

Even if two types look the same at the low level (another example: unsigned int and unsigned long are sometimes exactly the same), C compilers can optimize programs based on the assumption that the type rules have not been violated. For instance, suppose that A and B point to the same memory location. If you assign to an object A->member, a C compiler can assume that the object B->member is not affected by this, if A->member and B->member have incompatible types, like one being char * and the other void *. The generated code keeps caching the old value of B->member in a register, even though the in-memory copy was overwritten by the assignment to A->member. This is an example of invalid aliasing.
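One way to stay inside the type rules is to forward through a thin wrapper instead of a cast function pointer. This is only a sketch using the one-member structs from the question; the body of `StdBufferClear` below is a stand-in invented for illustration, since the real one is not shown:

```c
#include <string.h>

typedef struct StdBuffer  { void *address; } StdBuffer;
typedef struct CharBuffer { char *address; } CharBuffer;

/* Stand-in body for illustration only: zeroes the first byte at the address. */
void StdBufferClear(StdBuffer b)
{
    if (b.address) memset(b.address, 0, 1);
}

/* Well-defined forwarding: build a StdBuffer from the CharBuffer's member
   and call the generic routine through its real type. */
void CharBufferClear(CharBuffer b)
{
    StdBuffer s = { b.address };   /* char * converts to void * implicitly */
    StdBufferClear(s);
}
```

The wrapper costs one line per typed function, and a compiler is free to inline it, so the call is made with the types the callee actually declares.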

Kaz
  • ok worst case scenario: disable optimizations for this 'static library' only. any input on that? –  Apr 24 '12 at 20:28
  • There may be finer-grade surgical instruments than "disable optimizations". For instance GCC has an option to defeat the optimizations based on strict aliasing: `-fno-strict-aliasing`. – Kaz Apr 24 '12 at 20:29
  • Now we're getting somewhere! Well, at least it sounds nice. –  Apr 24 '12 at 20:31
  • Yes; that somewhere where we are getting is "using your C compiler as an assembler". :) – Kaz Apr 24 '12 at 20:40
  • You can use a union: `union Buffer { char *pch; void *pv; }`. This aliasing is allowed. – Kaz Apr 24 '12 at 21:24
  • I've found a relevant link on this topic: -fno-strict-aliasing from a GCC "Enthusiast" - lol Mr. Linus Torvalds: https://lkml.org/lkml/2003/2/26/158 –  Apr 25 '12 at 09:05
  • It sounds to me like this should always be disabled as Linus clearly states: There is no sane way to tell the compiler when it should do aliasing. –  Apr 25 '12 at 09:06
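
The union suggested in the comments above could be embedded in a buffer type roughly as follows. `BufferAddress`, `UnionBuffer`, and `UnionBufferRaw` are hypothetical names chosen for this sketch:

```c
#include <stddef.h>

/* The two views of the same address, following the union from the comment. */
union BufferAddress {
    char *pch;   /* typed view for byte-oriented callers */
    void *pv;    /* generic view for API internals */
};

/* Hypothetical buffer type that stores its address through the union. */
typedef struct UnionBuffer {
    size_t width;
    union BufferAddress address;
} UnionBuffer;

/* Generic code reads the pv member even if the caller stored through pch. */
void *UnionBufferRaw(const UnionBuffer *b)
{
    return b->address.pv;
}
```

Callers would write `buf.address.pch[i]` with no cast, while internal code uses `buf.address.pv`.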

The standard does not define the results of casting a function-pointer to void *.

Equally, converting between function pointers and then calling through the wrong one is also undefined behaviour.
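
What the standard does permit is converting a function pointer to another function pointer type and back again before the call. A minimal sketch, with `GenericFn`, `StdClearFn`, `CallStoredClear`, and the counting stand-in for `StdBufferClear` all invented for illustration:

```c
typedef struct StdBuffer { void *address; } StdBuffer;

typedef void (*GenericFn)(void);         /* neutral type used only for storage */
typedef void (*StdClearFn)(StdBuffer);   /* the function's real type */

static int clear_calls = 0;

/* Stand-in body for illustration: just counts how often it is called. */
void StdBufferClear(StdBuffer b)
{
    (void)b;
    ++clear_calls;
}

/* The stored pointer is converted back to its original type before the call,
   so the call itself goes through a correctly typed pointer. */
void CallStoredClear(GenericFn stored, StdBuffer b)
{
    StdClearFn fn = (StdClearFn)stored;
    fn(b);
}
```

Storing with `GenericFn stored = (GenericFn)StdBufferClear;` is fine; only calling through `stored` directly, without converting back, would be undefined.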

Oliver Charlesworth
  • That's a bug I'd be willing to take advantage of if it will replicate across ANSI C compatibles. –  Apr 24 '12 at 20:18
  • Just thinking of the amount of code it will save me from writing is like temptation straight from the "netherworld". –  Apr 24 '12 at 20:19
  • @TristonJ.Taylor: If you find yourself needing to do this, there's a good chance that there's a fundamentally better solution to whatever the high-level problem is. – Oliver Charlesworth Apr 24 '12 at 20:20
  • @TristonJ.Taylor: And of course, by definition, undefined behaviour will not replicate across different platforms. – Oliver Charlesworth Apr 24 '12 at 20:22
  • I need to be able to have that address pointer point to any valid size specifier like `char*, wchar_t, int, long, etc...` The real structure has a member called width that keeps me updated on what's actually at the address. –  Apr 24 '12 at 20:22
  • @TristonJ.Taylor: Perhaps a union would solve your problem? Another solution would be to always use a `void *` member, and cast as required (this would be perfectly well-defined). – Oliver Charlesworth Apr 24 '12 at 20:24
  • "By definition" undefined behavior has no definition, and often does replicate very nicely among platforms. – Kaz Apr 24 '12 at 20:31
  • @Kaz: "Undefined behaviour" indeed has a definition! "*behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements*"... – Oliver Charlesworth Apr 24 '12 at 20:33
  • For some reason C doesn't like me when I cast things from void* when the pointer comes from a pointer! `*(thing->address) = '\0';` Technically I believe if the code gets any more complicated than that nobody wants to take the time to understand it. –  Apr 24 '12 at 20:38
  • "Imposes no requirements" means that there is no actual definition of the behavior itself (which is why it's called undefined behavior), only of the phrase "undefined behavior". Nowhere does it say that things must break across different platforms. Including `` is undefined behavior too. – Kaz Apr 24 '12 at 20:39
  • @Kaz: Fair enough! I'll rephrase: "by definition, one cannot expect/rely-on UB to replicate"... – Oliver Charlesworth Apr 24 '12 at 20:40
  • @TristonJ.Taylor: If `address` is a `char *`, that should work fine. If it's a `void *`, it won't work, and needs a cast: `*((char *)thing->address) = '\0';`. – Oliver Charlesworth Apr 24 '12 at 20:44
  • Man.. that looks like something out of weird science. lol. Actually, looks like a damned good candidate for a macro, if I need to take that route. –  Apr 24 '12 at 20:49
  • @Oli: "Another solution would be to always use a void * member, and cast as required (this would be perfectly well-defined)" - actually this is not well-defined for function pointers (though it will work for mainstream platforms). As far as the standard is concerned, function pointers can be cast to other function pointers and back - that's about it. – Michael Burr Apr 24 '12 at 21:18
  • @MichaelBurr: Of course. But my interpretation of the OP's problem is that the pointer in the struct is being used to address a contiguous buffer of stuff, so that's probably not a concern. – Oliver Charlesworth Apr 24 '12 at 21:19
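
The "always use a `void *` member and cast as required" approach from the comments, including the macro idea, might look like this. `BUFFER_AS` and `TerminateAtStart` are hypothetical names, and the struct is cut down to two fields for the sketch:

```c
#include <stddef.h>

/* Cut-down buffer for this sketch: the address is always a void *. */
typedef struct StdBuffer {
    size_t width;
    void  *address;
} StdBuffer;

/* Hypothetical convenience macro: view the address as a pointer to TYPE. */
#define BUFFER_AS(TYPE, buf) ((TYPE *)(buf)->address)

/* Writes a terminator at the first byte, using the cast from the comment above. */
void TerminateAtStart(StdBuffer *thing)
{
    *((char *)thing->address) = '\0';
}
```

With the macro, `BUFFER_AS(char, &b)[1] = 'Q';` reads a little more easily than the raw cast, at the cost of hiding which type the caller believes the buffer holds.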

There are some constructs which any standards-conforming C compiler is required to implement consistently, and there are some constructs which 99% of C compilers do implement consistently, but which standards-conforming compilers would be free to implement differently. Attempting to cast a pointer to a function which takes one type of pointer into a pointer to a function which takes another type of pointer falls into the latter category. Although the C standard requires a void* and a char* to have the same representation and alignment requirements, it makes no such promise for pointers to other object types (an int*, say), much less for the parameter-passing convention.

While most machines allow bytes to be accessed in much the same way as words, such ability is not universal. The designer of an application binary interface (the document which specifies, among other things, how parameters are passed to routines) might specify that a char* be passed in a way which maximizes the efficiency of byte access, while an int* be passed in a way that maximizes the efficiency of word access. On such a machine, having a routine that expects an int* called from code that expects to pass a char* could cause the routine to access arbitrary wrong data.

supercat

No, it doesn't matter what data type is used to store the data. What matters is the type C uses to read and write that data, and that the data is of sufficient size.

Jonathan Wood