48

Background

I have created a basic linked list data structure mainly for learning purposes. One goal of the list was that it can handle different data structures. Therefore, I've tried my hand at structure composition to simulate "inheritance" in C. Here are the structures that form the basis for my linked list.

typedef struct Link {
    struct Link* next;
    struct Link* prev;
} Link;

typedef Link List;

In my implementation I have chosen to have a sentinel node that serves as both the head and the tail of the list (which is why Link == List).

To make the list actually handle data, a structure simply includes the Link structure as the first member:

typedef struct {
    Link link;
    float data;
} Node;

So the linked list looks like this.

         ┌───┬───┬───┐     ┌───┬───┐     ┌───┬───┬───┐     
... <--->│ P │ N │ D │<--->│ P │ N │<--->│ P │ N │ D │<---> ... 
         └───┴───┴───┘     └───┴───┘     └───┴───┴───┘
         End Node          myList        First Node

List myList;
Node node1 = {{}, 1.024};
....
Node nodeN = {{}, 3.14};

list_init(&myList) // myList.next = &myList; myList.prev = &myList;
list_append(&myList, &node1);
....
list_append(&myList, &nodeN);

Question

To traverse this list a Node pointer initially points to the First Node. It then traverses along the list until it points to the sentinel again then stops.

void traverse()
{
    Node* ptr;
    for(ptr = myList.next; ptr != &myList; ptr = ptr->link.next)
    {
        printf("%f ", ptr->data);
    }
}

My question is with the line ptr != &myList. Is there a pointer alignment problem with this line?

The for loop correctly produces the warnings: (warning: assignment from incompatible pointer type and warning: comparison of distinct pointer types lacks a cast) which can be silenced by doing what it says and casting to Node*. However, is this a DumbThingToDo™? I am never accessing ptr->data when it points to &myList as the loop terminates once ptr == &myList.

TLDR

In C structs, a Base* can point to a Derived if Base is the first member in Derived. Can a Derived* point to a Base if none of Derived specific members are accessed?

EDIT: replaced relevant function calls with their equivalent inline code.

Bector
  • 1,324
  • 19
  • 35
thndrwrks
  • 1,644
  • 1
  • 17
  • 29
  • 29
    Unrelated to the question itself, I just wanted to say the presentation is *stellar*. – WhozCraig Jan 27 '15 at 20:52
  • 1
    I don't think you'll have a pointer alignment issue, but I would ask: why not make `List` an alias for `Node` and just ignore the `data` member for `List`? Or just make it a `Node` directly (define `Node myList;` instead of `List myList;`) That would be a cleaner way to avoid the pointer casting concern. And I agree with the other comment: nice job stating the problem clearly. (+1) – lurker Jan 27 '15 at 20:54
  • `&myList` is not a `Node*`, it's a pointer to a `Node *`, or a `Node **`. I think you need a second variable set to `list_first(&myList)` or just call `list_first(&myList)` in the `for` and hope the compiler optimizes well. – Brian McFarland Jan 27 '15 at 21:03
  • I have inline list_first() and link_next() in my question. – thndrwrks Jan 27 '15 at 21:42
  • @lurker I considered defining Node myList but it just _seems_ that a list should be of type `List` ;). – thndrwrks Jan 27 '15 at 21:43
  • Specific to your example, but since you don't access *any* members of `myList` through a `Node*`, there's no alignment issue because alignment is irrelevant if a pointer isn't actually dereferenced. Just comparing pointers by value is always permitted. – Alex Celeste Jan 27 '15 at 23:41
  • As a workaround, why don't you cast both pointers to `void*` (one is sufficient, actually)? The guarantee about no padding at the beginning of the structure should guarantee that those `void*` values should compare equal IFF both contain the address of the same `Node` object, as far as I can tell. Null pointer values should also work fine as sentinels. – dyp Jan 28 '15 at 00:37
  • The standard makes dire warnings against casting different pointers or the like, but that's for bizarre architectures and aggressive optimization. In practice, the linked list node as beginning of larger struct works fine and is common place. – JDługosz Jan 28 '15 at 06:47
  • It's actually unclear what `ptr != &myList` is intended to do; it's not allowed by standard C, and gcc accepts it with a warning. Is it intended to be `(List *)ptr != &myList` or `ptr != (Node *)&myList` for example? I would suggest that the first is ok but the second could in theory violate alignment constraints, as could `ptr = ptr->link.next` (which should be `ptr = (Node *) ptr->link.next`). – davmac Mar 09 '16 at 13:23
  • Pretending that a `Base*` pointer is a `Derived*` pointer and vice-versa, isn't that a violation of the strict-aliasing rule ? – Cyan Jan 17 '19 at 07:49

3 Answers3

22

Kudos for your presentation.

I think your implementation should work fine, because C guarantees that the address of a struct is the address of its initial member. Put aside the statements C makes about alignment of struct-members, this guarantee should mean that, as long as your implementation always puts Link as first member, this should not cause alignment issues.

from here: C99 §6.7.2.1:

13 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning

This should be what you meant to say about Base * and Derived *, although no such thing exists in pure C. They are just structs which happen to have the same memory layout.

However I think it is a bit brittle to implement it like this, because Node and Link directly depend on each other. If you were to change the structure of Node, your code would become invalid. At the moment I don't see the point in having an extra struct Link, apart from you being able to just write a new Node for a new type reusing Link.

There is actually a linked list implementation that immediately came to mind when I saw your post and works in a way very similar to the way you intend to use your list: the kernel list

It uses the same List-element (list_head):

struct list_head {
    struct list_head *next, *prev;
};

It contains this function macro:

#define list_for_each_entry(pos, head, member)                          \
      for (pos = list_first_entry(head, typeof(*pos), member);        \
           &pos->member != (head);                                    \
           pos = list_next_entry(pos, member))

If you look at the way the macro is implemented, you will see that it offers iteration over the entries of a list without knowing anything about the layout of the entries the list is contained in. Assuming I interpret your intention right, I think this is the way you would want it to be.

Waqar
  • 8,558
  • 4
  • 35
  • 43
midor
  • 5,487
  • 2
  • 23
  • 52
  • 1
    Upvoted, although it has irked me for some time that C99 doesn't say what sort of conversion would be a "suitable" conversion. (Yes, it probably means "when converted to the appropriate type", but why doesn't it just say that then?) – davmac Jan 27 '15 at 22:38
  • It is very funny that you think so too. I thought about including a comment on this for quite a while. However for the OPs question, I think the fact that the pointer to the initial member and the pointer to the struct itself should be identical, is what really matters. – midor Jan 27 '15 at 23:02
  • In practice, yes. There is an issue with the head/tail in OP's question though. Because this initial `Link` object isn't part of a `Node` (unlike the rest of the list), the pointer conversion in `ptr = ptr->link.next` isn't defined (6.7.2.1p13 cannot apply). In practice, I doubt this will ever cause a problem. – davmac Jan 28 '15 at 07:30
4

In C structs, a Base* can point to a Derived if Base is the first member in Derived. Can a Derived* point to a Base if none of Derived specific members are accessed?

If you mean "can a Derived* point to an arbitrary Base (including one which is not a first member of a Derived object)", then: Technically, no. My understanding of Defect Report #74 is the alignment requirements can be different.

Q: If a structure has a field of type t, can the alignment requirements of the field be different from the alignment requirements of objects of the same type that are not members of structures? If the answer to (a) is ``yes,'' then where applicable the remaining questions should be assumed to have been asked for both objects within structures and objects outside structures.

A: Subclause 6.1.2.5 says, '... pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements.' Subclause 6.5.2.1 says, `Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.' And later, 'There may therefore be unnamed padding within a structure object, ... as necessary to achieve the appropriate alignment.' a) It is possible for an implementation to state generalized requirements to satisfy sublause 6.1.2.5. These requirements may be further strengthened using the implementation-defined behavior made available in subclause 6.5.2.1. Yes, the alignment requirements can be different.

davmac
  • 20,150
  • 1
  • 40
  • 68
ouah
  • 142,963
  • 15
  • 272
  • 331
  • 2
    While this is probably true for the most standards-compliant code possible, I very much doubt there is a single practical and even semi-popular architecture in existence whose ABI actually does so. – Dolda2000 Jan 27 '15 at 21:10
  • @davmac The "unnamed padding" line is part of the introduction for all answers in the DR and I think the "unnamed padding" quote is put for the other answers of this DR that relate specifically to padding. Regarding the *stronger* requirement you mention, I see nothing that goes in that direction in the present DR, it is just written they *can be different*. – ouah Jan 27 '15 at 21:25
  • I think if Defect Report #74 does apply to the specific questions quoted, then it is wrong. Remember C99 §6.7.2.1 also includes the phrase _"and vice versa"_ and I don't see if or how qualifying the second question with "if none of Derived specific members are accessed" changes that. – Greg A. Woods Mar 07 '16 at 01:46
  • @GregA.Woods accessing a "Derived specific member" requires that a Derived object exists; the question is asking about whether a "Derived*" can "point to a Base" (no Derived object exists). I believe the qualification is there to make clear that no access occurs through the `Derived*`, and that the question is purely about whether such a pointer can exist, and not about whether it can be dereferenced. §6.7.2.1 does not say whether a cast from "Derived*" to "Base*" is always a "suitable conversion", eg. whether it necessarily grants that §6.3.2.3 alignment requirements are met. – davmac Mar 07 '16 at 13:49
  • First, that's not what the question asks. Second we must assume, given the question, that a `Derived` object is implied to exist by the fact that a pointer to it has has a non-NULL value. It does not matter if access occurs in the scope of the question or not. Also, any modern optimizer worth its salt can extrapolate that casting the pointer to point to a `Base` object, and presumably then using it to access a `Base` object clearly implies that _the_ first member of the `Derived` object is in fact being accessed since the only manipulation of the pointer is to cast it. – Greg A. Woods Mar 07 '16 at 17:54
  • And, yes, that's exactly what §6.7.2.1 does say with the words _"and vice versa"_. – Greg A. Woods Mar 07 '16 at 17:55
  • As Derek Jones says in his commentary ("The New C Standard") of that sentence: _"The only reason for preventing implementations inserting padding at the start of a structure type is existing practice (and the resulting existing code that treats the address of a structure object as being equal to the address of the first member of that structure)."_ That the alignment must be identical is conferred by other sentences. If the pointer is also not dereferenced after the conversion either, then obviously its alignment is irrelevant before and after. – Greg A. Woods Mar 07 '16 at 18:07
  • @GregA.Woods "that's not what the question asks" - I can only assume you are referring to the wrong question. The question quoted in this answer clearly asks what I said. Yes, the padding requirement states no padding at the beginning of this structure, and yes, this answer is wrong - but §6.7.2.1 and the words "vice versa" in my eyes have nothing to do with it. The OP's code doesn't cast "A pointer to a structure object", because the structure object doesn't exist. I.e. "pointer to a structure object" is not the same as "pointer with type `struct xyz *`". – davmac Mar 08 '16 at 12:04
  • (and actually, I'm not certain now that this answer is wrong. I may have misread it earlier. if OP's question is rephrased as "can the alignment requirements of Derived be stricter than those of Base" then the answer should surely be yes, and conversion of a `Base *` to a `Derived *` could indeed violate alignment. OTOH it is certainly possible to have a `Derived *` point at a `Base` object by means of membership, just not at _any arbitrary_ `Base` object). – davmac Mar 09 '16 at 13:47
1

I wrote this in the comments to ouah's answer, but I'll present it as an answer in its own right.

It is true as ouah writes that the code in your question could have an alignment problem in theory if you go by the C standard. However, I don't think there's a single architecture that your code is likely (or unlikely) to run on whose ABI exhibits such alignment patterns. At least none that I know of, for sure, and that includes experience with a few different desktop and embedded architectures. I can also not really imagine an architecture that would gain anything from such alignment properties.

It is also worth noting that this pattern is indeed used in practice quite commonly; it's kind of like the fact that you can always fit ints in pointers, in that it's not guaranteed by any cross-platform standard, but there's virtually no platform on which it's not true, and people do it all the time.

Dolda2000
  • 25,216
  • 4
  • 51
  • 92
  • "I can also not really imagine an architecture that would gain anything from such alignment properties" - consider that word size increases sometimes require corresponding alignment. On 32-bit x86, SSE2 extensions require 8-byte alignment to efficiently deal with 64-bit values; but pointers only require 4-byte alignment. It's therefore _currently_ quite possible to have non-trivial structures with differing alignment requirements. (On the other hand, of course, x86 doesn't enforce alignment on pointer load, so OP's code would likely still be fine). – davmac Mar 09 '16 at 20:57
  • @davmac: While that's true in a more general sense, I don't see that it applies to OP's case, where he only accesses `Link` members via `Node` pointers, and not the other way around. There could be alignment problems of the kind that you mention when accessing `Node` members via `Link` pointers, but OP explicitly avoids that. – Dolda2000 Mar 10 '16 at 06:03
  • I'm talking about alignment requirements, not aliasing restrictions. Regardless of access or not, `Node` might have a stricter alignment requirement than `List`, rendering `ptr = ptr->link.next` as _non-strictly-conforming_. As you say, however, this is unlikely to cause issue on most real-world architectures (which I allude to above: x86 doesn't enforce alignment on pointer load). – davmac Mar 10 '16 at 09:37
  • @davmac: That's precisely what I meant, though. `Node` may have stricter alignment requirements than `List`, but not the other way around, and since no data structure that has been aligned according `List` is accessed as `Node`, that shouldn't be a problem. No? Even though `ptr->link.next` is dereferenced via a `Link` pointer, it will still have been allocated according to `Node` alignment. – Dolda2000 Mar 14 '16 at 05:32
  • "since no data structure that has been aligned according List is accessed as Node" - not correct; check OP's question - _In my implementation I have chosen to have a sentinel node that serves as both the head and the tail of the list_; this sentinel node is defined as a `List` rather than a `Node` (`List myList;`) and is then potentially addressed via `ptr = ptr->link.next` in the for loop in the `traverse` method. – davmac Mar 15 '16 at 10:41
  • (Eck. it's not _accessed_ as a node, that's correct. But a `Node*` is coerced to point at a`List` (not `Node`) object, so alignment comes into play, since pointer conversion is then undefined if the `List` object doesn't have alignment suitable for `Node`). – davmac Mar 15 '16 at 18:13