1

This program is a different approach over the same old singly linked list. Instead of creating a single structure and keeping the pointer of the next node of the same type as a member of the structure, I here used three different structures, and a plain long nextaddress member in each to point the address of the next node, because pointers also do the same.

Each node also has a int flag member as the first item of the structure, and the data part resides at the end of the structure because of its variable length.

The three structures are basic extension of the built-in types, long, double and char. While accessing the structures, I first casted the address of the node as an int *, which gives me access to the flag without fully typecasting the address to a definite structure from the three.

Then analyzing the flag, various operations are done.

So here's my question. Can it be called a valid linked list? And moreover, is it even a valid data structure?

Cimbali
  • 11,012
  • 1
  • 39
  • 68
Subhranil
  • 851
  • 1
  • 8
  • 23
  • There is nothing at all wrong with a linked list node containing multiple things. In fact, this is completely normal. – Tim Biegeleisen Dec 14 '16 at 09:10
  • I guess its fine if that is what you want. However, why not just make life easier and have a pointer to the next node? If your able to store and access data without issues, then it is a valid data structure, just not a very common one. – RoadRunner Dec 14 '16 at 09:13
  • If i keep a pointer to the next node, then I'll have to keep all pointers of other structures to every structure, and even another flag to determine which is in use. That's why i used this strategy. @RoadRunner – Subhranil Dec 14 '16 at 09:19
  • @TimBiegeleisen, thank you for your answer. Can you give a practical example where this type of structure is used? – Subhranil Dec 14 '16 at 09:20
  • Sure, off the top of my head, you could have a list of some C++ class which represents a log event/object. This class could have members for the date when the log occurred, the log content, and the log level (e.g. debug, error, etc.). – Tim Biegeleisen Dec 14 '16 at 09:22

4 Answers4

2

Your code has many problems, but only referring to your specific questions about structs

Standard C grants that you solution works

6.7.2.1 Structure and union specifiers

Sub chapter 15

Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

Emphasis mine

You should use gen_struct inside the other struct:this grants to respect the rule of "initial member".

struct gen_type {
    int flag;
    void *nextaddress;
};

struct node_type_int {
    struct gen_type header;
    long data;
};

struct node_type_real {
    struct gen_type header;
    double data;
};

struct node_type_char {
    struct gen_type header;
    char data;
};

As you can see I also changed type of nextaddress: is meant to be a pointer, so use a pointer.

Side note: do-i-cast-the-result-of-malloc

Community
  • 1
  • 1
LPs
  • 16,045
  • 8
  • 30
  • 61
  • However, standard C does not guarantee that `dataaddress = ((long)(&generic->nextaddress)) + sizeof(long);` works. Also there are pointer aliasing problems in the code. – Lundin Dec 14 '16 at 10:15
  • @Marian The questions are: _Can it be called a valid linked list? And moreover, is it even a valid data structure?_ So is not Code Review, or check my code service. I pointed out the part of OP code related to questions. – LPs Dec 14 '16 at 10:23
  • @LP In my understanding the question is talking about particular code presented in the link. Your answer could give impression that OP's code is valid while it is not. Your code on the other hand is valid. – Marian Dec 14 '16 at 10:25
  • @Lundin For sure, but I answered to OP questions: _Can it be called a valid linked list? And moreover, is it even a valid data structure?_ – LPs Dec 14 '16 at 10:30
2

It is fine to do this as long as you have a means to tell which type each node contains, like the flag variable.

It seems reasonable that you can assume that flag and nextaddress will be on the same struct offsets no matter which struct type you use. Although strictly speaking the C language does not guarantee this, I believe it will work in practice on any system out there.

However, you cannot assume that data is located at (uint8_t*)&my_struct + sizeof(int) + sizeof(long). This offset may vary between structues because of different alignment requirements.

A more serious concern is pointer aliasing. You cannot take a pointer struct* x and convert that to another pointer type struct* y. It will compile, but this is a violation of the type rules in C and invokes undefined behavior (unless both structs have exactly the same members). Standard-compliant compilers that use aggressive optimizations (like GCC) will not compile such code as expected. (What is the strict aliasing rule?)

To be on the safe side and to get better program design overall, I would recommend that you instead do this:

typedef struct node
{
  long   nextaddress;
  type_t type; // some enum
  void*  data;
} node_t;

Where data is allocated separately from a node. You get a chained linked list of sorts.

Community
  • 1
  • 1
Lundin
  • 195,001
  • 40
  • 254
  • 396
  • I have not casted a `struct* y` to `struct* x` and vice versa. I only casted the addresses to the `gen_type`, which, according to C standard is completely permissible, and all of the four structures will follow the same memory layout for initial common members. No unnamed padding is at the beginning, and not anywhere between the `int flag` and `long nextaddress`, as both is a complete multiple of 4, which is again, to quote you, `will work on practice in every system out there`. The problem lies in the `data` part, more specifically in `node_type_char`. – Subhranil Dec 14 '16 at 10:37
  • That can also be overcome by using some calculations.
    `offset_of_data = sizeof(node) + start_address - sizeof(data)`
    To be clear, I want specifically different structures in the same list, a heterogeneous linked list if I may.
    – Subhranil Dec 14 '16 at 10:38
  • And meanwhile, the whole code is tested and compiled in GCC itself. So I'm not quite sure about aggressive optimizations. – Subhranil Dec 14 '16 at 10:45
  • @Subhranil No, according to standard C, they will have the same memory layout _for the very first member_. After that, no guarantees. The pointer aliasing problem is not just related to structs, but any data types that are not compatible. For example, you cannot take a pointer-to-int and convert it to a pointer-to-double. – Lundin Dec 14 '16 at 10:48
  • `Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.` I think I read that from a previous answer. – Subhranil Dec 14 '16 at 10:51
  • @Subhranil The problem is "There may be unnamed padding within a structure object, but not at its beginning." But as I wrote this is mainly a theoretical/formal problem, I very much doubt there will be a difference in practice. – Lundin Dec 14 '16 at 10:55
  • And to the pointer aliasing problem, let me try in the way I visualized that. Let's say the program is compiled in a 64bit GCC. So, the size of `node_type_char` is 4+8+1+13_pads = 24. So if i cast the first 32bits, I get whatever the value of the bitmap in a 32bit integer. If i cast first 64bits, I get whatever the value of the bitmap in a 64bit long, which should contain the `int` and half of the `long`. This assumption is particularly correct in my experiments, and the same I implemented in the program. – Subhranil Dec 14 '16 at 11:01
  • @Subhranil The problem is that the compiler is free to assume that the value is never accessed at all, since doing so through an incorrect pointer type would be undefined behavior. Which in turn may cause it to generate incorrect code. Check the link I posted. – Lundin Dec 14 '16 at 11:46
0

Nice. Out of the box, I liked different way of doing it! but ...

Considering memory alignments differences across don't you feel there could be issue with below code:

dataaddress = ((long)(&generic->nextaddress)) + sizeof(long);

Here you are assuming the data is stored continuously and hence calculating address of data by adding sizeof long to the address of next address. But this need not be the case always right?

Instead, I feel its better to typecast to the appropriate structure type after reading the flag value, and then read data.

Considering such challenges, though you are saving on some memory but adding some more processing. So though it looks like a valid linked list, implementation can be decided upon choice of memory vs processing.

Dipti Shiralkar
  • 362
  • 2
  • 6
  • Yup, it is the ultimate point. Although the processing is kinda difficult to imagine and implement from the programmer point of view, I don't think it'll take much overhead of the processor to perform some simple additions and addressing, compared to the leak of a few bytes in each instance. I'd rather have a compact memory at the cost of a few more seconds of processing. What say? – Subhranil Dec 14 '16 at 10:18
0

Your code has at least two weak points:

1.) conversion between long and pointer and back. The C standard allows this only if the integer type can hold the necessary range, which may not be the case on architectures with 32-bit long integers and 64-bit pointers.

I am adding quotation from C11 standard, section 6.3.2.3 Pointers:

An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.67)

Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

67) The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment

2.) implementation supposes that fields in two different structures (where types of preceding fields match) are stored at the same place (offset) within the structure. Although this is true for most implementations, the standard guarantee this only if those two structures are part of a union.

Section 6.5.2.3 Structure and union members:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Tw o structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

Marian
  • 7,402
  • 2
  • 22
  • 34