0

Let's assume there is an employee ADT, such as

//employee.h

typedef struct employee_t employee_t;

employee_t* employee_create(char* company, char* department, char* position);
void employee_free(employee_t* me);

, and client code would be

#include "employee.h"

employee_t* Kevin = employee_create("Facebook", "Marketing", "Sales");
employee_t* John = employee_create("Microsoft", "R&D", "Engineer");

Now client wanted to use list ADT to insert Kevin and John to list for some task.

//list.h

typedef struct list_t list_t;

list_t* list_create(/*might have some arguments*/);

So client code would then be

#include "employee.h"
#include "list.h"

employee_t* Kevin = employee_create("Facebook", "Marketing", "Sales");
employee_t* John = employee_create("Microsoft", "R&D", "Engineer");

list_t* employee = list_create(/*might have some arguments*/);
list_insert(employee, Kevin);
list_insert(employee, John);

employee_free(Kevin);
employee_free(John);

list_print(employee); //Oops! How to print structure that you can't see?

Because employee is encapsulated by opaque pointer, there is no way for list to copy it.

How to write ADT and implementation for list?

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • 1
    Why not just store `void*` in your list? – Gerhardh Oct 22 '21 at 10:33
  • Your question is unclear: *"How to write this list ADT and implementation?"* What is the problem that prevents you from writing this list ADT and implementation? – user694733 Oct 22 '21 at 10:33
  • This kind of depends on how generic `list_t` needs to be and if you want to do it the old school way (void pointers) or modern way (some manner of pre-processor handling of all supported types). Similar use-case examples of the different ways here: https://stackoverflow.com/a/69379395/584518 – Lundin Oct 22 '21 at 10:34
  • 1
    Why not `list_insert (sizeof employee_t, Kevin)` where you pass the size of the storage to allocate and a pointer to that storage, and `malloc()`, `memcpy()` e.g. `Kevin` to the new block and assign the new block to a `void *` pointer data-member of your list? You would still have to write the functions on how to `printf`, `remove`, `query` the `employee_t` type, but your list would work. This is the Catch-22 of C and types. While you can create a generic list and pass a *function-pointer* as an argument for list operations, you may as well just write a list for `employee_t`. – David C. Rankin Oct 22 '21 at 11:00
  • The main problem in C is that ADTs which make hard copies of the data isn't very sensible, because then you either have to provide `void*` together with size and/or some type flag enum. Or you have to provide some callback interface with function pointers. So one way to implement a sensible ADT is to simply not make any hard copies of the data, but leave such to the caller. Methods like linked lists with fragmented heap allocation all over is pretty outdated stuff anyway, since it isn't very efficient on _any_ mainstream computer from 8 to 64 bit. – Lundin Oct 22 '21 at 13:55
  • @Lundin I'm not totally understanding your comment. What does it mean by **not make any hard copies of the data, but leave such to the caller**? Do you suggest that one better provides struct implementation in header file over hiding it in .c file? – Mr.nerd3345678 Oct 24 '21 at 08:14
  • @Mr.nerd3345678 What I mean with that is to document that your ADT only uses pointers to data in the calling application and doesn't handle storage/make copies internally. – Lundin Oct 25 '21 at 06:14
  • @Ludin Sorry, I'm still not catching up. You mentioned that ADTs (linked lists with fragmented heap allocation) that make hard copies of data isn't sensible, so leave such (make hard copies of data) to the caller. However, my question is that **how does caller make hard copies of data if struct implementation is not in header file?** – Mr.nerd3345678 Oct 25 '21 at 08:50

3 Answers3

1

The usual way to do this is to have your list structure store the data as a void*. For example, assmuming your list is a singly linked list:

struct list_t
{
  void *data;
  struct list_t *next;
};

Now list_insert whould be something like this:

list_t *list_insert(list_t *head, void *data)
{
  list_t *newHead = (list_t*)malloc(sizeof(list_t));
  newHead->data;
  newHead->next = head;

  return newHead;
}

If you want to hide away the implementation of the struct then you can add methods to extract the data. For example:

void *list_get_data(list_t *head)
{
  return head->data;
}
Sean
  • 60,939
  • 11
  • 97
  • 136
0

How do you write generic list without knowing the implementation of structure?

Create functions that handle the structure abstractly.

How to write ADT and implementation for list?

list_create(); needs to pass in helper function pointers for the particular object type to perform various tasks abstractly.

  • A copy function like void *employee_copy(const void *emp) so list_insert(employee, Kevin); knows how to copy Kevin.

  • A free function like void employee_free(void *emp) so list_uninsert(employee_t) can free the list when destroyed or members removed one-by-one.

  • A print function int employee_print(void *emp) so list_print(employee_t) knows how to print each member of its list.

  • Possibly others.

Rather than pass in 3+ function pointers, consider passing in a struct that contains these pointers, then the list only needs the overhead of 1 pointer: list_create(employee_t_function_list)

You are taking your first steps toward re-writing C++

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • Alternatively you can generate those functions for a specific type, such as `list_create_employee`. Though that involves a lot of evil X-macros... – Lundin Oct 22 '21 at 12:59
  • Regarding the spoiler: And not a moment too soon! :) – Lundin Oct 22 '21 at 12:59
-4

You can use something called intrusive list. This concept is heavily used in Linux kernel. All you need is to embed the node into the struct and let the generic code operate only on this struct member.

#include <stddef.h>

struct list_node {
  struct list_node *next;
};

struct list_head {
  struct list_node *first;
};

/* translates pointer to a node to pointer to containing structure
 * for each pointer `ptr` to a `struct S` that contain `struct list_node node` member:
 *   list_entry(&ptr->node, S, node) == ptr
 */
#define list_entry(ptr, type, member) \
  (type*)((char*)ptr - offsetof(type, member))

void list_insert(struct list_head *head, struct list_node *node) {
  node->next = head->first;
  head->first = node;
}

#define LIST_FOREACH(it, head) \
  for (struct list_node *it = (head)->first; it; it = it->next)

The interface can be easily extended by other helpers like list_is_empty, list_first, list_remove_first, embed size to struct list_head.

Exemplary usage:

typedef struct {
  char *name;
  struct list_node node;
} employee_t;

typedef struct {
  char *name;
  struct list_head employees;
} employer_t;

employer_t company = { .name = "The Company" };
employee_t bob = { .name = "Bob" };
employee_t mark = { .name = "Mark" };

list_insert(&company.employees, &bob.node);
list_insert(&company.employees, &mark.node);

printf("Employees of %s:\n", company.name);
LIST_FOREACH(n, &company.employees) {
  employee_t *e = list_entry(n, employee_t, node);
  printf("%s\n", e->name);
}

Prints:

Employees of The Company:
Mark
Bob

Note that the list_* interface can easily used for other types as well.

See article for more information about using this concept for double-linked list.


Edit

Note that list_entry invokes a subtle Undefined Behavior. It is related to performing pointer arithmetics outside of the struct member object but still within a parent object. Note that any objects can be treated as an array of chars.

This code will work on all major compilers and it very unlikely to ever fail because it would break a lot of existing and heavily used code (like Linux kernel or Git).

This program is strictly conforming if struct node is a first member of the embedding struct because C standard allows safe conversion between any structure and its first member.

To be strictly conforming if node is not a first member, The issue could be circumvented by forming a pointer to struct list_node not as &bob.node but rather using a pointer arithmetics on a pointer to bob. The result would be:

(struct list_node*)((char*)&bob + offsetof(employee_t, node))

However, this syntax is really nasty, so personally I would go for &bob.node.

tstanisl
  • 13,520
  • 2
  • 25
  • 40
  • Also the `LIST_FOREACH` edit made the code far worse. Don't re-invent the C language with mysterious secret macros. – Lundin Oct 22 '21 at 11:14
  • @Lundin, only "arithmetic out-of-bounds" is technically right, this macro is variant of `container_of` macro, that is widely used in C projects. misaligned and violation of strict aliasing rules do not apply here, because pointer to `list_node` is costructed from correctly creates embedding structure – tstanisl Oct 22 '21 at 11:14
  • @Lundin, there is no reinvention here, those concepts are heavily used in most popular C projects including Git: https://github.com/git/git/blob/master/list.h, Linux kernel: https://github.com/torvalds/linux/blob/master/include/linux/list.h, – tstanisl Oct 22 '21 at 11:19
  • 1
    Hmm okay I wasn't really following the code (which in turn means that it's hard to read). Regardless, that macro is wildly dangerous. But for the love of crappy code bases stop referring to Linux as if it's something well-written and portable. – Lundin Oct 22 '21 at 11:21
  • @Lundin, it's not hard to read, it's still simple though just conceptually different from placing `void*` into list nodes. This solution advantage for heaving *less* indirection when accessing nodes. It means less cache misses and better performance. moreover it simplifies memory management because `node` is sharing ownership with the element that the node "points to". – tstanisl Oct 22 '21 at 11:24
  • I'm still not following how this magical `n` could be used to look up an item in a linked list. What type is it and where does it come from? – Lundin Oct 22 '21 at 11:26
  • Regarding cache misses, you have absolutely no control over where the `employer_t` elements end up allocated. Presumably in `.data` but there's no other guarantees. If they were allocated adjacently you could use a linked list that uses array index instead of next pointers. – Lundin Oct 22 '21 at 11:29
  • @Lundin `n` is a pointer to `struct list_node`. It's produced from `&bob.node`. Therefore `n` can be converted to `struct employee` via `list_entry` macro. I agree that there is UB related to pointer arithmetics outside of `bob.node`. However, if node was constructed as `(struct node*)((char*)&bob + offsetof(employer_t, node))` there would no *UB*. However, the syntax would be really ugly. – tstanisl Oct 22 '21 at 11:31
  • @Lundin, yes, one could use an array. But this API let delegate the memory management to the caller. This is a very important feature for low-level code. It allows use this API with both automatic, and dynamic objects. It can be used even on platforms that do not support useful `malloc()` – tstanisl Oct 22 '21 at 11:32
  • Out of bounds aside, C also makes no guarantees about coherence between those two different struct types, other than the weird "common initial sequence" rule which requires a union to be present. So it _is_ a pointer aliasing bug as well. – Lundin Oct 22 '21 at 11:34
  • @Lundin, it would be guaranteed to work if `node` was the first element of `employee_t`. In other cases it will be a technical UB, though it *will* very very likely work everywhere. There was a long discussion over this topic in https://stackoverflow.com/questions/25296019/can-a-container-of-macro-ever-be-strictly-conforming thread – tstanisl Oct 22 '21 at 11:39
  • @Lundin, btw what do you mean by "coherence between those two different struct types", which types? – tstanisl Oct 22 '21 at 11:41
  • `list_node` and `employee_t` for example. There's a special rule that lets us convert a pointer to struct to a pointer to the first member or vice versa, but that is only for the first member. – Lundin Oct 22 '21 at 11:45
  • @Lundin, `list_node` is a member of `employee_t` so "strict aliasing rule" allows objects of those type to alias. – tstanisl Oct 22 '21 at 11:48
  • *`#define list_entry(ptr, type, member) \ (type*)((char*)ptr - offsetof(type, member))`* And now you know why Linus Torvalds writes rants about strict aliasing: **because he writes and promotes *CRAP* code like that**. Yes - that's ***CRAP***. [Type-based analysis](https://en.wikipedia.org/wiki/Alias_analysis) is a whole set of optimization techniques that can't be used in the Linux and Git code bases because of ***CRAP*** like that. – Andrew Henle Oct 22 '21 at 11:50
  • It allows "member struct" to be lvalue accessed by "parent struct" but not necessarily the other way around. – Lundin Oct 22 '21 at 11:51
  • @AndrewHenle, this code has nothing to do with violation of "strict aliasing", the potential UB here is pointer arithmentics outside of `bob.node` objects while operating within `bob`. – tstanisl Oct 22 '21 at 11:52
  • @AndrewHenle This code ain't pretty but in all fairness the strict aliasing rule is pretty broken. Anyway, this is getting way off-topic. – Lundin Oct 22 '21 at 11:53
  • @tstanisl You missed my point: Torvalds writes code that "works". Until it doesn't because he willfully invokes UB. And than he publicly acts all irate when his code fails. And blames GCC. – Andrew Henle Oct 22 '21 at 11:54
  • @Lundin, changing l-value of `employee_t` object can change l-value of object pointer by some `struct list_node*`. – tstanisl Oct 22 '21 at 11:54
  • @AndrewHenle, I agree that Torvalds often promotes some controversial practices. But the concept of intrusive containers is older then Mr Torvalds. – tstanisl Oct 22 '21 at 11:56
  • @AndrewHenle, moreover `void*` like solution is not very friendly with any static analysis tools as well. The only type-super-safe solution would be using a dedicated helpers for each type, and likely using a lot of macros to avoid repeating code. – tstanisl Oct 22 '21 at 12:02
  • @AndrewHenle, this answer, follow the philosophy from https://stackoverflow.com/a/22798895/4989451, feel to comment that as well – tstanisl Oct 29 '21 at 15:21