
Let's say I want to use 16-bit unsigned integers to describe some category (say, fruits, as in the example below), and:

  • I want to specify the full set of categories first, and then choose at compile time the subset that is actually used in my program, with memory allocated only for that subset

... and, more specifically:

  • The integers/categories are not contiguous, i.e. there are "holes", which raises the issue of "sparse arrays" (see the related question "What is the best way to create a sparse array in C++?")
  • I would like to use the integers/categories as indices into an array of strings associated with the categories, so essentially this represents a key/value store (that is, a "dict")

Functionally, the below example more or less achieves that:

#include <stdio.h>

enum fruit         
{
  FRUIT_APPLE  = 0x1000, 
  FRUIT_ORANGE = 0x1001, 
  FRUIT_PEAR   = 0x2001,
  FRUIT_QUINCE = 0x2002,
  FRUIT_GRAPE  = 0x3000,
  /* etc. */
  FRUIT_MAX
};

// choosing which fruits to use here:
#define USE_APPLE
#define USE_ORANGE
#define USE_QUINCE

const char * const fruit_str[] =
{
  #ifdef USE_APPLE
  [FRUIT_APPLE]  = "my apple",
  #endif //USE_APPLE
  #ifdef USE_ORANGE
  [FRUIT_ORANGE] = "my orange",
  #endif //USE_ORANGE
  #ifdef USE_PEAR
  [FRUIT_PEAR]   = "special pear",
  #endif //USE_PEAR
  #ifdef USE_QUINCE
  [FRUIT_QUINCE] = "special quince",
  #endif //USE_QUINCE
  #ifdef USE_GRAPE
  [FRUIT_GRAPE]  = "epic grape",
  #endif //USE_GRAPE
  /* etc. */
};


int main() {
  printf("Size of fruit_str: %zu\n", sizeof(fruit_str));
  #ifdef USE_APPLE
  printf("Got apple: '%s'\n", fruit_str[FRUIT_APPLE]);
  #endif //USE_APPLE
  return 0;
}

That is a few too many ifdefs to be cleanly readable, I guess (although one could probably use some preprocessor trickery, as in, say, "How to convert enum names to string in c", to make it a bit more readable). But ultimately: my "sum total" definition of categories is in enum fruit; I choose what I want used in my program by defining USE_(fruit) macros; and for those that are "used", I can retrieve the associated string directly through the category integer, say fruit_str[FRUIT_APPLE].

(This kind of setup will also raise a compiler warning if two entries in enum fruit have the same numeric value. That is not because of any restriction on enum, but because those values are used to initialize slots in an array of strings. I appreciate that too, as the category integers are supposed to be unique.)

Except, if I compile and run this program (say in https://replit.com/languages/c), I get:

> ./main
Size of fruit_str: 65560
Got apple: 'my apple'

Well, a fruit_str size of 65560 bytes is a ... bit much, given what I actually want to store as data (here it is "my apple", "my orange", and "special quince", which is 31 bytes, or 34 including null terminators).
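The arithmetic behind that number can be sketched as follows. With designated initializers, the array is sized to the highest index used plus one, so the three entries compiled in here reserve 0x2002 + 1 = 8195 pointer slots; assuming 8-byte pointers (as the reported output suggests), that is 8195 * 8 = 65560 bytes. The name FRUIT_STR_SLOTS is a hypothetical helper introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

enum fruit {
  FRUIT_APPLE  = 0x1000,
  FRUIT_ORANGE = 0x1001,
  FRUIT_QUINCE = 0x2002
};

/* Same three entries as selected by USE_APPLE, USE_ORANGE and USE_QUINCE. */
const char *const fruit_str[] = {
  [FRUIT_APPLE]  = "my apple",
  [FRUIT_ORANGE] = "my orange",
  [FRUIT_QUINCE] = "special quince",
};

/* Number of pointer slots the compiler reserved (hypothetical helper name). */
enum { FRUIT_STR_SLOTS = sizeof fruit_str / sizeof fruit_str[0] };

/* The array length is the highest designated index plus one: 0x2002 + 1. */
static_assert(FRUIT_STR_SLOTS == FRUIT_QUINCE + 1,
              "array length follows the largest designated index");
```

Every slot that is not explicitly initialized holds a null pointer, so almost all of the array is padding.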

So, is there a better way to achieve this kind of statically allocated key/value store (where the keys are "sparse") in C, so that it is "optimizable", that is, so that the memory allocated scales with the number of keys chosen to be compiled into the store?
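One possible direction, shown only as a minimal sketch: store {key, string} pairs in a table kept sorted by key, and look entries up with the standard bsearch function. The names fruit_entry, fruit_table, FRUIT_TABLE_LEN, fruit_entry_cmp and fruit_lookup are hypothetical, introduced for this illustration. The sketch assumes the entries are listed in ascending key order and that at least one USE_ macro is defined (an empty initializer list is not valid C).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum fruit {
  FRUIT_APPLE  = 0x1000,
  FRUIT_ORANGE = 0x1001,
  FRUIT_PEAR   = 0x2001,
  FRUIT_QUINCE = 0x2002,
  FRUIT_GRAPE  = 0x3000
};

/* The keys are meant to fit in 16-bit unsigned integers. */
static_assert(FRUIT_GRAPE <= UINT16_MAX, "fruit keys must fit in 16 bits");

/* Choosing which fruits to use, as in the original example. */
#define USE_APPLE
#define USE_ORANGE
#define USE_QUINCE

/* Hypothetical entry type: one key/string pair per compiled-in fruit. */
struct fruit_entry {
  uint16_t key;
  const char *str;
};

/* Only selected fruits occupy a slot. Must stay sorted by key for bsearch. */
static const struct fruit_entry fruit_table[] = {
#ifdef USE_APPLE
  { FRUIT_APPLE,  "my apple" },
#endif
#ifdef USE_ORANGE
  { FRUIT_ORANGE, "my orange" },
#endif
#ifdef USE_PEAR
  { FRUIT_PEAR,   "special pear" },
#endif
#ifdef USE_QUINCE
  { FRUIT_QUINCE, "special quince" },
#endif
#ifdef USE_GRAPE
  { FRUIT_GRAPE,  "epic grape" },
#endif
};

#define FRUIT_TABLE_LEN (sizeof fruit_table / sizeof fruit_table[0])

/* bsearch comparator: first argument is the searched key, second an entry. */
static int fruit_entry_cmp(const void *key, const void *elem) {
  uint16_t k = *(const uint16_t *)key;
  uint16_t e = ((const struct fruit_entry *)elem)->key;
  return (k > e) - (k < e);
}

/* Returns the string for a compiled-in key, or NULL if the key is absent. */
const char *fruit_lookup(uint16_t key) {
  const struct fruit_entry *hit =
      bsearch(&key, fruit_table, FRUIT_TABLE_LEN,
              sizeof fruit_table[0], fruit_entry_cmp);
  return hit ? hit->str : NULL;
}
```

With this layout the table size is the number of selected fruits times sizeof(struct fruit_entry), independent of how far apart the keys are. The trade-offs are that fruit_lookup(FRUIT_APPLE) replaces direct indexing with a binary search, and that the compiler no longer warns about duplicate keys the way it does for duplicate designated initializers.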

sdbbs
    Well you could implement a hash map (dict). May be a bit much perhaps but it's a good learning exercise, I have done it before :) – Fredrik Mar 12 '23 at 07:11
