What I previously wrote in a comment:
I first saw this when learning about XAtom
s (Yepp. That's decades ago but I still like the concept with the atom table and use it when it's appropriate) but symbol tables in compilers might do this similar:
You may introduce a std::vector<const char*>
for unique storage of names bundled with a std::map<const char *, size_t>
for reverse mapping of indices to names.
I use const char*
instead of std::string
because I'm afraid of that std::vector::resize()
may copy the strings and invalidate the C-string addresses which I use for reverse mapping (to prevent duplicated storage of strings).
Hence, OP's struct A
could refer to an index in this table. Whether or not duplicated names are allowed is a design decision beyond this atom table.
A sample program (sketch) to demonstrate how it could look like:
#include <cstring>
#include <iostream>
#include <map>
#include <string>
#include <vector>
// atom table to store unique strings
class AtomTable {
private:
// storage of strings (mapping of indices to strings)
std::vector<const char*> _names;
// predicate for lexicographical order of names
struct LessName {
bool operator()(const char *name1, const char *name2) const
{
return strcmp(name1, name2) < 0;
}
};
typedef std::map<const char*, size_t, LessName> Map;
// mapping of strings to indices
Map _map;
static const char* strdup(const std::string &name)
{
char *name2 = new char[name.size() + 1];
strcpy(name2, name.c_str());
return name2;
}
public:
AtomTable() = default;
~AtomTable()
{
_map.clear();
for (const char *&name : _names) delete[] name;
}
AtomTable(const AtomTable&) = delete; // important to disable this
AtomTable& operator=(const AtomTable&) = delete; // important to disable this
size_t find(const std::string &name) const
{
Map::const_iterator iter = _map.find(name.c_str());
return iter != _map.end()
? iter->second // index of name in table
: (size_t)-1; // invalid index
}
size_t get(const std::string &name)
{
size_t i = find(name);
if (i < _names.size()) return i;
// not yet in table -> add
i = _names.size();
_names.push_back(strdup(name));
_map[_names.back()] = i;
return i;
}
const char* get(size_t i) const
{
return _names[i];
}
size_t size() const { return _names.size(); }
};
// a singleton atom table
static AtomTable& atomTable()
{
// like learnt from Meyer:
// https://stackoverflow.com/q/1661529/7478597
static AtomTable atomTable;
return atomTable;
}
// a sample structure
struct A {
size_t iName;
int payload;
A(const std::string &name, int payload):
iName(atomTable().get(name)), payload(payload)
{ }
~A() = default;
A(const A&) = default;
A& operator=(const A&) = default;
};
std::ostream& operator<<(std::ostream &out, const A &a)
{
return out << "A { name: '" << atomTable().get(a.iName)
<< "' (" << a.iName << "), payload: " << a.payload << " }";
}
int main()
{
std::vector<A> aTable = {
{ "first", 1 },
{ "second", 2 },
{ "third", 3 },
{ "first", 4 }
};
size_t i = 0; for (const A &a : aTable) {
std::cout << i << ".: " << a << '\n'; ++i;
}
std::cout
<< "About Atom Table:\n"
<< "Number of entries: " << atomTable().size() << '\n'
<< "'first' in table?: "
<< (atomTable().find("first") < atomTable().size() ? "yes" : "no")
<< '\n'
<< "'other' in table?: "
<< (atomTable().find("other") < atomTable().size() ? "yes" : "no")
<< '\n';
return 0;
}
Output:
0.: A { name: 'first' (0), payload: 1 }
1.: A { name: 'second' (1), payload: 2 }
2.: A { name: 'third' (2), payload: 3 }
3.: A { name: 'first' (0), payload: 4 }
About Atom Table:
Number of entries: 3
'first' in table?: yes
'other' in table?: no
Live Demon on coliru
Note:
Please, bear with me that I didn't do any error checking. This is surely something which has to be added to productive code.
A std::unordered_map
might be used instead of a std::map
. In any case, it's important to provide a predicate which compares the contents of the C-strings not the addresses as how it would be done with e.g. std::less
in std::map
.
As it is, names (atoms) are added only – they are intentionally never removed before destruction of AtomTable
. If removal would be required, I would add some kind of reference counting as well. One of the advantage of Atom
s (beside of the fact that comparing names is very cheap – an integer comparison) is that they can be shared with minimal footprint (an index). Due to this, removing them would be dangerous without proper reference counting. (Not to mention, that managing the unused indices in name storage would become a topic as well.)