5

I got a class that when instantiated needs to obtain a few unique ids to work. Initially I thought using an static function that assigns and increments. I don't need them to be consecutive, only unique.

class A {
    int id_1;
    int id_2;
    int id_3;
public:
    static int last_id=0;
    static int get_id(){ return A::last_id++; }
    ...
    A(){ id_1 = A::get_id(); id_2 = A::get_id(); id_3 = A::get_id(); } 
};

Now, I' thinking in going multithreading. I think the static function will be a bottleneck, since I'm constructing a few hundred thousand instances of these objects at the start. I don't destroy any instance until the end of the program, so after initialization they are fixed. Anyway they are not computed at compile time because the quantity depends of command-line arguments.

An alternative I was thinking of was using memory addresses, they are unique in a single computer at least.

Something like:

class A {
    int* id_1;
    int* id_2;
    int* id_3;
public:
    static int last_id=0;
    static int get_id(){ return A::last_id++; }
    ...
    A(){ id_1 = new int(0); id_2 = new int(0); id_3 = new int(0); } 
    ~A() { delete id_1; delete id_2; delete id_3(); }
};

Then I would read the identifiers as the address of the pointers.

Question: Does this make any sense to use pointers like this?

dvicino
  • 1,469
  • 1
  • 11
  • 19
  • 1
    They will only be unique for this process (and they might not be in order). Plus I don't see how `new` is meant to be less of a bottleneck than doing `++` on an int! – M.M Jul 31 '14 at 00:02
  • 1
    as far as i know, `new` is thread-safe, and will be just as much a bottleneck – sp2danny Jul 31 '14 at 00:03
  • 1
    There's also absolutely NOTHING enforcing memory pointers to be GUID's, unless you plan on never deallocating anything. As soon as some memory is freed, its pointer space is back up for grabs. – aruisdante Jul 31 '14 at 00:06
  • possible duplicate of [Algorithm for generating a unique ID in C++?](http://stackoverflow.com/questions/1988679/algorithm-for-generating-a-unique-id-in-c) – aruisdante Jul 31 '14 at 00:06
  • @aruisdante That what I got guaranteed saying I construct only at start and dont destroy until exiting. – dvicino Jul 31 '14 at 00:09
  • @sp2danny But, can I be as efficient as new? Maybe using Atomic or you suggest another way to be threadsafe on the increment? – dvicino Jul 31 '14 at 00:10
  • If this is effectively runtime static initialization, and you never de-allocate or re-allocate at runtime, why not simply pre-allocate the set of GUID's you're going to need, then do a simple divide-and-conquer over that span to do your object allocation in a threaded way? – aruisdante Jul 31 '14 at 00:10
  • You may use address of members of A instead of allocating memory... – Jarod42 Jul 31 '14 at 00:14
  • 6
    I feel like you're optimizing before you really know what any bottleneck is. Incrementing an integer is going to be very cheap! Suggestion: use `std::atomic` instead of `int` so it works with multiple threads. – Travis Gockel Jul 31 '14 at 00:17
  • Why not use the `this` pointer for the ID? – Neil Kirk Jul 31 '14 at 00:18
  • `void*` member made to point to itself seems quick and easy – sp2danny Jul 31 '14 at 00:19
  • BTW, Don't forget to provide correct copy constructor or forbid it. – Jarod42 Jul 31 '14 at 00:21
  • @NeilKirk: OP seems to have several unique IDs by object... – Jarod42 Jul 31 '14 at 00:22
  • @NeilKirk Because this is only one value, and I need multiple per object. Anyway, it may be safe to use this, this+ 1, this+2, this+3.. this+n if I have any guarantee that it will not pass the size of the object. (I can force it creating n variables at the start of the object I guess). – dvicino Jul 31 '14 at 00:23
  • @Jarod42 Make the ids `uintptr_t` which contain their own address. – Neil Kirk Jul 31 '14 at 00:23
  • Can't the thread-safety measures be skipped if they will only be created before all thread spawns and destroyed after all joins? – aschepler Jul 31 '14 at 00:25
  • @TravisGockel I thinking the same now. – dvicino Jul 31 '14 at 00:26
  • @aschepler yes, but with so many objects to create it might be nice to paralelize it if possible. – dvicino Jul 31 '14 at 00:28
  • 1
    @dvicino: care, `this + 1` is already outside of `this` object (is equivalent to `&(this[1])`). What you mean is `reinterpret_cast(this) + n`. The later will be outside of the object when `n >= sizeof(*this)`. – Jarod42 Jul 31 '14 at 00:32
  • @Jarod42 yeap, I tried to say that, replied fast and messed up. – dvicino Jul 31 '14 at 00:36

5 Answers5

9

You're really not that far off with your original solution. Don't prematurely optimize! Incrementing an int is very cheap! My only suggestion would be to use std::atomic<int> instead of int.

class A {
    int id_1;
    int id_2;
    int id_3;

    static int get_id() {
        static std::atomic<int> next_id(1);
        return ++next_id;
    }

public:
    A() :
        id_1(get_id()),
        id_2(get_id()),
        id_3(get_id())
    { }

    // deal with copying by disabling
    A(const A&) = delete;
    A& operator=(const A&) = delete;

    // move is okay
    A(A&&) noexcept = default;
    A& operator=(A&&) noexcept = default;
};

Assuming you don't create more than 2^31/3 instances of A, you don't have to worry about overflow.

Travis Gockel
  • 26,877
  • 14
  • 89
  • 116
  • I like your approach (it seems that we've made similar suggestions at the same moment) ! But if the unique id is an int, is there really a need to disable copying ? WOuldn't it be sufficient to just handle the special case (i.e. creating a new id for copy constructor and keeping the target id unchanged for assignment) ? – Christophe Jul 31 '14 at 00:56
  • Probably yes, but I don't know what other members @dvicino's object has (assuming `A` isn't just 3 IDs). – Travis Gockel Jul 31 '14 at 01:07
  • @TravisGockel I measured the construction of 200.000 objects using this approach and it took 0.009 seconds with -O0 compiling (in a single thread) in a 2010 pc. You was right no meaning in optimise this further. Its a one time operation and its pretty fast already. I will measure before next time. – dvicino Jul 31 '14 at 02:49
  • If you don't care about the order of ids, you can even take this one step further and use a fetch_add with relaxed memory ordering. It probably won't make any difference on x86, but some of the weaker architectures, it can speed up the increment significantly. – John5342 Jul 31 '14 at 08:37
  • And if we really want to avoid congestion, use thread local counters instead. As long as we know the number of threads this is safe and avoids any synchronization. – Voo Jul 31 '14 at 10:44
2

I've used something like the following as a quick hack in C before - when I needed some unique values that were only unique over the lifetime of the process.

constants.h

extern const void * id_1;
extern const void * id_2;

constants.c

const void * id_1 = &id_1;
const void * id_2 = &id_2;

No worry about cleanup etc since they're extern'd globals.

In C++ for a class instance you could use the same idea, but localized to the instance:

class A {
    void* id_1;
    void* id_2;
    void* id_3;
public:
    A(){ id_1 = &id_1; id_2 = &id_2; id_3 = &id_3; } 
};

Note that the ids will only be unique while the instance exists - but you've said that they're allocated only destroyed at application exit - so you should be OK.

Note: I do consider this a hack, and with C++11s std::atomic the solution provided by Travis Glockel above is as simple and more robust. However without C++11 implementing atomic or adding an atomic library is messy.

Michael Anderson
  • 70,661
  • 7
  • 134
  • 187
  • The OP has stated they'll need several hundred-thousand GUID's. – aruisdante Jul 31 '14 at 00:14
  • @aruisdante: OP can use members address. – Jarod42 Jul 31 '14 at 00:16
  • The advantage is that your trick avoids an extra new. However, if you use a member adress, your trick is equivakebt ti using `this` as a unique id, with all the issues related to loss of identify in case of copies or moves of objects – Christophe Jul 31 '14 at 00:38
  • @Christophe - the values of the ids are only setup in the default constructor - they're preserved on copy of the object. So you don't lose identity on copy. i.e. the copy will share the ids of the parent by default. – Michael Anderson Jul 31 '14 at 00:58
  • @Jarod42 the OP uses the new just to create dummy ints that are not reallocated. But you are right: in any case, the unique id (whether pointer or int) needs special handling for copy/move in order to ensure uniqueness of objects. – Christophe Jul 31 '14 at 01:01
  • @MichaelAnderson Thanks for the explanation. Then of course, it should work :-) – Christophe Jul 31 '14 at 01:04
  • @chritophe I'm not sure if that was a sarcastic smiley comment or not. So I'll explain my rationale. I'd expect a copy to share the ids. If you don't want that - then a trivial implementation of the copy constructor and assignment operators can ensure copies have unique values too. Or you can disable copying altogether... It really depends on how you expect copies to behave. – Michael Anderson Jul 31 '14 at 01:09
  • Nice idea, I never thought about externs in this way. – dvicino Jul 31 '14 at 02:51
1

I just came up with this, so I don't know if it's safe! Add as many ID members to the class as you need.

struct ID
{
    uintptr_t GetID() const
    {
        return reinterpret_cast<uintptr_t>(this);
    }
};
Neil Kirk
  • 21,327
  • 9
  • 53
  • 91
  • 1
    The disadvantage to this is your "identity" changes on copy. – Travis Gockel Jul 31 '14 at 00:28
  • @TravisGockel That may be an advantage. Surely a copy should have a different ID to its original? – Neil Kirk Jul 31 '14 at 00:29
  • Copies need to be different id in my particular case, it can be adjusted in copy constructor. – dvicino Jul 31 '14 at 00:32
  • If one day you'd store your object in a vector, the object would risk to loose its identity whenever the vector capacity is increased. – Christophe Jul 31 '14 at 00:33
  • @Christophe Then dynamically allocate the objects and disable copying and moving. It's no worse than taking the address of `this` or members which are suggested elsewhere. – Neil Kirk Jul 31 '14 at 00:35
  • If the value from that function is assigned to a constant in the construction the problem with move and vectors is solved. – dvicino Jul 31 '14 at 00:39
  • -1 dvicino: Sounds like you are trying to do something wrong. And the "each copy has a unique id" needs to be added to the question. Neil: You should test if your answer is (Thread-)safe before posting on a threadding problem – FallenAvatar Jul 31 '14 at 00:53
  • @NeilKirk I see ! Couldn't you then easily keep the object's identity despite future moves, by simply initialising the unique id with your reinterpret_cast<> in the constructor instead of returning the current address each time ? – Christophe Jul 31 '14 at 01:12
  • @TJMonk15 My experience with threading is limited, so I have no idea if it's safe. I did put a disclaimer! – Neil Kirk Jul 31 '14 at 09:40
1

The uniqueness of the id implies a bottleneck whether it is by making the counter thread safe, or relying on thread safe new.

You'd better make last_id an atomic<int> and use operator ++ to increment it as an atomic operation. I believe this will be much more efficient than all the heap management stuff that new would have to do in order to allocate a dummy int.

If this would become a real bottleneck, you could use an slightly different approach, using id generator local to each thread, and combine the local (i.e. without race) id, with the thread id (this_thread::get_id() is unique as long as the threads remain joinable), in order to have a unique id accross all the threads.

A variant of this, would be to use a static atomic int as suggested earlier, but allocate blocks of several ids at once (and manage the cached ids locally in each thread).

Christophe
  • 68,716
  • 7
  • 72
  • 138
-1

Other answers have mainly focused on alternative solutions rather than asking what the problem is with the new approach as the OP asked.

Downsides

  • You are allocating a pointer as well as the storage for an int which could hold an Id. For a few hundred thousands ids, this is 2-3 times as much storage as you need (although still quite modest).
  • Using addresses as unique ids means that you do not get a predictable ordering. For example, if you put the ids in a map<> and then walk along it, you'll get different order even when the system has the same inputs. This can lead to unpredictable behavior, and will make debugging painful.
Keith
  • 6,756
  • 19
  • 23