
Isn't it a waste of time to initialize a vector with zeros, when you don't want it?

I tried this code:

#include <cstdio>   // printf
#include <vector>
#include <array>

#define SIZE 10

int main()
{
#ifdef VECTOR

  std::vector<unsigned> arr(SIZE);

#else

  std::array<unsigned, SIZE> arr;

#endif // VECTOR

  for (unsigned n : arr)
    printf("%i ", n);
  printf("\n");

  return 0;
}

and I get the output:

with vector

$ g++ -std=c++11 -D VECTOR test.cpp -o test && ./test 
0 0 0 0 0 0 0 0 0 0 

with an array

$ g++ -std=c++11 test.cpp -o test && ./test 
-129655920 32766 4196167 0 2 0 4196349 0 1136 0 

I also tried with clang++.

So why zeros? And by the way, could I declare a vector without initializing it?

Jim Lewis
Moises Rojo
  • They were designed that way many years apart, so maybe the thinking changed. `std::array` follows "you don't pay for what you don't need" better than `std::vector`. – juanchopanza Feb 13 '18 at 20:56
  • `std::vector`: Because of the defaulted constructor parameter, see (2) in: http://en.cppreference.com/w/cpp/container/vector/vector _"Constructs the container with count copies of elements with value value."_ – Richard Critten Feb 13 '18 at 20:58
  • @RichardCritten That isn't saying *why*, it just says it is so in different words. – juanchopanza Feb 13 '18 at 21:00
  • It would if you were to initialize the array `std::array<...> arr{};` – StoryTeller - Unslander Monica Feb 13 '18 at 21:01
  • @juanchopanza Yes it does: the elements are default initialised with `unsigned()`, which is `0`. If you don't want default initialisation, don't give a size, and call `reserve` to get uninitialised space. All (8) of the constructors are consistent in that they value (or copy) initialise the vector. – Richard Critten Feb 13 '18 at 21:03
  • @RichardCritten but... also is a destructive reading, right? if you print, you don't get the last values, you print an empty char – Moises Rojo Feb 13 '18 at 21:27
  • @RichardCritten Not sure if you were trying to answer my comment or someone else's. What I meant is that cppreference states the mandated behaviour as per the standard. It does not state why vector was designed to do this, against one of the design philosophies of C++. Also, `unsigned()` is not default initialization, it is value initialization. – juanchopanza Feb 13 '18 at 22:05
  • @juanchopanza lets go to the time machine, you say that don't use the default could be a better philosophy? that could be a better standard? If that was because the reason that gives Justin below, that says why is a better standard/philosophy, right...? – Moises Rojo Feb 13 '18 at 22:52
  • @MoisesRojo The design philosophy I refer to is "you don't pay for what you don't need". So if you don't need a zero-initialized array (consider vector a dynamic array) then why pay for it? Standard C++ doesn't give you an alternative for `std::vector`, but it does for `std::array`. I think it is a design flaw, but maybe it was too hard to implement non-zeroing of builtin types back in the early days of C++. – juanchopanza Feb 13 '18 at 22:56
  • Perfect answer! I will always remember **you don't pay for what you don't... want/ask for**. But now the question is: why not in a new C++xx standard? Just compatibility? – Moises Rojo Feb 13 '18 at 23:02
  • It is possible to do exactly what you want, but the method is arcane. I posted an answer that shows how. – Jive Dadson Feb 13 '18 at 23:22

3 Answers


The more common way to declare a vector is without specifying the size:

std::vector<unsigned> arr;

This doesn't allocate any space for the vector contents, and doesn't have any initialization overhead. Elements are usually added dynamically with methods like .push_back(). If you want to allocate memory you can use reserve():

arr.reserve(SIZE);

This allocates storage but doesn't create any elements: the reserved slots aren't included in the size() of the vector, and trying to read them is undefined behavior. Compare this with

arr.resize(SIZE);

which grows the vector and initializes all the new elements.
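The difference between reserve() and resize() can be checked directly. The following is a minimal sketch; the helper names `size_after_reserve` and `resize_zero_fills` are made up for illustration and are not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: size() of a vector on which only reserve(n) was called.
std::size_t size_after_reserve(std::size_t n) {
    std::vector<unsigned> v;
    v.reserve(n);               // allocates storage, constructs no elements
    assert(v.capacity() >= n);  // the capacity grew ...
    return v.size();            // ... but size() is still 0
}

// Hypothetical helper: true if resize(n) produced n elements that are all 0.
bool resize_zero_fills(std::size_t n) {
    std::vector<unsigned> v;
    v.resize(n);                // creates n value-initialized elements
    if (v.size() != n) return false;
    for (unsigned x : v)
        if (x != 0) return false;
    return true;
}
```

Calling `size_after_reserve(10)` should give 0, while `resize_zero_fills(10)` should be true: only resize() pays for writing the zeros.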

std::array, on the other hand, always allocates the memory. It implements most of the same behaviors as C-style arrays, except for the automatic decay to a pointer. This includes not initializing the elements by default.
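As one of the comments on the question points out, adding empty braces makes std::array zero its elements too. A small sketch, assuming a made-up constant `kSize` and helper name `braces_zero_fill`:

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kSize = 10;  // hypothetical size, mirrors SIZE in the question

// Hypothetical helper: true if every element of a brace-initialized array is 0.
bool braces_zero_fill() {
    std::array<unsigned, kSize> zeroed{};  // value-initialized: every element is 0
    // std::array<unsigned, kSize> raw;    // default-initialized: indeterminate values,
    //                                     // must be written before being read
    for (unsigned x : zeroed)
        if (x != 0) return false;
    return true;
}
```

So with std::array the zeroing is opt-in at the point of declaration, which is the choice std::vector's sized constructor does not offer.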

Barmar
  • Seriously? I think you will almost always use `size()`, and that means you have to spend the time initializing the vector with zeros no matter what, right? Because yes, `push_back()` gives defined behaviour, but what if I don't want to use `push_back()`? – Moises Rojo Feb 13 '18 at 22:17
  • For example, how can I do something like using std::iota on the range [1, 10] without initializing the vector first, using a std or Boost function? – Moises Rojo Feb 13 '18 at 22:20
  • @MoisesRojo: Like `generate_n(back_inserter(v), 10, []{ static int x = 7; return x++; });` maybe – Zan Lynx Feb 13 '18 at 23:55

The default allocator is doing the zero-initialization: elements constructed without arguments are value-initialized, which for unsigned means zero. You can use a different allocator that does not do that. I wrote an allocator that uses default-initialization rather than value-initialization when no constructor arguments are given. More precisely, it is an allocator wrapper called ctor_allocator. Then I define a vector alias template that uses it.

dj::vector<unsigned> vec(10); does exactly what you want. It's a std::vector<unsigned>(10) whose elements are not initialized to zeros.

--- libdj/vector.h ----
#include <libdj/allocator.h>
#include <vector>

namespace dj {
template<class T>
    using vector = std::vector<T, dj::ctor_allocator<T>>;
}

--- libdj/allocator.h  ----
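#include <memory>
#include <new>          // placement new
#include <type_traits>  // std::is_nothrow_default_constructible
#include <utility>      // std::forward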

namespace dj {

template <typename T, typename A = std::allocator<T>>
    class ctor_allocator : public A 
    {
        using a_t = std::allocator_traits<A>;
    public:
        using A::A; // Inherit constructors from A

        template <typename U> struct rebind 
        {
            using other =
                ctor_allocator
                <  U, typename a_t::template rebind_alloc<U>  >;
        };

        template <typename U>
        void construct(U* ptr)
            noexcept(std::is_nothrow_default_constructible<U>::value) 
        {
            ::new(static_cast<void*>(ptr)) U;
        }

        template <typename U, typename...Args>
        void construct(U* ptr, Args&&... args) 
        {
            a_t::construct(static_cast<A&>(*this),
                ptr, std::forward<Args>(args)...);
        }
    };
}
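To connect this with the std::iota question raised in the comments above, here is a hedged usage sketch. It restates the adapter in compact form so that it compiles by itself; the namespace `sketch` and the helper `one_to_n` are made up for illustration. Every element is written before it is read, because reading a default-initialized unsigned is not allowed.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <numeric>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical namespace for this example

// Compact restatement of the ctor_allocator idea: construct() without
// arguments default-initializes instead of value-initializing.
template <typename T, typename A = std::allocator<T>>
class ctor_allocator : public A {
    using a_t = std::allocator_traits<A>;
public:
    using A::A;  // inherit constructors from A

    template <typename U> struct rebind {
        using other = ctor_allocator<U, typename a_t::template rebind_alloc<U>>;
    };

    template <typename U>
    void construct(U* ptr)
        noexcept(std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(ptr)) U;  // no "()": default-initialization
    }

    template <typename U, typename... Args>
    void construct(U* ptr, Args&&... args) {
        a_t::construct(static_cast<A&>(*this), ptr, std::forward<Args>(args)...);
    }
};

template <class T>
using vector = std::vector<T, ctor_allocator<T>>;

// Hypothetical helper: builds {1, 2, ..., n} without zero-filling first.
inline vector<unsigned> one_to_n(std::size_t n) {
    vector<unsigned> v(n);              // n elements, values indeterminate
    std::iota(v.begin(), v.end(), 1u);  // write every element before any read
    return v;
}

}  // namespace sketch
```

Note that the result has type `sketch::vector<unsigned>`, which is a different type from `std::vector<unsigned>`; that incompatibility is one cost of the allocator approach.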
Jive Dadson
  • Yeah! Actually, I don't know why it doesn't run on my computer, but on another one it runs perfectly. – Moises Rojo Feb 13 '18 at 23:37
  • [link](https://wandbox.org/permlink/Pjkd0FlByO13k8km) It works here, but on my Fedora 27 with GCC 7.3 it doesn't work. – Moises Rojo Feb 13 '18 at 23:46
  • Do you have any suggestion? – Moises Rojo Feb 13 '18 at 23:46
  • Comments are not for solving programming problems. Perhaps my code is using a C++11 feature not supported by GCC 7.3. I do not use GCC or know anything about it. If you can't figure it out, post it as a new question. – Jive Dadson Feb 13 '18 at 23:48
  • How does it not work in GCC? Doesn't compile? Crashes? If it compiles and does not crash, it is almost certainly working. It is quite possible that the vector memory will occasionally already contain zeros "by accident." – Jive Dadson Feb 14 '18 at 00:14
  • @Lightness - It's all legit. – Jive Dadson Feb 14 '18 at 00:31
  • @JiveDadson: Are you sure? Because if it gives completely different results in GCC than your compiler (Visual Studio?) then it seems to me either your program has UB, or it relies on a vendor extension. A compiler bug is not out of the question, but if your code is "all legit" and uses only 7-year old language features then it is unlikely. – Lightness Races in Orbit Feb 14 '18 at 00:37
  • @Lightness - Yes I am sure. Moises has not said what the results are that he's getting on his machine. Syntax error? Crash? He has not said. It works on that wandbox link when you select GCC C++11. – Jive Dadson Feb 14 '18 at 00:40
  • The program compiles and doesn't crash, but it initializes the vector with zeros. And actually, in **wandbox**, if the vector has a size >= 75, dj::vector initializes the vector with zeros. – Moises Rojo Feb 14 '18 at 03:45
  • On my PC the size doesn't matter: the vector is initialized with zeros. – Moises Rojo Feb 14 '18 at 03:45
  • Well, the allocator is not doing (or re-doing) the initialization to zeros. Just because memory contains zeros does not mean they were put there when the vector was constructed. Probably the OS clears process memory at the beginning, and your test code is using memory that has never been used since the process started. The reason the Wandbox version has zeros for 75 and up but not lower is undoubtedly that when you allocate a small amount you are getting memory that has been used before, but not so when you allocate the larger amount. – Jive Dadson Feb 14 '18 at 03:52

Suppose we have some class:

class MyClass {
    int value;

public:
    MyClass() {
        value = 42;
    }
    // other code
};

std::vector<MyClass> arr(10); will default construct 10 copies of MyClass, all with value = 42.

But suppose it didn't default construct the 10 copies. Now if I wrote arr[0].some_function(), there's a problem: MyClass's constructor has not yet run, so the invariants of the class aren't set up. I might have assumed in the implementation of some_function() that value == 42, but since the constructor hasn't run, value has some indeterminate value. This would be a bug.

That's why in C++, there's a concept of object lifetimes. The object doesn't exist before the constructor is called, and it ceases to exist after the destructor is called. std::vector<MyClass> arr(10); calls the default constructor on every element so that all the objects exist.

It's important to note that std::array is somewhat special: it is an aggregate with no user-provided constructors, so std::array<MyClass, 10> arr; default-initializes each element. For MyClass that means the default constructor runs, so this also gives 10 objects, all with value = 42. For non-class types such as unsigned, however, default-initialization does nothing and the values are indeterminate.
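The claim that both containers run the constructor for class types can be checked with a small sketch. It reuses MyClass from above and adds an accessor `get()`, which is an assumption of this example (the original class only says "other code"); the two helper function names are also made up.

```cpp
#include <array>
#include <vector>

class MyClass {
    int value;

public:
    MyClass() {
        value = 42;
    }
    int get() const { return value; }  // hypothetical accessor for the check
};

// Hypothetical helper: true if the sized vector constructor ran MyClass() for all 10.
bool vector_elements_constructed() {
    std::vector<MyClass> arr(10);
    for (const MyClass& m : arr)
        if (m.get() != 42) return false;
    return true;
}

// Hypothetical helper: true if a plain std::array declaration ran MyClass() for all 10.
bool array_elements_constructed() {
    std::array<MyClass, 10> arr;  // no braces needed: the default constructor still runs
    for (const MyClass& m : arr)
        if (m.get() != 42) return false;
    return true;
}
```

Both helpers should return true; the two containers only differ for types like unsigned, where default-initialization leaves the value indeterminate.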


There is a way to avoid calling all the default constructors: std::vector::reserve. If I were to write:

std::vector<MyClass> arr;
arr.reserve(10);

The vector would allocate its backing array to hold 10 MyClass objects, and it wouldn't call the default constructors. But now I can't write arr[0] or arr[5]; those would be out-of-bounds accesses into arr (arr.size() is still 0, even though the backing array has room for more elements). To add the elements, I'd have to call push_back or emplace_back:

arr.push_back(MyClass{});

This is often the right approach. For example, if I wanted to fill arr with random values from std::rand, I can use std::generate_n along with std::back_inserter:

std::vector<unsigned> arr;
arr.reserve(10);
std::generate_n(std::back_inserter(arr), 10, std::rand);
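To see that reserve constructs nothing while each emplace_back constructs exactly one element, one can count constructor calls. This is only a sketch: the `Counted` type, its static counter, and the two helper functions are made up for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type that counts how many times its default constructor runs.
struct Counted {
    static int constructed;
    Counted() { ++constructed; }
};
int Counted::constructed = 0;

// Number of default constructions caused by reserve(n) alone.
int constructions_from_reserve(std::size_t n) {
    Counted::constructed = 0;
    std::vector<Counted> v;
    v.reserve(n);  // storage only, no objects yet
    return Counted::constructed;
}

// Number of default constructions after reserve(n) plus n emplace_back calls.
int constructions_from_emplace(std::size_t n) {
    Counted::constructed = 0;
    std::vector<Counted> v;
    v.reserve(n);  // avoids reallocation while the elements are added
    for (std::size_t i = 0; i < n; ++i)
        v.emplace_back();  // constructs one element in place
    return Counted::constructed;
}
```

For n = 10 the first helper should report 0 constructions and the second 10, so each element is constructed exactly once, at the point where it is added.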

It's also worth noting that if I already have the values I want for arr in a container, I can just pass the begin()/end() in with the constructor:

std::vector<unsigned> arr{values.begin(), values.end()};
Justin
  • I'm surprised this got selected, because it begs the question "why doesn't `std::array` behave like this? Why can't `std::vector` behave like `std::array`?" And that question sounds pretty much like what OP is asking. – juanchopanza Feb 13 '18 at 22:34
  • @juanchopanza That's true, but I'm not sure I want to go into that in this answer. I intended to give the OP some intuition on why `std::vector` behaves in this way. – Justin Feb 13 '18 at 22:36
  • Question: "Why...?" Answer: because vector calls the default constructor. It's implicit (or so I think) that array doesn't call it, and that answers the question. – Moises Rojo Feb 13 '18 at 22:37
  • BTW @Justin Do you know how to initialize a vector n-dimensions with `beginend` (is `begin()/end()` in emacs)? like this [link](https://stackoverflow.com/questions/48013802/how-to-set-a-vector-of-discrete-distribution-c/48740542#48740542) – Moises Rojo Feb 13 '18 at 22:43
  • @juanchopanza Thinking a bit more on it, I edited to include a short explanation for why `std::array` behaves the way it does. The behavior is indeed odd given only the explanation I gave before – Justin Feb 13 '18 at 22:44
  • @MoisesRojo Usually I would recommend using a single flat `std::vector matrix` even for multidimensional things, instead of nesting vectors. Use a wrapper type, or use a helper function such as `index_from_2d(int x, int y, int width, int height) { return y * width + x; }`. Then you'd write `matrix[index_from_2d(x, y, width, height)]` to access elements. This is much more efficient. – Justin Feb 13 '18 at 22:46
  • This answer does not address the question. Why is the behavior different for *primitive* types? Why does `vector x(100)` zero-initialize when `array x` does not? – xyz Sep 26 '20 at 23:20