3

I have an existing 1D array. Is memset the fastest way to zero it?

Shibli
  • Related: http://stackoverflow.com/questions/8528590/what-is-the-advantage-of-using-memset-in-c – Mysticial Jan 03 '12 at 19:36
  • 1
    Yes, it's the fastest by far. All of the mem... guys are fast because they understand how to set destination words as well as destination bytes. That is, when, for example, four zero bytes are to be written into a memory word, memset clears the memory location all in one go. As a bonus, memset, memmove and memcpy are portable. – Pete Wilson Jan 03 '12 at 19:37
  • `memset` will set all the bits to 0 but that may not always be what you want to happen. Do you care about portability? What's in the array? – David Heffernan Jan 03 '12 at 19:38
  • Possible duplicate of: http://stackoverflow.com/questions/1373369/which-is-faster-preferred-memset-or-for-loop-to-zero-out-an-array-of-doubles – Adrian Jan 03 '12 at 19:50
  • Only if you can't be sure that it's already zeroed. If you can be sure, `memset` is not fastest. :) – Drew Dormann Jan 03 '12 at 19:52
  • A one-dimensional array of what? **Type is important**. – Martin York Jan 03 '12 at 19:57
  • @Heffernan, portability is not important. Array is a float array. – Shibli Jan 03 '12 at 20:01
  • What are you talking about? A C-style array (float[...])? Or a C++ vector? It makes a big difference. – RED SOFT ADAIR Jan 03 '12 at 20:02

2 Answers

4

Fastest? Probably yes. Buggy? Almost certainly!

It mostly depends on the implementation, platform and ... what type the array contains.

In C++, when a variable of class type is defined, its constructor is called. When an array is defined, the constructor of each of its elements is called.

Wiping out the memory can be considered "good" only when the element type is known to have an initial state that is represented by all-zero bits and a default constructor that performs no action.

This is generally true for built-in types, but it is often false for other types.
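As a rough sketch of that condition, the check can be moved to compile time. The helper name `zero_with_memset` is hypothetical, and the second assertion encodes an assumption: that the platform uses IEEE 754 floating point, where all-zero bits represent 0.0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <limits>
#include <type_traits>

// Hypothetical helper (not from the answer): zero n floating-point values
// with memset, but only compile when the all-zero-bits assumption holds.
template <class T>
void zero_with_memset(T* v, std::size_t n)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memset would bypass constructors and assignment");
    static_assert(std::is_floating_point<T>::value &&
                  std::numeric_limits<T>::is_iec559,
                  "all-zero bits are only assumed to mean 0.0 for IEEE 754 types");
    std::memset(v, 0, n * sizeof(T));
}

// Usage: zero a small float array and check the result.
inline void zero_with_memset_example()
{
    float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    zero_with_memset(a, 4);
    for (float x : a) assert(x == 0.0f);
}
```

Instantiating the helper with a class type such as `std::string` fails to compile instead of silently corrupting the objects.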

The safest way is to assign each element a value-initialized temporary.

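#include <cstddef>  // std::size_t

// Assigns a value-initialized T to each of the first N elements of v.
// N cannot be deduced from a pointer, so pass it explicitly: reset<float, 100>(a);
template<class T, std::size_t N>
void reset(T* v)
{
    for(std::size_t i=0; i<N; ++i)
        v[i] = T();  // zero for built-in types, default-constructed for classes
}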

Note that if T is char, an optimizing compiler will typically translate this loop into the same code as memset, so it should be no slower.
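A possible variation, offered only as a sketch: taking the array by reference lets the compiler deduce both T and N, so the size does not have to be spelled out at the call site. The name `reset_array` is hypothetical and is not part of the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical variant of reset: T and N are deduced from the array itself.
template <class T, std::size_t N>
void reset_array(T (&v)[N])
{
    for (std::size_t i = 0; i < N; ++i)
        v[i] = T();  // value-initialized temporary, as in the original loop
}

// Usage: works for built-in types and for class types alike.
inline void reset_array_example()
{
    float f[3] = {1.0f, 2.0f, 3.0f};
    std::string s[2] = {"a", "b"};
    reset_array(f);
    reset_array(s);
    assert(f[0] == 0.0f && f[1] == 0.0f && f[2] == 0.0f);
    assert(s[0].empty() && s[1].empty());
}
```

This only works for true arrays; a pointer obtained from `new[]` or `malloc` carries no size, so the explicit `<T, N>` form is still needed there.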

Drew Noakes
Emilio Garavaglia
  • Platform: Windows, Type: float, Size: 1e6. Also, run-time speed is the first priority, even more important than safety. – Shibli Jan 03 '12 at 20:08
  • 1
    @Shibli: the template above may be even better: memset sets bytes, while my function sets floats, which have the same size as a processor word. If the compiler optimizes well (placing i in a register and keeping T() as a constant outside the loop), it may be even faster than a non-specialized memset! But it mostly depends on the compiler, not the library. – Emilio Garavaglia Jan 03 '12 at 20:18
  • This can be done with the Standard Library by using `std::fill_n`. – Blastfurnace Jan 03 '12 at 21:17
  • @Blastfurnace: there is a small difference: `std::fill_n` takes the size as a runtime parameter, while here it is a compile-time constant. The compiler can optimize more in this case, for example by unrolling or parallelizing the loop. – Emilio Garavaglia Jun 16 '13 at 06:40
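The `std::fill_n` suggestion from the comments above might look like the following sketch. The wrapper name `zero_with_fill_n` is hypothetical, and the size is a runtime value, which is the difference the last comment points out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: zero n floats with the standard algorithm.
inline void zero_with_fill_n(float* v, std::size_t n)
{
    std::fill_n(v, n, 0.0f);
}

// Usage: a million floats, the size mentioned by the asker.
inline void zero_with_fill_n_example()
{
    std::vector<float> a(1000000, 1.0f);
    zero_with_fill_n(a.data(), a.size());
    assert(a.front() == 0.0f && a.back() == 0.0f);
}
```

Whether this ends up as fast as memset depends on the compiler and standard library in use.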
3

This is impossible to know because it's implementation specific. Generally though, memset will be the fastest because the library implementers have spent a lot of time optimising it to be very fast, and sometimes the compiler can do optimisations on it that can't be done on hand-rolled implementations because it knows the meaning of memset.
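Since the answer depends on the implementation, measuring on the target machine is the only way to settle it. Below is a minimal timing sketch, not a rigorous benchmark: the names `time_us`, `ZeroTimings` and `compare_zeroing` are hypothetical, each variant runs only once, and an optimizing compiler may turn the hand-written loop into a memset call anyway.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: run `work` once and return the elapsed microseconds.
template <class F>
long long time_us(F&& work)
{
    const auto t0 = std::chrono::steady_clock::now();
    work();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}

struct ZeroTimings
{
    long long memset_us;  // time taken by std::memset
    long long loop_us;    // time taken by the plain loop
};

// Zero the same buffer with memset and with a loop, timing each once.
inline ZeroTimings compare_zeroing(std::vector<float>& buf)
{
    ZeroTimings t{};

    std::fill(buf.begin(), buf.end(), 1.0f);  // dirty the buffer first
    t.memset_us = time_us([&] {
        std::memset(buf.data(), 0, buf.size() * sizeof(float));
    });

    std::fill(buf.begin(), buf.end(), 1.0f);  // dirty it again
    t.loop_us = time_us([&] {
        for (std::size_t i = 0; i < buf.size(); ++i)
            buf[i] = 0.0f;
    });

    return t;
}
```

For numbers worth trusting, repeat each variant many times, build with the same optimization flags as the real program, and use the real array size.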

Seth Carnegie
  • 1
    Not to mention intrinsics if supported. – Captain Obvlious Jan 03 '12 at 19:38
  • For very large arrays `calloc` may be faster, taking advantage of storage management features of the target machine. But no way to know without digging into the messy internals. – Hot Licks Jan 03 '12 at 19:42
  • @HotLicks he did say "an existing array" – Seth Carnegie Jan 03 '12 at 19:44
  • Yeah, but in C/C++, the definition of "existing array" is fairly broad. – Hot Licks Jan 03 '12 at 19:46
  • @HotLicks doesn't `calloc` allocate an array/block of memory? – Seth Carnegie Jan 03 '12 at 19:49
  • @SethCarnegie: "the library implementers have spent a lot of time optimising it to be very fast". Please ... don't feed a false myth: memset is nothing more than a loop in machine code. That can be just one instruction if the machine has it. It's not "optimization obtained by programmers with a lot of work". It's just what the processor has done since the time of Mr Federico Faggin. – Emilio Garavaglia Jan 03 '12 at 19:58
  • @Hot Licks, size of array: 1e6 (a million) float. – Shibli Jan 03 '12 at 20:04
  • @Shibli -- For a large array using calloc may very well be faster than memset. To use memset the storage involved must usually be physically cleared, but with calloc the system can (sometimes) allocate the pages and simply mark them to be cleared on first reference, saving all that paging. (Of course, a reasonably smart system may figure out how to harness the same storage management logic for memset; hence you really need to know some low-level implementation details.) (1MB is probably at about the boundary between the various options.) – Hot Licks Jan 03 '12 at 20:30
  • 1
    @EmilioGaravaglia -- A competent developer of memset will be cognizant of the cache line size of the machine, and may be able to take advantage of "clear cache line" operations, etc. Generally the front end would clear in a "conventional" fashion up to a cache (or even page) boundary, then there would be a loop using the hardware-specific clear functions, until one gets to the trailing cache/page boundary, at which point "conventional" clearing resumes. – Hot Licks Jan 03 '12 at 20:36
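The calloc idea from the comments above only applies when the array is being allocated, not when it already exists. A minimal sketch, with hypothetical names `FreeDeleter`, `ZeroedFloats` and `make_zeroed_floats`, and assuming all-zero bits represent 0.0f on the platform (true for IEEE 754):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical deleter: memory from calloc must be released with free.
struct FreeDeleter
{
    void operator()(void* p) const noexcept { std::free(p); }
};

using ZeroedFloats = std::unique_ptr<float[], FreeDeleter>;

// Allocate n floats that are already zeroed; returns null on failure.
inline ZeroedFloats make_zeroed_floats(std::size_t n)
{
    return ZeroedFloats(static_cast<float*>(std::calloc(n, sizeof(float))));
}

// Usage: no separate memset is needed after the allocation.
inline void make_zeroed_floats_example()
{
    ZeroedFloats a = make_zeroed_floats(1000000);
    assert(a);
    assert(a[0] == 0.0f && a[999999] == 0.0f);
}
```

Whether the system hands back lazily zeroed pages, as the comment suggests, is an implementation detail that this code cannot guarantee.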