
After porting some legacy code from Win32 to Win64, and after discussing the best strategy to remove the "possible loss of data" warning (What's the best strategy to get rid of "warning C4267 possible loss of data"?), I'm about to replace many `unsigned int`s by `size_t` in my code.

However, my code is performance-critical (I can't even run it in Debug...too slow).

I did a quick benchmarking:

#include "stdafx.h"

#include <cstdlib> // for std::rand
#include <iostream>
#include <chrono>
#include <string>

template<typename T> void testSpeed()
{
    auto start = std::chrono::steady_clock::now();

    T big = 0;
    for ( T i = 0; i != 100000000; ++i )
        big *= std::rand();

    std::cout << "Elapsed " << std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - start).count() << "ms" << std::endl;
}

int main()
{
    testSpeed<size_t>();
    testSpeed<unsigned int>();

    std::string str;
    std::getline( std::cin, str ); // pause

    return 0;
}

Compiled for x64, it outputs:

Elapsed 2185ms
Elapsed 2157ms

Compiled for x86, it outputs:

Elapsed 2756ms
Elapsed 2748ms

So apparently using `size_t` instead of `unsigned int` has an insignificant performance impact. But is that really always the case? (It's hard to benchmark performance this way.)

Does/may changing `unsigned int` into `size_t` impact CPU performance (a 64-bit object will now be manipulated instead of a 32-bit one)?

jpo38
  • Look at the generated assembly code and try to measure the performance. – Jabberwocky May 03 '16 at 12:42
  • What makes you think that [`std::size_t`](http://en.cppreference.com/w/cpp/types/size_t) is 64-bit? In fact, its underlying type is implementation specific, and not specified by the standard. – Cory Kramer May 03 '16 at 12:43
  • @CoryKramer btw the OP mentions `size_t` and not `std::size_t` – Jabberwocky May 03 '16 at 12:46
  • Does the performance impact really *matter* if the current code is trying to stuff 64-bit values into 32-bit `int` variables? `size_t` exists for a reason. Use it. – Andrew Henle May 03 '16 at 12:46
  • Did you benchmark? What are the results? How did you check? Without the code shown, the answer is "yes, it **may** impact performance". But it is not clear in which direction. – too honest for this site May 03 '16 at 12:48
  • @CoryKramer: OP states he uses Win64, which uses 64bit `size_t` and IL32. So the conclusion is correct. – too honest for this site May 03 '16 at 12:49
  • @AndrewHenle: The current code is Win32 which afaik has 32 bit `size_t`. – too honest for this site May 03 '16 at 12:51
  • @Olaf: Did not try to benchmark. Will do it. – jpo38 May 03 '16 at 12:51
  • Here is a good article -> https://web.archive.org/web/20140828142605/http://www.codeproject.com/Articles/60082/About-size_t-and-ptrdiff_t about size_t . It reasons well why you should use size_t ( or closely related ptrdiff_t). –  May 03 '16 at 13:11
  • `size_t` represents the size of objects. You shouldn't really use this for other meanings. Also, `int_fast32_t` and `uint_fast32_t` are typedefs which will resolve to some int size at least 32-bit which is the fastest for the platform. And you shouldn't mangle your code to avoid bogus warnings. If the warnings are genuine and you really do need a bigger type for your code to work correctly, then you must use the bigger type. If the warnings are bogus then ignore, disable or hide them in some other way besides changing your types. – M.M May 03 '16 at 13:11
  • Please read how to benchmark. I presume this is not your actual code. Even if it were, it is not clear what your problem is. Either you need 64 bits or you don't. Either way, `size_t` is not the appropriate type to summarise values, but for array indexing. Use `unsigned int` or `unsigned long long`. If you need defined widths, use `uintN_t` (not sure where they are in C++, though). – too honest for this site May 03 '16 at 13:17
  • not sure your benchmark reveals something relevant about 64 bits integer... you likely are measuring performance of `srand()` which is of type int32_t, not int64_t. – shrike May 03 '16 at 13:18
  • @Olaf *The current code is Win32 which afaik has 32 bit `size_t`.* True, Win32 does use a 32-bit `size_t`. My point was that if the function returns `size_t`, or an interface has a `size_t` parameter, then use `size_t` and not `unsigned int` - because `size_t` is *not* `unsigned int` - it's `size_t`. Or use `ssize_t` or even `off_t`, as appropriate. – Andrew Henle May 03 '16 at 13:18
  • @M.M *If the warnings are bogus then ignore, disable or hide them in some other way* The problem with doing that comes about if later changes introduce code that *needs* the warnings you've disabled. IMO it's better to just use the larger size(s) as appropriate for the function/methods being used, and not ignore or disable the warnings because you "know" your code is correct. If the people who wrote the compiler you're using to turn your code into a runnable binary spent extra time putting in warnings because they think what you're doing is dodgy, you'd be better off listening to them. – Andrew Henle May 03 '16 at 13:21
  • @shrike: `srand` does not return anything and `rand` does not return `int32_t`, but `int`. – too honest for this site May 03 '16 at 13:23
  • @AndrewHenle OP indicated that the larger size introduces unacceptable performance problems. In some applications performance is important – M.M May 03 '16 at 13:25
  • @AndrewHenle: Agreed about using the correct type. Which would be `int` for `rand`, though which can be too small, but is guaranteed to be converted to `unsigned int` without loss. About the other types: they are no standard types, so their usage should also be restricted to where they are returned & appropriate. `ssize_t` is not the signed variant of `size_t`. – too honest for this site May 03 '16 at 13:26
  • @Olaf: sorry typo error, I meant `std::rand()`; and assuming OP's mentioned x86/x64 and windows, then `int` is `int32_t` afaik (am I wrong ?). – shrike May 03 '16 at 13:36
  • @shrike: You are wrong. It is the other way 'round: `int32_t` is `int` (although on Win, it could be as well `long`, as Win64 is IL32LLP64). `int` is the standard integer type, not `int32_t`. – too honest for this site May 03 '16 at 16:20

1 Answer


Definitely not. On modern (and even older) CPUs, 64-bit integer operations perform as fast as 32-bit operations.

Example on my i7 4600u for the arithmetic operation `a * b / c`:

(int32_t) * (int32_t) / (int32_t) : 1.3 nsec
(int64_t) * (int64_t) / (int64_t) : 1.3 nsec

Both tests compiled for x64 target (same target as yours).

However, if your code manages big objects full of integers (big arrays of integers, for example), using `size_t` instead of `unsigned int` may have an impact on performance if the cache miss count increases (bigger data may exceed cache capacity). The most reliable way to check the impact on performance is to test your app in both cases. Use your own type typedef'ed to either `size_t` or `unsigned int`, then benchmark your application.
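That typedef switch can be as simple as the sketch below (the `index_t` and `USE_NARROW_INDEX` names, and the `countEven` example function, are my own inventions for illustration):

```cpp
#include <cstddef>
#include <vector>

// Build once with -DUSE_NARROW_INDEX and once without, then run the same
// benchmark against both binaries to measure the real impact.
#if defined(USE_NARROW_INDEX)
using index_t = unsigned int;   // 32-bit on both Win32 and Win64
#else
using index_t = std::size_t;    // 64-bit on Win64
#endif

// Example use: all loops and array indexing go through index_t.
template <typename Container>
index_t countEven(const Container& c)
{
    index_t n = 0;
    for (index_t i = 0; i != static_cast<index_t>(c.size()); ++i)
        if (c[i] % 2 == 0)
            ++n;
    return n;
}
```

Keeping every index behind one alias means the 32-bit/64-bit decision is a single recompile rather than a codebase-wide edit.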

shrike
  • 64bit operations slightly increase code-size on x86-64, since more instructions need a REX prefix. This is usually not significant. Silvermont has somewhat worse latency and throughput for `imul r64, r64` than `imul r32, r32`, while Atom has *much* worse performance for 64bit multiply. For simple ops like shift, add/sub, and boolean, there's no difference in the instruction timings. (See [Agner Fog's tables](http://agner.org/optimize)). And you're right that on "normal" CPUs, 64bit integer ops are full speed. – Peter Cordes May 06 '16 at 18:34
  • You're absolutely right that the main speed concern is cache footprint. It's worth looking at commonly-used data structures to see if they're still fine with a `uint32_t`. Don't worry about converting between `size_t` and `uint32_t`, though: zero-extension to 64bits happens for free on x86-64 when writing to the 32bit low half of a register. So it's fine to use `size_t` temporaries, unless the code depends on unsigned wraparound... – Peter Cordes May 06 '16 at 18:36
  • Actual 64-bit division by a value that isn't a compile-time constant is much slower with 64-bit than 32-bit integers on Intel CPUs including your Haswell. [Trial-division code runs 2x faster as 32-bit on Windows than 64-bit on Linux](//stackoverflow.com/a/52558274) and [Can 128bit/64bit hardware unsigned division be faster in some cases than 64bit/32bit division on x86-64 Intel/AMD CPUs?](//stackoverflow.com/q/56655383). Your test results are only plausible for throughput if the divisor was a constant (which is a pretty normal case). (Also signed 64-bit idiv is slower than unsigned div) – Peter Cordes Jul 23 '20 at 08:55