12

I ported some legacy code from win32 to win64. Not because the win32 object size was too small for our needs, but just because win64 is more standard now and we wish to port all our environments to this format (and we also use some 3rd-party libs offering better performance in 64 bits than in 32 bits).

We ended up with tons of:

warning C4267: 'argument': conversion from 'size_t' to '...', possible loss of data

Mainly due to code like `unsigned int size = v.size();` where `v` is an STL container.

I know why the warning makes sense, why it is issued, and how it could be fixed. However, in this specific example, we have never experienced a case where the container size exceeded `unsigned int`'s max value... so there is no reason for this problem to appear once the code is ported to a 64-bit environment.

We had discussions about the best strategy to suppress those noisy warnings (they may hide a relevant one that we would then miss), but we could not make a decision on the appropriate strategy.

So I'm asking the question here: what would be the recommended strategy?

1. Use a static_cast

Do `unsigned int size = static_cast<unsigned int>(v.size());`. I don't "like" that because we lose the 64-bit capability to store a huge amount of data in a container. But as our code never reached the 32-bit limit, this appears to be a safe solution...

2. Replace unsigned int by size_t

That's definitely harder, as the `unsigned int size` object in the example above could be passed to other functions, saved as a class attribute, etc., so removing a one-line warning could end up requiring hundreds of code changes...
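
To illustrate the ripple effect with a made-up example (the `Report` class and `process()` function are hypothetical): fixing the local variable forces the callee and the class member to change as well.

#include <cstddef>
#include <vector>

class Report
{
public:
    void setCount( std::size_t count ) { m_count = count; } // was: unsigned int
private:
    std::size_t m_count = 0;                                // was: unsigned int
};

void process( const std::vector<double>& v, Report& report )
{
    std::size_t size = v.size(); // was: unsigned int -> warning C4267
    report.setCount( size );     // every consumer of 'size' has to follow
}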

3. Disable the warning

That's most likely a very bad idea, as it would also suppress the warning in cases like `uint8_t size = v.size()`, which is very likely to cause loss of data...
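
For reference, with MSVC that would mean either /wd4267 on the command line or a #pragma; a targeted sketch, limited to a statement that has been reviewed, would look like:

#include <vector>

void example( const std::vector<int>& v )
{
    // MSVC-specific: silence C4267 only around a reviewed statement.
#pragma warning(push)
#pragma warning(disable: 4267)
    unsigned int size = v.size(); // reviewed: v never holds more than UINT_MAX elements
#pragma warning(pop)
    (void)size;
}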

4. Define a "safe cast" function and use it

Something like:

#include <cassert>

// Destination type comes first so the source type is deduced from the argument,
// e.g. safe_cast<unsigned int>( v.size() ).
template <typename To, typename From> To safe_cast( const From& value )
{
    //assert( value < std::numeric_limits<To>::max() && value > std::numeric_limits<To>::min() );
    // Edit 19/05: the test above fails for some signed-to-unsigned casts (e.g. int64_t to uint32_t), the round-trip test below is better:
    assert( value == static_cast<From>( static_cast<To>( value ) ) ); // verify we don't lose information!
    // or throw....
    return static_cast<To>( value );
}
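
Used, with `v` being the container from the example above, as:

unsigned int size = safe_cast<unsigned int>( v.size() ); // asserts (or throws) instead of silently truncating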

5. Other solutions are welcome...

"Use solution 1 in this case but 2 in this case" could perfectly be a good answer.

jpo38
  • Using `size_t` would have been the correct solution right from the beginning. I would also go for it, and while you are porting anyway, finally use `size_t` correctly. – Jan Henke Apr 25 '16 at 07:57
  • @JanHenke: True. That's actually the advantage of solution 2... the problem will be fixed for good... even when we move to win128 ;-) But it is such a pain to update all the code... – jpo38 Apr 25 '16 at 07:58
  • @JanHenke: No, that's too simple to say. I have the same warning all the time, e.g. from doing an FFT on audio data. Trust me, I will never have audio frames lasting 27 hours, with 10 uHz frequency resolution. 0.7 Hz resolution (44 kHz/65536) is already close to perfect. So, should I give up storing my FFT results in `std::vector`? No, even for collections that are strictly smaller than 64K elements, vector is still the best choice. – MSalters Apr 25 '16 at 08:10
  • @MSalters strictly speaking `std::vector` will return a value of type `std::vector::size_type`, which is generally a synonym for `std::size_t`. It is formally correct that `size_t` is the right type to use; it is defined exactly for this use case. It is also used like that consistently throughout the standard library. It is not too simple to say. Also, your description is much too vague to really understand what the problem in your case might be. – Jan Henke Apr 25 '16 at 08:18
  • 2
    Bite the bullet and change the code to use `size_t`. Anything else will be a bit of a code smell. – Sean Apr 25 '16 at 08:22
  • @JanHenke: The problem is simple: my problem domain has relatively small amounts of data, small enough that any integer type will be sufficient. These lengths appear all over the place, not just in relation to `std::vector`, so when I **do** get a length from a `std::vector` I will get this warning. And no, just changing everything to `size_t` isn't magically going to solve all warnings. For instance, a 32-bit int can be safely cast to a 64-bit double without loss of precision. – MSalters Apr 25 '16 at 08:24
  • @MSalters still, use `std::size_t` whenever you store the index or size of something. Then you will also not get this warning. What you are describing is that you are using another int type to store sizes and then get the warning, as the conversion discards the higher bits of the result. While you might know that your vector never gets so big that you need a 64-bit type, it is still formally defined that way. If you want to cut corners to save memory, you can, but you need to be consistent. This still does not make my comment less correct. Using `std::size_t` is always *the* answer. – Jan Henke Apr 25 '16 at 08:36
  • @Niall: Fixed the `safe_cast` second comparison. `throw` or `assert` depends on how serious you want the notification to be. – jpo38 Apr 25 '16 at 08:37
  • @jpo38. Sure, or both (so debug builds allow for a break at that point). Either way, it was just a note about the comparisons required. I actually like that "cast" with the checks. – Niall Apr 25 '16 at 08:42
  • In light of your other thread about performance problems with `size_t`, perhaps you'd be better off using `uint32_t` along with something like option 4. To debug bogus implicit assignments you could wrap the int in a class without implicit conversions (and use a debug mode typedef or something to undo the changes when you've assured yourself there are none) – M.M May 03 '16 at 13:19

2 Answers

11

Use the correct type (option 2) - the function/interface defines that type for you, so use it.

std::size_t size = v.size(); // given vector<>::size_type is size_t
// or a more verbose
decltype(v)::size_type size = v.size();

It goes to the intent... you are getting the size of v and that size has a type. If the correct type had been used from the beginning, this would not have been a problem.

If you require that value later as another type, transform it at that point; the `safe_cast<>` is then a good alternative that includes the runtime bounds checking.
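
For example (a sketch assuming the question's `safe_cast<>` is in scope; `legacy_api()` is a made-up stand-in for a function that still takes a 32-bit count):

#include <cstddef>
#include <vector>

void legacy_api( unsigned int count ); // hypothetical, cannot be changed

void consume( const std::vector<double>& v )
{
    std::size_t size = v.size();                   // keep the interface's type while working with it
    legacy_api( safe_cast<unsigned int>( size ) ); // narrow once, checked, at the boundary that needs it
}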

Option 6. Use auto

When you write `size = v.size()` and are not concerned with what the type actually is, only that it is the correct type,

auto size = v.size();

And let the compiler do the hard work for you.

Niall
  • Definitely `auto` is the way to go; why be a masochist and use other ways? – David Haim Apr 25 '16 at 08:38
  • @DavidHaim. Yep, I would go with `auto`. Option 2 would be the best of those he is already considering, but `auto` is easier (IMO). – Niall Apr 25 '16 at 08:39
  • Thanks for this 6th option. But some part of our code is still compiled in 32 bits with old compilers not supporting C++11 (we link with 3rd-party .lib files only available in 32 bits)... we'll need to work that out one day too... ;-) – jpo38 Apr 25 '16 at 08:41
  • @jpo38 - the bane of compatibility. Failing `auto`, I would stick to `std::size_t` - I know that technically the `.size()` calls may not return a `size_t`, but you can always assert/check for that in the compilation (and in general they do anyway). – Niall Apr 25 '16 at 08:45
  • @Niall: Definitely, I think we will roll up our sleeves and change all `unsigned int` to `size_t`... – jpo38 Apr 25 '16 at 08:55
0

IFF you have time pressure to get the code warning-free, I would initially disable the warning -- your code used to work with this, and it is, IMHO, extremely unlikely that in the cases where you assign to a 32-bit size you'll exceed it. (4G values in a collection - I doubt that'll fly in normal applications.)

That being said, for cases other than collections, the warning certainly has merit, so try to get it enabled sooner or later.
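
A temporary, clearly-marked global suppression could be as simple as adding /wd4267 to the compiler options, or a shared header along these lines (a sketch; the header name is made up):

// suppress_c4267_temporarily.h -- include from a common/precompiled header.
// TODO: remove once the size_t / auto clean-up described below is done.
#pragma once
#ifdef _MSC_VER
#pragma warning(disable: 4267) // 'argument': conversion from 'size_t' to '...', possible loss of data
#endif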

Second, when re-enabling it and fixing the code, my order of precedence would be (a sketch follows the list):

  • Use `auto` (or `size_t` pre-C++11) for locals where the value is not further narrowed.
  • Where narrowing is necessary, use your `safe_cast` if you can justify the overhead of introducing it to your team (learning, enforcing, etc.).
  • Otherwise just use `static_cast`:

    I do not think this is a problem with collections. Unless you know otherwise, your collection will never have more than 4G items. IMHO, it just doesn't make sense to have that much data in a collection for any normal real-world use case. (That's not to say there may not be the oddball case where you'll need such large data sets; it's just that you'll know when that is the case.)

    For cases where what you narrow is not the count of a collection but some other numeric value, the narrowing is probably problematic anyway, so there you'll fix the code appropriately.
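
A rough sketch of that order of precedence (`legacy_set_count()` is a made-up stand-in for an API that still takes 32 bits; `safe_cast` is the one from the question):

#include <cassert>
#include <cstdint>
#include <vector>

// safe_cast as proposed in the question (destination type first).
template <typename To, typename From> To safe_cast( const From& value )
{
    assert( value == static_cast<From>( static_cast<To>( value ) ) );
    return static_cast<To>( value );
}

// Hypothetical legacy interface that still takes a 32-bit count.
void legacy_set_count( std::uint32_t /*count*/ ) {}

void example( const std::vector<float>& samples )
{
    auto count = samples.size();                           // 1. locals: keep the interface's type

    legacy_set_count( safe_cast<std::uint32_t>( count ) ); // 2. checked narrowing where a 32-bit API remains

    std::uint32_t rough = static_cast<std::uint32_t>( count ); // 3. plain static_cast where the bound is known by design
    (void)rough;
}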

Martin Ba