11

I need to convert floats to strings, and to_string is too slow for me because my data may contain a few million floats.

I already have a solution for writing the data out quickly.

However, after solving that problem, I realized that the float-to-string conversion itself has a big impact on performance.

Are there any ideas or solutions for this that don't rely on a non-standard library?

vitaut
vincent911001
  • How many numbers, what format do you want, and how long does your current implementation take on which processor? – gnasher729 Apr 21 '15 at 08:59
  • @gnasher729, Hi, I have about 2 million vertices, each with xyz coordinates stored as floats. Since I want to output that information to an obj file, I need to convert it into strings (e.g. "V" + pos.x + " "...). The current implementation using to_string takes about 3 minutes on a low-res model with roughly 300,000 vertices. FYI, I am running on an old 1st-gen i5 – vincent911001 Apr 21 '15 at 09:05
  • Which compiler are you using? You might see some performance improvement with gcc5, which uses small string optimization instead of copy on write. Clang uses SSO as well, if I'm not mistaken. – dau_sama Apr 21 '15 at 11:40
  • @dau_sama, I am using the MSVC compiler from Visual Studio 2013 Community, thanks – vincent911001 Apr 22 '15 at 00:30

2 Answers

19

Some of the fastest algorithms for converting floating-point numbers into a decimal string representation are Dragonbox, Schubfach, Grisu-Exact (a variation of Grisu, not to be confused with Grisu2 and Grisu3), and Ryū.

At the time of writing, Dragonbox is the fastest of these methods, followed by Schubfach, then Grisu-Exact, and then Ryū.

An implementation of Dragonbox is available here. It is also included in the {fmt} library, where it is integrated into a high-level formatting API. For maximum performance, you can use format_to with a stack-allocated buffer, for example:

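#include <fmt/format.h>
#include <iterator>
fmt::memory_buffer buf;
// Recent {fmt} releases deprecate passing the buffer directly; an output iterator works across versions.
fmt::format_to(std::back_inserter(buf), "{}", 4.2);
// buf.data() returns a pointer to the formatted data and buf.size() gives its size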
vitaut
2

An optimization that comes to mind is to avoid calling to_string directly, since it creates a new string every time you call it. You probably end up copying that string too, which is not very efficient.

What you could do is allocate a char buffer big enough to store all the string representations that you need, then write into it with sprintf, which takes the same format specifiers as printf:

http://www.cplusplus.com/reference/cstdio/printf/

This way you reuse the same buffer for every value. If you limit the precision of your floats to a fixed number of decimals, and the integer part has a bounded number of digits, every representation fits in a fixed-width slot, so you can compute the offset at which a given float is stored in the array.

For example, if we only had an array of values and 6-character representations:

const size_t stride = 7;  // 6 characters plus the null terminator
size_t index = 1;
float f = value[index];
// corresponding 6-character representation of f
const char* s = &char_array[index * stride];
// the representation starts at position 7 and is null-terminated, so you can use it as a string

For clarification, your char_array will look like:

1.2000\02.4324\0...
dau_sama
  • Hi, sorry for my poor understanding. From what I have understood, char_array[index*1] refers to one big allocated buffer, and the values of all the floats have to go into this buffer. Am I correct? – vincent911001 Apr 22 '15 at 00:49
  • Hi, dau_sama, another thing: how do we know what size of buffer to allocate? – vincent911001 Apr 22 '15 at 01:18
  • Yes indeed, you store everything in the same buffer. If you know how many values you have, you know the size of the buffer to be allocated. If you don't know it beforehand, it can be a bit of a problem, and you'd need to grow it when you reach the limit. It just gets a bit more complicated – dau_sama Apr 22 '15 at 07:56
  • Hi, thanks for the clarification. I have managed to do it with sprintf and a char* buffer. However, I still don't know what size of buffer to allocate. If the limit of the buffer is reached, how can we resize it? Thanks – vincent911001 Apr 22 '15 at 08:05
  • You know your current position in the buffer, so you know whether the next value is going to overflow it. What you need to do is allocate (malloc) a bigger buffer and copy the old values to the new one. Once that's done, you can free the previous one :-) Check also realloc. – dau_sama Apr 22 '15 at 08:44