I am trying to write int16_t data to a PLY file (a text file with a special header). I have a piece of code that does this with a time-consuming loop, and I am trying to speed it up. I want to avoid the time spent iterating through the array by putting my data directly into ofs.write. Can I do this?

This is what I've tried

The functional, but slow, code is as follows:

int width = point_cloud_image.get_width_pixels();
int height = point_cloud_image.get_height_pixels();
int i_max = width * height;
// get data
int16_t* point_cloud_image_data = (int16_t*)(void*)point_cloud_image.get_buffer(); 

std::stringstream ss;
for (int i = 0; i < i_max; i++) // executes 921600 times - this is the bottleneck
{  
    ss << point_cloud_image_data[3 * i + 0] << " " << point_cloud_image_data[3 * i + 1] << " " << point_cloud_image_data[3 * i + 2] << "\n";
}

// save to the ply file
std::ofstream ofs("myfile.ply", std::ios::out | std::fstream::binary); // text mode first
ofs << "ply\n" << "format ascii 1.0\n" << "element vertex" << " " << i_max << "\n" << "property float x\n" << "property float y\n" << "property float z\n" << "end_header\n" << std::endl; 
ofs.write(ss.str().c_str(), (std::streamsize)ss.str().length());
ofs.close();

I want to avoid the time spent iterating through the array by putting my point_cloud_image_data pointer directly into ofs.write. My code to do that looks like this:

int width = point_cloud_image.get_width_pixels();
int height = point_cloud_image.get_height_pixels();
int i_max = width * height;
// get data
int16_t* point_cloud_image_data = (int16_t*)(void*)point_cloud_image.get_buffer();

// save to the ply file
std::ofstream ofs("myfile.ply", std::ios::out | std::fstream::binary); // text mode first
ofs << "ply\n" << "format ascii 1.0\n" << "element vertex" << " " << i_max << "\n" << "property float x\n" << "property float y\n" << "property float z\n" << "end_header\n" << std::endl; 
ofs.write((char*)(char16_t*)point_cloud_image_data, i_max);
ofs.close();

This is a lot faster, but now point_cloud_image_data is written in binary (the file contains characters like this: ¥ûú). How can I write the array to the text file without a time consuming loop?

Breadman10
  • you could try using a stream buffer to write the data to the file in chunks, rather than iterating through the entire array and writing each element separately. – Kozydot Dec 29 '22 at 20:31
  • "*How can I write the array to the text file without a time consuming loop?*" You want to write an array of integers formatted as text to a file. That's going to involve a loop. And since that loop is going to have to do integer-to-text conversion, it won't be especially *fast*. I mean, there are [faster integer conversion tools than ostream](https://en.cppreference.com/w/cpp/utility/to_chars), but you're going to have to have a loop. – Nicol Bolas Dec 29 '22 at 20:32
  • Why are you writing to `std::stringstream` first and then to a file? Try writing directly to a file (but give it some [buffer](https://stackoverflow.com/a/39387131/485343) first). And make sure to enable compiler optimizations. – rustyx Dec 29 '22 at 20:33
  • Also, this line is terribly slow: `ofs.write(ss.str().c_str(), (std::streamsize)ss.str().length());` – Nicol Bolas Dec 29 '22 at 20:33
  • According to [its Wikipedia entry](https://en.wikipedia.org/wiki/PLY_(file_format)), the PLY file format can be binary, making everything much simpler. – Costantino Grana Dec 29 '22 at 20:36
  • @rustyx I just tried writing directly to the file, and even with setting the buffer and enabling optimization in visual studio, it took longer than the code above (Which doesn't really make sense to me, but that's what happened). – Breadman10 Dec 29 '22 at 21:03
  • @NicolBolas I'll give that integer conversion a shot. I thought ofstream.write avoided looping... is there really no way I could convert the array of int16_t to an array of char without looping? – Breadman10 Dec 29 '22 at 21:14
  • @CostantinoGrana oooh, great catch! Let me see if I can read the ones that I'm generating now once I change the header... that would be fantastic! – Breadman10 Dec 29 '22 at 21:20
  • Writing data in binary should be far faster than doing int-to-string conversion except in one case: if the integers are small, then the text-based file might be smaller and thus possibly written faster on the storage device (not sure due to the separator here and small integers). That being said, modern SSDs are so fast that text-based operations result in a lower throughput than what most SSDs can do (especially NVMe ones, which are *really* fast). If you have an HDD, then you cannot make this loop significantly faster: it is IO bound in that case. What kind of storage device do you use? – Jérôme Richard Dec 29 '22 at 21:43
  • @Breadman10: "*is there really no way I could convert the array of int16_t to an array of char without looping?*" *Someone* is going to have to loop. You can't do an operation on an array of something without having a loop of some kind. Even copying it out as binary required a loop; you just called a function that itself did the loop. Stop focusing on the "loop" and instead focus on how much work gets done in each iteration of the loop. The binary version was faster because it was just copying data in a loop. – Nicol Bolas Dec 29 '22 at 22:07

1 Answer

Integers are stored in the computer in binary representation. To write an array of integer values to a text file, each one needs to be converted to a series of decimal digits. So you're going to need a loop. Even with buffering and compiler optimizations enabled, conversion of binary to text and back will inevitably be slower than directly working with binary data.
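If the file has to stay in ASCII, the per-value conversion cost can still be reduced. The following is a minimal sketch, assuming C++17 and `std::to_chars` from `<charconv>`; the helper name `format_points_ascii` and its parameters are hypothetical and not part of the original code. It formats every value into one preallocated buffer, so the loop remains but each iteration does less work than `operator<<` on a stream.

```cpp
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the original post): formats n_points
// (x, y, z) triples of int16_t as "x y z\n" lines using std::to_chars.
std::string format_points_ascii(const std::int16_t* data, std::size_t n_points)
{
    // Worst case per value: "-32768" is 6 characters, plus 1 separator.
    constexpr std::size_t max_chars_per_value = 7;
    std::string out(n_points * 3 * max_chars_per_value, '\0');

    char* p = out.data();
    char* const end = out.data() + out.size();
    for (std::size_t i = 0; i < n_points * 3; ++i) {
        // The buffer is sized for the worst case, so the error code
        // returned by to_chars is not checked in this sketch.
        p = std::to_chars(p, end, data[i]).ptr;
        // Space between x, y and z; newline after each z.
        *p++ = (i % 3 == 2) ? '\n' : ' ';
    }

    // Trim the unused tail of the preallocated buffer.
    out.resize(static_cast<std::size_t>(p - out.data()));
    return out;
}
```

The result could then be written in a single call, for example `ofs.write(text.data(), (std::streamsize)text.size())` with `text` holding the returned string, which also avoids calling `ss.str()` twice and copying the whole buffer each time.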

But if all you care about is raw performance, the PLY format can actually be binary. So your second attempt might actually work and produce a working (albeit non-human-readable) PLY file.

int width = point_cloud_image.get_width_pixels();
int height = point_cloud_image.get_height_pixels();
int i_max = width * height;

std::ofstream ofs("myfile.ply", std::ios::out | std::fstream::binary);
ofs << "ply\n"
    << (is_little_endian
        ? "format binary_little_endian 1.0\n"
        : "format binary_big_endian 1.0\n")
    << "element vertex " << i_max << "\n"
    << "property short x\n"
    << "property short y\n"
    << "property short z\n"
    << "end_header\n";
// each vertex is 3 values (x, y, z) of 2 bytes (int16_t) each
ofs.write((const char*)point_cloud_image.get_buffer(), i_max * 2 * 3);
ofs.close();

The is_little_endian check is optional and can be omitted, but it makes the code a little bit more portable.

int num = 1;
bool is_little_endian = *(char *)&num == 1;
rustyx