What is the most efficient strategy for parsing a .STL file?

A critical part of my code imports a .STL file (a common CAD file format), and this step limits overall performance.

The .STL file format is summarized here: https://en.wikipedia.org/wiki/STL_(file_format)

Using ASCII format is required for this application.

The generic format is:

solid name
    facet normal ni nj nk
        outer loop
            vertex v1x v1y v1z
            vertex v2x v2y v2z
            vertex v3x v3y v3z
        endloop
    endfacet
endsolid

However, I've noticed that there are no strict formatting requirements, and the import function must do a minimal amount of error checking. I've measured performance (using chrono); for a 43,000-line file the timings are:

import_stl() - 1.177568 s

parsing loop - 3.894250 s

Parsing loop:

    cout << "Importing " << stl_path << "... ";
    auto file_vec = import_stl(stl_path);
    for (auto& l : file_vec) {
        trim(l);
        if (solid_state) {
            if (facet_state) {
                if (starts_with(l, "vertex")) {

                    //---------ADD FACE----------//

                    l.erase(0, 6);
                    trim(l);

                    vector<string> strs;
                    split(strs, l, is_any_of(" "));

                    point p = { stod(strs[0]), stod(strs[1]), stod(strs[2]) };
                    facet_points.push_back(p);

                    //---------------------------//
                }
                else {
                    if (starts_with(l, "endfacet")) {
                        facet_state = false;
                    }
                }
            }
            else {
                if (starts_with(l, "facet")) {
                    facet_state = true;
                    //assert(facet_points.size() == 0);

                    //---------------------------//
                    //   Normals can be ignored  //
                    //---------------------------//

                }
                if (starts_with(l, "endsolid")) {
                    solid_state = false;
                }
            }
        }
        else {
            if (starts_with(l, "solid")) {
                solid_state = true;
            }
        }

        if (facet_points.size() == 3) {
            triangle facet(facet_points[0], facet_points[1], facet_points[2]);
            stl_solid.add_facet(facet);
            facet_points.clear();

            //check normal
            facet.normal();
        }
    }

The import_stl function is:

std::vector<std::string> import_stl(const std::string& file_path)
{
    std::ifstream infile(file_path);
    SkipBOM(infile);
    std::vector<std::string> file_vec;
    std::string line;
    while (std::getline(infile, line))
    {
        file_vec.push_back(line);
    }
    return file_vec;
}

I have searched for ways to optimize file reading, and I see that using mmap may improve file read speed.

Fast textfile reading in c++

My question is: what is the best parsing strategy for a .STL file?

Chris
  • The best way is undoubtedly to find an appropriate library. A good parsing routine will likely yield better results than attempting to speed up read access to the file data. – Paul Rooney Dec 13 '17 at 00:51
  • I've made these edits, though they don't affect performance. – Chris Dec 13 '17 at 01:12
  • Why are you importing the entire file before you parse it? You're wasting a lot of memory and probably some time too by doing that. – Mark Ransom Dec 13 '17 at 01:31

1 Answer

Without data that can be used to measure where the time is spent, it is hard to determine what actually improves the performance. A decent library that already does the job may be the easiest approach. However, the current code uses a few approaches which may offer easy wins. These are the things I spotted:

  1. The streams library is quite good at skipping leading whitespace. Instead of reading the leading spaces and then trimming them off, you may want to use std::getline(infile >> std::ws, line): the std::ws manipulator skips leading whitespace.
  2. Instead of using starts_with() with string literals, I'd rather read each line into a "command" and the tail of the line, and compare the command against std::string const objects: instead of a character comparison it may be sufficient to compare the sizes.
  3. Instead of split()ing a std::string into a std::vector<std::string> on whitespace, I'd rather reset a suitable stream (probably a std::istringstream, but possibly a custom memory stream to prevent copying) and read directly from that:

    std::istringstream in; // declared outside the reading loop
    // ...
    point p;
    in.clear(); // get rid of potentially existing errors
    in.str(line);
    if (in >> p.x >> p.y >> p.z) {
        facet_points.push_back(p);
    }
    

    This approach has the added advantage of allowing format checking: I always distrust any input received, even when it is from a trusted source.

  4. If you insist on adjusting the character sequence and/or splitting it into subsequences, I'd strongly recommend using std::string_view (or a similar class, in case this C++17 class isn't available) to avoid moving characters around.
  5. Assuming the file is of a significant size, I'd recommend against reading the file into a std::vector<std::string> and then parsing it. Instead, I'd parse the file on the fly: this way the hot memory is immediately reused instead of moving it out of cache for later post-processing. This way dealing with an auxiliary stream (see point 3 above) can be avoided. To prevent an overly complex reading loop I'd split nested sections into appropriate functions, returning from them on closing tags. In addition I'd define input functions for structures like point to simply read them off the stream.
  6. Depending on the system you are working on, you may want to call std::ios_base::sync_with_stdio(false) before reading the file: there used to be at least one often used implementation of streams which would benefit from this call.
Dietmar Kühl