
We have a C++ project that contains several large static data tables (arrays of structs), generated by a preprocessing tool and compiled into the project. We've been using VC++ 2008 up to now, but are preparing to move to 2010, and these data tables are suddenly taking a very long time to compile.

As an example, one such table has about 3,000 entries, each of which is a struct containing several ints and pointers, all initialized statically. This one file took ~15 seconds to compile in VC++ 2008, but is taking 30 minutes in VC++ 2010!

As an experiment, I tried splitting this table evenly into 8 tables, each in its own .cpp file, and they compile in 20-30 seconds each. This makes me think that something inside the compiler is O(n^2) in the length of these tables.

Memory usage for cl.exe plateaus at around 400 MB (my machine has 12 GB of RAM), and I do not see any I/O activity once it plateaus, so I believe this is not a disk caching issue.

Does anyone have an idea what could be going on here? Is there some compiler feature I can turn off to get back to sane compile times?

Here is a sample of the data in the table:

//  cid (0 = 0x0)
{
    OID_cid,
    OTYP_Cid,
    0 | FOPTI_GetFn,
    NULL,
    0,
    NULL,
    (PFNGET_VOID) static_cast<PFNGET_CID>(&CBasic::Cid),
    NULL,
    CID_Basic,
    "cid",
    OID_Identity,
    0,
    NULL,
},

//  IS_DERIVED_FROM (1 = 0x1)
{
    OID_IS_DERIVED_FROM,
    OTYP_Bool,
    0 | FOPTI_Fn,
    COptThunkMgr::ThunkOptBasicIS_DERIVED_FROM,
    false,
    NULL,
    NULL,
    NULL,
    CID_Basic,
    "IS_DERIVED_FROM",
    OID_Nil,
    0,
    &COptionInfoMgr::s_aFnsig[0],
},

//  FIRE_TRIGGER_EVENT (2 = 0x2)
{
    OID_FIRE_TRIGGER_EVENT,
    OTYP_Void,
    0 | FOPTI_Fn,
    COptThunkMgr::ThunkOptBasicFIRE_TRIGGER_EVENT,
    false,
    NULL,
    NULL,
    NULL,
    CID_Basic,
    "FIRE_TRIGGER_EVENT",
    OID_Nil,
    0,
    NULL,
},

//  FIRE_UNTRIGGER_EVENT (3 = 0x3)
{
    OID_FIRE_UNTRIGGER_EVENT,
    OTYP_Void,
    0 | FOPTI_Fn,
    COptThunkMgr::ThunkOptBasicFIRE_UNTRIGGER_EVENT,
    false,
    NULL,
    NULL,
    NULL,
    CID_Basic,
    "FIRE_UNTRIGGER_EVENT",
    OID_Nil,
    0,
    NULL,
},

As you can see, it includes various ints and enums as well as a few literal strings, function pointers and pointers into other static data tables.

Nathan Reed
  • As a workaround, you might try this hack for implementing arrays without arrays that I wrote a long time ago: http://bytes.com/topic/c/answers/527211-code-puzzle-implementing-arrays-without-arrays-heap – Joseph Garvin Oct 25 '11 at 18:15
  • Have you determined if this time is spent in actual compilation, or in linking, or distributed between the two? – Michael Price Oct 25 '11 at 18:20
  • Thanks @JosephGarvin, but I'm pretty sure nesting templates 3,000 deep is not going to improve our compile times. :) – Nathan Reed Oct 25 '11 at 18:20
  • @MichaelPrice It's just compilation that is taking ridiculously long, and just for these large data tables (not for ordinary .cpp files with actual code). Linking our project is no worse in VS2010 than before. – Nathan Reed Oct 25 '11 at 18:21
  • It might not be an `O(n^2)` thing so much as a caching thing. If it can't hold it all in memory it sends some to disk... major slowdown, which doesn't come up when it's one eighth the size. – Mooing Duck Oct 25 '11 at 18:24
  • @MooingDuck: That would be `O(N^2)` though, if the number of disk accesses is `O(N)` and the quantity of data written is also `O(N)` – Ben Voigt Oct 25 '11 at 18:29
  • @MooingDuck: Good idea, but I looked at memory and I/O usage in task manager and I don't think it's a disc caching issue. I edited the question to include the additional info. – Nathan Reed Oct 25 '11 at 18:32
  • Submit the file with a complaint to http://connect.microsoft.com/ – Mooing Duck Oct 25 '11 at 18:38
  • Can we see a small sample so we might speculate as to what went berserk? Or is that too confidential? – Mooing Duck Oct 25 '11 at 18:39
  • @MooingDuck: sure. I've added a code sample to the question. – Nathan Reed Oct 25 '11 at 18:56
  • @NathanReed: You wouldn't expect a 3,000 size array to slow compiles either. The idea is that it may go through a separate code path in the compiler ;) Like I said, it's a possible temporary workaround, not a real solution. – Joseph Garvin Oct 25 '11 at 19:17
  • Are you doing an optimised build? Might be worth turning off all optimisation on this file (it's not going to buy you anything anyway) in case it's the optimiser that is going N^2. – Alan Stokes Oct 25 '11 at 20:15
  • @AlanStokes OMG, that was it! I was sure I'd tried that already, but I must not have. With optimization disabled it's back down to ~15 secs to compile. :D If you want to post that as an answer, I'll accept it. – Nathan Reed Oct 25 '11 at 20:52

5 Answers


Might be worth turning off all optimisation on this file (it's not going to buy you anything anyway) in case it's the optimiser that is going N^2.
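
If you don't want to give up optimisation for the whole project, it can be scoped to just this file. A sketch (the table name and element type are placeholders, not from the question):

#pragma optimize("", off)  // turn the optimiser off for what follows

const OptionInfo g_optionTable[] =
{
    // ... ~3,000 statically initialized entries ...
};

#pragma optimize("", on)   // restore the command-line settings

Note that #pragma optimize is documented to apply to functions, so if the time is going into the table's initializer itself, the surer route is the per-file project setting: C/C++ > Optimization > Disabled (/Od) on just this one .cpp.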

Alan Stokes
  • I can't believe I didn't try this already, but with optimization turned off it's back down to ~15 seconds to compile. Thanks! – Nathan Reed Oct 25 '11 at 21:00

I had this very same problem. There was a const array of data that had approximately 40,000 elements. Compile time was about 15 seconds. When I changed from `const uint8_t pData[] = { ... }` to `static const uint8_t pData[] = { ... }`, compile time went down to less than 1 second.
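
For illustration, the whole change was one keyword (names and types mirror the ones quoted above; the "before" form is commented out so the snippet compiles):

#include <stdint.h>

// Before -- about 15 seconds to compile in this case:
// const uint8_t pData[] = { /* approx 40,000 byte initializers */ };

// After -- under 1 second; the only difference is the static keyword:
static const uint8_t pData[] = { 0x00, 0x01 /* ..., approx 40,000 entries */ };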

Marco

I've seen (can't remember where) a technique for converting large static data directly into object files. Your C++ code then declares the array as extern, and the linker matches the two together. That way the array data never undergoes a compilation step at all.

The Microsoft C/C++ tool CVTRES.exe worked on a similar principle, but instead of generating symbols it produced a separate resource section that needed special APIs to access (FindResource, LoadResource, LockResource).

Ahh, here's one of the tools I remembered finding: bin2coff. The author has a whole bunch of related tools.
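
For reference, the C++ side of that pattern is tiny; all the heavy data lives in the tool-generated object file that the linker resolves. A sketch with hypothetical symbol names (bin2coff, for example, lets you choose the label the object file defines):

// OptionTable.h -- declarations only; the definitions come from a
// tool-generated .obj linked into the project, so the initializer
// data never passes through the compiler at all.
#include <stddef.h>

struct OptionInfo;  // element type as in the question, defined elsewhere

extern "C" const OptionInfo g_optionTable[];  // hypothetical label
extern "C" const size_t g_optionTableSize;    // companion size symbol, if the tool emits one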


Alternatively, you can try to reduce the dependencies, so that the particular source file never needs recompilation. Then minimal rebuild will automatically use the existing .obj file. Maybe even check that .obj file into source control.

Ben Voigt
  • One technique might be to move it to its own project (a static library) and only rebuild it when it changes. – Mooing Duck Oct 25 '11 at 18:23
  • The files are already not being rebuilt if they're unchanged. The trouble is, we actually do change the data in them pretty frequently. :/ – Nathan Reed Oct 25 '11 at 18:33

You can try turning off Pure MSIL CLR support in the C/C++ settings. Worked for me.


Try making your array static const; in a similar case I have seen, this reduced compile time (but not file size) to next to nothing.

Janosch
  • It is already declared const. It can't be static because it needs to be visible to the rest of the project (external linkage). – Nathan Reed May 11 '12 at 16:18
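
A possible middle ground for that constraint, sketched with hypothetical names (and not verified to restore the fast compile in this case): keep the big aggregate at internal linkage, and give the rest of the project an extern pointer and element count instead.

#include <stddef.h>

// In the table's .cpp file the array itself stays static...
static const OptionInfo s_optionTable[] = { /* ... entries ... */ };

// ...and other translation units link against these definitions
// (declared in a header as extern const OptionInfo* const g_optionTable;
//  and extern const size_t g_optionTableCount;).
extern const OptionInfo* const g_optionTable = s_optionTable;
extern const size_t g_optionTableCount =
    sizeof s_optionTable / sizeof s_optionTable[0];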