I found this article on the efficiency of std::vector::push_back; the associated code can be found here. I tried it myself and got an "Illegal instruction (core dumped)" error. gdb indicates that the error occurs on line 37.

I compiled using gcc 4.7.2, on a computer with:

$ sudo dmidecode -t processor
# dmidecode 2.11
SMBIOS 2.5 present.

Handle 0x0400, DMI type 4, 40 bytes
Processor Information
    Socket Designation: CPU
    Type: Central Processor
    Family: Core 2 Duo
    Manufacturer: Intel
    ID: 7A 06 01 00 FF FB EB BF
    Signature: Type 0, Family 6, Model 23, Stepping 10
    Flags:
        FPU (Floating-point unit on-chip)
        VME (Virtual mode extension)
        DE (Debugging extension)
        PSE (Page size extension)
        TSC (Time stamp counter)
        MSR (Model specific registers)
        PAE (Physical address extension)
        MCE (Machine check exception)
        CX8 (CMPXCHG8 instruction supported)
        APIC (On-chip APIC hardware supported)
        SEP (Fast system call)
        MTRR (Memory type range registers)
        PGE (Page global enable)
        MCA (Machine check architecture)
        CMOV (Conditional move instruction supported)
        PAT (Page attribute table)
        PSE-36 (36-bit page size extension)
        CLFSH (CLFLUSH instruction supported)
        DS (Debug store)
        ACPI (ACPI supported)
        MMX (MMX technology supported)
        FXSR (FXSAVE and FXSTOR instructions supported)
        SSE (Streaming SIMD extensions)
        SSE2 (Streaming SIMD extensions 2)
        SS (Self-snoop)
        HTT (Multi-threading)
        TM (Thermal monitor supported)
        PBE (Pending break enabled)
    Version: Not Specified
    Voltage: 1.2 V
    External Clock: 1333 MHz
    Max Speed: 5200 MHz
    Current Speed: 3000 MHz
    Status: Populated, Enabled
    Upgrade: Socket LGA775
    L1 Cache Handle: 0x0700
    L2 Cache Handle: 0x0701
    L3 Cache Handle: Not Provided
    Serial Number: Not Specified
    Asset Tag: Not Specified
    Part Number: Not Specified
    Core Count: 2
    Core Enabled: 2
    Thread Count: 2
    Characteristics:
        64-bit capable

What is the problem here, and how can I get this code to work? I also tried compiling with icpc 13.1.0, but that failed as well.

Edit: I'm using Ubuntu 12.10 64-bit.

NPE
mkm
  • Works fine for me with gcc 4.2 on a Core i7 - what OS are you using? – Paul R Apr 12 '13 at 09:30
  • The error on line 37 doesn't mean the error is precisely there. You need to look at the disassembly and see which instruction exactly is causing it. – Alexey Frunze Apr 12 '13 at 09:32
  • @AlexeyFrunze thanks, so how do I do that? Also, isn't it odd that the error is not reported in the `startRDTSC` function? It contains the same asm calls as `stopRDTSC`, but it is called first and seems to work. My entire stack trace is the following: `0 in stopRDTSCP of testvector.cpp:37 1 in CPUBenchmark::stop of testvector.cpp:55 2 in main of testvector.cpp:177` – mkm Apr 12 '13 at 09:41
  • Actually, `stopRDTSC` has the line `"RDTSCP\n\t"`, which is not in `startRDTSC`; perhaps that is the problem? It seems to work when I change that to `"RDTSC\n\t"`, as in `startRDTSC`. – mkm Apr 12 '13 at 09:42

1 Answer

Your CPU doesn't support the RDTSCP instruction. On Intel processors it first appeared with the Core i7 (Nehalem) generation, and your processor is from an earlier generation (Merom-L).

You should be able to use RDTSC instead. See, for example, Difference between rdtscp, rdtsc : memory and cpuid / rdtsc?
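One way to avoid the crash without hard-coding either instruction is to ask the CPU whether it supports RDTSCP and fall back to RDTSC otherwise. The following is only a minimal sketch, assuming GCC or Clang on x86 (it relies on the `<cpuid.h>` and `<x86intrin.h>` headers those compilers ship). The helper names `hasRdtscp` and `readTimestamp` are made up for illustration and are not part of the benchmark code from the article.

```cpp
#include <cpuid.h>      // __get_cpuid (GCC/Clang, x86 only)
#include <x86intrin.h>  // __rdtsc, __rdtscp
#include <cstdint>

// Hypothetical helper: returns true if CPUID reports RDTSCP support.
// The feature flag is bit 27 of EDX in extended leaf 0x80000001.
bool hasRdtscp() {
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx)) {
        return false;  // extended leaf not available on this CPU
    }
    return ((edx >> 27) & 1u) != 0;
}

// Hypothetical helper: reads the time stamp counter, using RDTSCP only
// when the caller has confirmed that the CPU supports it.
std::uint64_t readTimestamp(bool useRdtscp) {
    if (useRdtscp) {
        unsigned int aux = 0;  // receives IA32_TSC_AUX; unused here
        return __rdtscp(&aux);
    }
    return __rdtsc();
}
```

Calling `hasRdtscp()` once at startup and passing the result to `readTimestamp` before and after the measured code should let the same binary run on both older and newer CPUs. Note that this sketch does not add the serializing `CPUID` calls that the original benchmark code uses around the counter reads.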

Community
NPE
  • Or the use of this instruction is disabled via `CR4.TSD`=1. – Alexey Frunze Apr 12 '13 at 09:46
  • @AlexeyFrunze: It's a Merom-L CPU, and predates Core i7. – NPE Apr 12 '13 at 09:48
  • @NPE Oh, I just tried this too and it seems to work. Thanks for the link. If I understand correctly, changing to `RDTSC` can carry a performance penalty; does that spoil the usefulness of this code as a profiling tool? – mkm Apr 12 '13 at 09:51
  • Perhaps someone could try it both modified and unmodified on a compatible CPU to see whether it makes a difference? If it doesn't, then perhaps it would be okay for my CPU as well. – mkm Apr 12 '13 at 09:53
  • @mkm: As long as you measure a non-trivial amount of work, you'll be fine. – NPE Apr 12 '13 at 09:54