
With the following function, my debugger shows __stack_chk_fail when the function finishes.

My system is macOS.

According to the references I checked, this error means the stack has been overflowed.

Also, in my experiments, setting vocab_size = 30000 triggers the __stack_chk_fail error, but vocab_size = 20000 works fine.

So I believe

vocab = (struct vocab_word *)malloc ((size_t) ((vocab_size + 1) * sizeof(struct vocab_word)));

is the issue. But malloc allocates memory on the heap rather than on the stack, so I am wondering where I went wrong.

void populate_vocab(){
    FILE *fin;
    fin = fopen(word_file, "rb");
    vocab = (struct vocab_word *)malloc ((size_t) ((vocab_size + 1) * sizeof(struct vocab_word)));
    char word[MAX_STRING];
    int word_idx = 0;
    int num = 0;
    boolean word_mode = 1;
    long long cur_vocab_size = 0;

    while (!feof(fin)) {
        ch = fgetc(fin);

        if(ch == ' '){
            word_mode = 0;
        }else if(ch == '\n'){
            word_mode = 1;
            word[word_idx] = 0;
            vocab[cur_vocab_size].word = (char *)calloc(word_idx, sizeof(char));
            strcpy(vocab[cur_vocab_size].word,word);
            vocab[cur_vocab_size].cn = num;
            cur_vocab_size++;
            if (cur_vocab_size >= vocab_size){
                break;
            }
            //fresh var
            word_idx = 0;
            num = 0;

        }else{
            if(word_mode){
                word[word_idx] = ch;
                word_idx ++;
            }else{
                num = num * 10;
                num += ch - '0';
            }
        }
    }
    fclose(fin);
}
Marco Bonelli
Sanqiang Zhao
  • Please pick a language, either `C` or `C++`. If it's C++, scrap all of this and simply use `std::vector` along with `std::istringstream`. – PaulMcKenzie Jul 17 '16 at 05:19
  • You should read [Why is “while ( !feof (file) )” always wrong?](http://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong). – Some programmer dude Jul 17 '16 at 05:23
  • [Don't cast the result of `malloc`](http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc) and [don't do `while(!feof(..))`](http://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong) – n. m. could be an AI Jul 17 '16 at 05:28
  • You never verify that word_idx < MAX_STRING. So this code can easily corrupt the stack frame and trigger this diagnostic. – Hans Passant Jul 17 '16 at 09:58
  • @HansPassant you are correct, one of my words exceeds MAX_STRING, which caused the stack overflow. – Sanqiang Zhao Jul 17 '16 at 15:06

2 Answers


Based on the comments, I figured out the reason. One of the words in my input is longer than MAX_STRING, so writing it into the local word buffer overflowed the stack.
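For illustration, here is a minimal sketch of how the read loop could be bounded so that an overlong word cannot write past the local buffer. It is not the original code: the helper names append_char, copy_word and read_entry are hypothetical, MAX_STRING = 100 is an assumed value (the question does not show it), and truncating overlong words is just one possible policy. It also tests the result of fgetc against EOF instead of looping on feof, and allocates len + 1 bytes so the terminating NUL fits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STRING 100  /* assumed value; the question does not show it */

/* Append ch to buf only if room remains for it and a terminating NUL.
   Returns 1 if the character was stored, 0 if it was dropped. */
int append_char(char *buf, size_t cap, size_t *len, int ch)
{
    assert(cap > 0);
    if (*len + 1 >= cap)
        return 0;
    buf[(*len)++] = (char)ch;
    return 1;
}

/* Copy the first len bytes of src into a fresh heap string of len + 1 bytes. */
char *copy_word(const char *src, size_t len)
{
    char *dst = malloc(len + 1);
    if (dst != NULL) {
        memcpy(dst, src, len);
        dst[len] = '\0';
    }
    return dst;
}

/* Read one "word count" line from fin. Returns a heap copy of the word,
   truncated to MAX_STRING - 1 characters, and stores the count in *count.
   Returns NULL at end of input or if the allocation fails. */
char *read_entry(FILE *fin, long *count)
{
    char word[MAX_STRING];
    size_t len = 0;
    int in_word = 1, seen = 0, ch;

    *count = 0;
    while ((ch = fgetc(fin)) != EOF) {   /* test the read itself, not feof() */
        seen = 1;
        if (ch == '\n')
            break;
        if (ch == ' ')
            in_word = 0;
        else if (in_word)
            (void)append_char(word, sizeof word, &len, ch);  /* excess is dropped */
        else if (ch >= '0' && ch <= '9')
            *count = *count * 10 + (ch - '0');
    }
    return seen ? copy_word(word, len) : NULL;
}
```

A caller would loop on read_entry until it returns NULL or vocab_size entries have been stored, and free each returned string when done. Rejecting an overlong line instead of truncating it would be an equally reasonable choice.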

Sanqiang Zhao

I recommend running your crashing program under Valgrind or AddressSanitizer. On recent macOS, only AddressSanitizer is available.

The stack trace after a crash in __stack_chk_fail only tells you where the problem (a buffer overflow that smashed the stack) was detected, which is when the affected function returns. AddressSanitizer can report the overflow at the moment it happens.

To use AddressSanitizer, compile with a recent clang or gcc and these flags:

clang -fsanitize=address -fno-omit-frame-pointer -O1 -g hello.c

A report from AddressSanitizer can look like this:

$ ../cmake-build-debug/cpp/examples/broker 
broker listening on 5672
=================================================================
==42793==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x70000c4bcbe0 at pc 0x0001013663e1 bp 0x70000c4bc830 sp 0x70000c4bc828
WRITE of size 8 at 0x70000c4bcbe0 thread T3
    #0 0x1013663e0 in pni_split_mechs sasl.c:443
    #1 0x1013646ea in pni_post_sasl_frame sasl.c:480
    #2 0x101357fad in pn_output_write_sasl sasl.c:677
    #3 0x101323909 in transport_produce transport.c:2751
    #4 0x10131ffd3 in pn_transport_pending transport.c:3030
    #5 0x1012b8755 in pn_connection_driver_write_buffer connection_driver.c:120
    #6 0x10120240f in leader_process_pconnection libuv.c:909
    #7 0x1011f8b48 in leader_lead_lh libuv.c:1008
    #8 0x1011f94f3 in pn_proactor_wait libuv.c:1062
    #9 0x10188c55d in proton::container::impl::thread() proactor_container_impl.cpp:753
    #10 0x1018bca31 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (proton::container::impl::*)(), proton::container::impl*> >(void*) thread:352
    #11 0x7fff6987f2ea in _pthread_body (libsystem_pthread.dylib:x86_64+0x32ea)
    #12 0x7fff69882248 in _pthread_start (libsystem_pthread.dylib:x86_64+0x6248)
    #13 0x7fff6987e40c in thread_start (libsystem_pthread.dylib:x86_64+0x240c)

Address 0x70000c4bcbe0 is located in stack of thread T3 at offset 192 in frame
    #0 0x101363ccf in pni_post_sasl_frame sasl.c:462

  This frame has 3 object(s):
    [32, 48) 'out' (line 464)
    [64, 192) 'mechs' (line 475) <== Memory access at offset 192 overflows this variable
    [224, 228) 'count' (line 478)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
Thread T3 created by T0 here:
    #0 0x101f5dadd in wrap_pthread_create (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x56add)
    #1 0x1018bc4ab in std::__1::thread::thread<void (proton::container::impl::*)(), proton::container::impl*, void>(void (proton::container::impl::*&&)(), proton::container::impl*&&) thread:368
    #2 0x10188da97 in proton::container::impl::run(int) proactor_container_impl.cpp:802
    #3 0x100f0223c in main broker.cpp:427
    #4 0x7fff6968b3d4 in start (libdyld.dylib:x86_64+0x163d4)

SUMMARY: AddressSanitizer: stack-buffer-overflow sasl.c:443 in pni_split_mechs
Shadow bytes around the buggy address:
  0x1e0001897920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1e0001897930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1e0001897940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1e0001897950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1e0001897960: 00 00 00 00 f1 f1 f1 f1 00 00 f2 f2 00 00 00 00
=>0x1e0001897970: 00 00 00 00 00 00 00 00 00 00 00 00[f2]f2 f2 f2
  0x1e0001897980: 04 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x1e0001897990: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1e00018979a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1e00018979b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1e00018979c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==42793==ABORTING
Abort trap: 6
user7610