I am using the SLM toolkit by CMU-Cambridge for some baseline language modeling on language data, but when I run one of the built executables, my system detects a buffer overflow when it tries to execute one of the commands.
Based on this StackOverflow question I noticed that __gets_chk+0x179
caused the problem, and I've found two occurrences of gets/fgets
in the source code (evallm.c, also available in this GitHub project someone made) but I do not know how to fix them in a proper/secure way.
The relevant parts of the error message:
*** buffer overflow detected ***: /home/CMU-Cam_Toolkit_v2/bin/evallm terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__gets_chk+0x179)[0x7f613bc719e9]
Aborted
The broken code
# declaration of input string variable
char input_string[500];
# occurence 1
...
while (fgets (wlist_entry, sizeof (wlist_entry),context_cues_fp)) { ... }
...
# occurence 2
...
while (!feof(stdin) && !told_to_quit) {
printf("evallm : ");
gets(input_string);
....
The error basically occurred when the input_string I gave to the evallm
command was too long. Normally it is to be called from the command line and you can interactively pass arguments. However, I piped all arguments together with the command (as seen in the example of the docs) but apparently sometimes my argument names where taking too much bytes. When I changed the array length of input_string from 500 to 2000 the problem was solved (so I guess the error was due to occurence 2). But I really would like to fix it by replacing gets()
by getline()
since it seems to be the right way to go. Or is replacing it by fgets()
also a solution? If so, what parameters should I use?
However, when trying to replace gets()
, I always get compiling errors. I'm not a C-programmer (Python, Java) and I'm not familiar with the syntax of getline()
, so I'm struggling to find the right parameters.