
I've got a program built with gcc 4.9.1 that uses boost::asio (Boost 1.61), and valgrind 3.12.0 occasionally reports errors on it. I'm having a hard time understanding exactly what I'm looking at and am hoping someone can help.

Valgrind traps this error:

==154023== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==154023==    at 0x8FB3C6D: ??? (in /usr/lib64/libpthread-2.17.so)
==154023==    by 0x4F47287: boost::asio::detail::socket_ops::send(int, iovec const*, unsigned long, int, boost::system::error_code&) (socket_ops.ipp:1170)
  /* many levels elided */
==154023==  Address 0xe01cea4 is 9,396 bytes inside a block of size 524,304 alloc'd
==154023==    at 0x4C2AAFB: operator new[](unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:466)

In the attached debugger, I look at the call into sendmsg:

(gdb) print msg
$1 = {
  msg_name = 0x0, 
  msg_namelen = 0, 
  msg_iov = 0x101c93c0, 
  msg_iovlen = 2, 
  msg_control = 0x0, 
  msg_controllen = 0, 
  msg_flags = 0
}
(gdb) print msg.msg_iov[0]
$2 = {
  iov_base = 0xe01cdf0, 
  iov_len = 198
}
(gdb) print &msg.msg_iov[0]
$3 = (iovec *) 0x101c93c0

If I am reading valgrind's complaint right, it thinks the memory at 0xe01cea4 is uninitialized. But that memory has been initialized. If I examine the 198 bytes of memory starting at 0xe01cdf0, I see my outgoing message exactly as expected. 0xe01cea4 is 180 bytes into that region, so it lies inside the message I wrote. Either I'm misunderstanding what valgrind is telling me, or this is a false positive. I'm inclined to believe it's my fault.

Furthermore, I'm wondering why valgrind is interested in that 0xe01cea4 address anyway. I can't find any other data structure using that value. This makes me think it's valgrind.

Lastly, this code runs many times, but valgrind only complains occasionally, which leads me back toward thinking there is something strange going on in my code after all.

Any light anyone can shed on what valgrind is telling me here would be appreciated.

John S
  • If you post the code that initialized this memory, then I might be able to give some more useful advice. – dbeer Apr 17 '18 at 19:34
  • Please [edit] your question to provide a [mcve]. – Baum mit Augen Apr 17 '18 at 19:34
  • I'm just asking for help with the message from valgrind. – John S Apr 17 '18 at 19:36
  • Is that the *full* output of Valgrind? We really need to see it all together with the MCVE that caused it. Hint: The error is *not* in either Boost or the system functions. – Some programmer dude Apr 17 '18 at 19:36
  • @Someprogrammerdude no, that is not the full output. In fact, it's so much clipped at top level that it doesn't say /anything/ anymore. – sehe Apr 17 '18 at 20:36
  • @JohnS See these posts for hints on how to read valgrind diagnostics, and you'll see why posting "just the tip" [sic] is not helpful: https://stackoverflow.com/questions/17899508/corrupted-double-linked-list/17899692#17899692, https://stackoverflow.com/questions/14952637/valgrind-conditional-jump-or-move-depends-on-uninitialised-values-does-this/14952667#14952667, https://stackoverflow.com/questions/12117720/reading-file-corrupted-data/12117852#12117852 – sehe Apr 17 '18 at 20:40
  • Also in case of uninitialised byte error, using --track-origins=yes might help to understand where the uninit byte is coming from – phd Apr 17 '18 at 21:08
  • Thank you @sehe. Those are useful links, the kind of thing I was looking for. – John S Apr 17 '18 at 21:35

1 Answer


Thanks to some helpful links from @sehe and some further googling, I figured out the problem. The buffer that sendmsg was reading had been written to, but under rare circumstances some of the data copied into it was itself uninitialized. I wasn't aware that valgrind tracks uninitialized state through copies like that.
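To illustrate the kind of thing that can trigger this, here is a minimal sketch, not the code from my application. It assumes a hypothetical `Record` struct whose padding bytes are never written. Copying the whole object into the send buffer also copies the "undefined" status of those bytes, and Memcheck reports it only when the buffer reaches sendmsg. The names `Record`, `append_raw`, and `append_fields` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical message type, not taken from the original program.
// Most ABIs insert 3 padding bytes after `kind` so that `value` is aligned.
struct Record {
    std::uint8_t  kind;
    std::uint32_t value;
};

// Risky pattern: copies the whole object, padding included. If the padding
// (or a field that is set only on some code paths) was never written,
// Memcheck carries the "undefined" mark into `out` and reports it later,
// when those bytes are handed to a system call such as sendmsg().
void append_raw(std::vector<unsigned char>& out, const Record& r) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&r);
    out.insert(out.end(), p, p + sizeof r);
}

// Safer pattern: write each field explicitly, so every byte placed in the
// outgoing buffer comes from a value that was actually assigned.
void append_fields(std::vector<unsigned char>& out, const Record& r) {
    const std::size_t before = out.size();
    out.push_back(r.kind);
    unsigned char v[sizeof r.value];
    std::memcpy(v, &r.value, sizeof v);  // host byte order, for brevity
    out.insert(out.end(), v, v + sizeof v);
    assert(out.size() == before + 1 + sizeof r.value);
}
```

With `append_fields` the serialized record is 5 bytes and contains no padding, so there is nothing undefined for valgrind to find. Zeroing the struct with `std::memset` before filling it is another option if the raw layout has to go on the wire.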

For the record, neither the full stack trace leading into the system call nor the one for the operator new[] allocation showed where the underlying problem was. That buffer was filled on a different thread that had since finished executing. I suppose I might have been able to find it with --track-origins=yes, but unfortunately valgrind spins up to 100% CPU and never terminates when I try that on this application.
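When --track-origins=yes is not usable, one alternative is to ask Memcheck to check the data at the point where it is copied into the buffer, so the report comes with the stack of the thread doing the copy. The sketch below uses the `VALGRIND_CHECK_MEM_IS_DEFINED` client request from `<valgrind/memcheck.h>`. The helper `append_checked` is hypothetical, and the fallback macro is only there so the code still builds on machines without the Valgrind headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif
#ifndef HAVE_MEMCHECK_H
// Fallback when the Valgrind headers are not installed: the check is a no-op.
#  define VALGRIND_CHECK_MEM_IS_DEFINED(addr, len) ((void)(addr), (void)(len), 0)
#endif

// Hypothetical helper: appends `len` bytes from `src` to the outgoing buffer.
// Under Memcheck, the client request reports an error right here (with this
// thread's stack trace) if any source byte is undefined, and evaluates to the
// address of the first such byte. It evaluates to 0 if all bytes are defined
// or if the program is not running under Valgrind.
bool append_checked(std::vector<unsigned char>& out, const void* src, std::size_t len) {
    assert(src != nullptr || len == 0);
    const bool defined = (VALGRIND_CHECK_MEM_IS_DEFINED(src, len) == 0);
    const unsigned char* p = static_cast<const unsigned char*>(src);
    out.insert(out.end(), p, p + len);
    return defined;
}
```

Calling something like this wherever the worker thread fills the buffer should move the complaint from the sendmsg call to the copy that introduced the bad bytes, which is where the useful stack trace is.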

John S
  • If using --track-origins=yes causes an infinite loop, then either it is a bug in valgrind or in your application. You might see if your application progresses using gdb+vgdb and/or file a bug on valgrind bugzilla – phd Apr 18 '18 at 22:05