3

I'm debugging a Qt program with strace, and the open() call shows:

open("../libPlayCtrl.so", O_RDONLY|O_CLOEXEC)

When it returns 3, things seem to work, but when it returns 25 they do not, and libPlayCtrl.so is not loaded.

What's the difference? And how can I fix it?

The .so file is a third-party library. It is not the only one: I also use other third-party libraries from the same vendor. For some of those, open(...) returns 3, and they seem to work fine.

  • Platform: Ubuntu 12.04 , 32bit.
  • Qt4.8
  • QtCreator 2.4.1
  • Compiler: GCC

EDIT:

Below is part of the strace output. Because I changed the configuration, the location of the .so file is different. The .so file that loads successfully is a newer version of the library from the vendor.

Success case: 15 failed open() calls in total before it finally found the .so file.

open("../lib/tls/i686/sse2/cmov/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/tls/i686/sse2/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/tls/i686/cmov/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/tls/i686/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/tls/sse2/cmov/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/tls/sse2/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/tls/cmov/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/tls/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/i686/sse2/cmov/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/i686/sse2/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/i686/cmov/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/i686/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/sse2/cmov/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/sse2/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/cmov/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("../lib/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = 3

Failure case: the ellipsis stands for about 95 open() calls that all returned -1 (not found). As you can see, this time the return value was 30 when it finally found the .so file.

The program then showed an error from the library (or possibly from somewhere else): "Failed to load player SDK".

.....
21:02:33 open("./sse2/cmov/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
21:02:33 open("./sse2/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
21:02:33 open("./cmov/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
21:02:33 open("./libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
21:02:33 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 30
21:02:33 open("/.../.../.../RemoteClient/lib/libPlayCtrl.so", O_RDONLY|O_CLOEXEC) = 30
phuclv
Henry
  • I think you should post your source code as it's difficult to speculate otherwise. When you get `3`, this means the `open` is done shortly after the program starts. When you get `25` that means fds 0-2 and 3-24 are occupied. In the latter case, what's being opened on these units [and why]? Also, posting some `strace` output for both cases might help as well. – Craig Estey Feb 24 '16 at 01:53
  • [Correlation does not imply causation](https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation). If the return result from [open()](http://linux.die.net/man/2/open) is -1, then it means you had an error (and you get that with [`errno`](http://stackoverflow.com/questions/1546882/)). But if you get a non-negative integer back, the call succeeded and it's a "file handle"...essentially to be treated as a black box. The OS can theoretically give you any number it wants. Reduce your problem to an [MCVE](http://stackoverflow.com/help/mcve) that others can compile and see the same issue. – HostileFork says dont trust SE Feb 24 '16 at 01:55
  • *"due to I changed the configuration, the location of .so file is different. And the succeeded .so file is a newer version of the lib from the vender."* Um...okay, so why would you suspect the 30 vs. 3 has more to do with the problem than perhaps containing what could be completely different code? strace is not an effective debugging tool for this kind of problem, you'd need to really use a debugger, moreover different lib versions often (or really, usually) *cannot* work together. – HostileFork says dont trust SE Feb 24 '16 at 02:36
  • @HostileFork The reason is that the older .so file used to work in another, similar example program (almost the same code in this specific part), and I use it in my program. But after some alterations to both programs (Ubuntu also had some updates), not to the lib, it stopped working. I also looked at the manuals of both the new and the old SDK and didn't find any change to this lib file. So I guess the old .so file should work. – Henry Feb 24 '16 at 02:45
  • @HostileFork I forgot exactly what I did to both programs, maybe something in the configuration (but NOT the code). Put simply, the older .so file may be failing because I changed the configuration of the program, not the code. (And the new one works with the same code.) – Henry Feb 24 '16 at 02:52
  • @HostileFork But does the configuration affect the process after the program has already found the path to the lib? If yes, I don't know what to do to make a better .pro, or how to modify the project configuration.... – Henry Feb 24 '16 at 02:56
  • @Henry It's okay to ask about the influence of configurations over things--hard things about different versions of libraries, or whatever. But you have to be very specific about the before and after. Create a sort of "lab" where you start from the beginning when everything works, make the changes in a controlled way. It has to be done so that others can follow along. Provide each step, each config, *minimally* to generate the problem. But the weirder what you do is--and the further it is from just "programming"--the less likely people will be able to help (or less likely to want to). – HostileFork says dont trust SE Feb 24 '16 at 02:57
  • @CraigEstey I checked fd 21, 22... at least these 2 fds haven't been assigned. fd 23 was occupied by opening a txt file. – Henry Feb 24 '16 at 03:04
  • @HostileFork Right now I'm gonna change the old to new one and try, your way is good but I don't have much time now. – Henry Feb 24 '16 at 03:07
  • @Henry If you create a scenario where others can see what configuration you have, then make that a new question. But for this question, it would be best to accept Sam's answer... good to keep the questions focused, one at a time, and finished up if they can be. – HostileFork says dont trust SE Feb 24 '16 at 03:16

1 Answer

6

Go to the manual page for the open() system call, and you will find the explanation that the return value from open() is the number of the new, opened file descriptor.

After investing some more time in Google, you are certain to find an explanation that when a new file is opened, the kernel assigns the lowest available, unused, file descriptor to the opened file. That's all.
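The lowest-available rule is easy to observe directly. The following is a minimal Python sketch, not taken from the question's program; it assumes a POSIX system and a single-threaded process, and it opens the null device only as a convenient file that always exists.

```python
import os

# Open three descriptors; each open() gets the lowest number that is free
# at that moment, so the values come back in increasing order.
fds = [os.open(os.devnull, os.O_RDONLY) for _ in range(3)]
print("first three opens:", fds)

# Free the lowest of them, then open again: the kernel hands the same
# number back, because it is once more the lowest unused descriptor.
os.close(fds[0])
reused = os.open(os.devnull, os.O_RDONLY)
print("after closing", fds[0], "the next open returned", reused)

# Clean up the descriptors that are still open.
for fd in (reused, fds[1], fds[2]):
    os.close(fd)
```

The exact numbers printed depend on what the process already has open, which is the same reason the question sees 3 in one run and 25 or 30 in another.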

In conclusion, it doesn't matter whether open() returned 3 or 25, or 17, or 8. They all indicate that the file was opened successfully, and your suspicion that a different non-zero value indicates a problem of some kind is incorrect.

You may certainly have a problem of some kind with your application, but it does not have any direct relationship with this specific return value from open().
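A successful open() only means the file could be opened; the dynamic loader can still reject the library afterwards, for example because of an unresolved dependency. One way to see the loader's own error text is to try loading the file by hand. The sketch below uses Python's `ctypes` for that; the helper name `try_load` and the example path are hypothetical, and it assumes the Python interpreter has the same architecture as the library (a 32-bit .so cannot be loaded into a 64-bit process).

```python
import ctypes


def try_load(path):
    """Return None if the shared library loads, else the loader's message."""
    try:
        ctypes.CDLL(path)  # on Linux this goes through dlopen()
    except OSError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Hypothetical path, for illustration only; substitute the real location.
    message = try_load("./lib/libPlayCtrl.so")
    print("loaded fine" if message is None else "load failed: " + message)
```

If the message names another missing library or an undefined symbol, that points at the real problem far more directly than the descriptor number does.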

Now, the fact itself that sometimes you see open() returning 3, and sometimes 25 -- that indicates that there are several ways to reach this particular point in your application execution: with either no files open, other than the standard input, output, and error; or with at least 22 additional files, of some kind, opened already. It's entirely plausible that in the latter case your application did a lot more apparent work, involving opening 22 or more files before loading this library, and encountered a problem of some kind. But the actual problem itself has absolutely nothing to do with this specific system call.
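To find out what actually occupies the lower descriptors, Linux exposes each process's descriptor table under `/proc/<pid>/fd`. The following sketch reads it from Python; the function name `list_open_fds` is made up for this example, and it assumes a Linux system and permission to inspect the target process.

```python
import os


def list_open_fds(pid="self"):
    """Map each open descriptor number of a process to what it refers to."""
    fd_dir = "/proc/{}/fd".format(pid)
    table = {}
    for name in os.listdir(fd_dir):
        try:
            table[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor used to scan the directory is gone by now.
            continue
    return table


if __name__ == "__main__" and os.path.isdir("/proc/self/fd"):
    for fd, target in sorted(list_open_fds().items()):
        print(fd, "->", target)
```

Passing the PID of the running Qt program instead of `"self"` would show which files, sockets, or pipes hold the descriptors below the one returned for the library.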

Sam Varshavchik