20

I'm currently working on a large C++ Qt-based project which is about to undergo a major refactor of its public API, and it would be nice to have a tool that can generate a report on which methods have been added or removed from build to build.

I know there is a tool for Java to do this, and I think there might be one for .NET, but after a bit of searching I couldn't find anything for C++.

Does one exist? Cross-platform would be nice, but Linux-only would be fine too.

Nathan W
  • Are you using a version control tool? It may provide a diff tool that generates an XML report ... – Thomas Vincent Jul 06 '11 at 13:01
  • Using Git for version control – Nathan W Jul 06 '11 at 13:32
  • I don't work with git, but if you use what is shown in [this post](http://stackoverflow.com/questions/822811/differences-in-git-branches) on the *exposed* part of your API, I think you can get a decent report of what changed between the two versions of your API ... – Thomas Vincent Jul 06 '11 at 14:04
  • You may also want to mark the old API as deprecated rather than removing it. That way the compiler will issue a warning if such a method is still used. This is compiler-dependent, of course, but `__attribute__((__deprecated__))` will do the trick for gcc. –  Jul 11 '11 at 16:35
  • Linked: http://stackoverflow.com/questions/1969916/static-analysis-tool-to-detect-abi-breaks-in-c and http://stackoverflow.com/questions/1970296/how-to-test-binary-compatibility-automaticaly – linuxbuild Aug 31 '11 at 11:43

7 Answers

12

If you use Doxygen or some similar tool to document your API then you can diff the table-of-contents.

  • This is something you should be doing anyway.
    • (You can also tell Doxygen to find undocumented functions.)
  • You can apply it easily to ancient checkins without changing anything.
  • Doxygen and its ilk know enough about the language to distinguish private members from public ones.
  • This solution can be applied to many languages and is not dependent on a particular IDE.
  • No third-party software is required (given that you already have a documentation generator).
spraff
6

Check the bottom of the commercial list for apidiff; I think it'll be the closest match.

The suggestion of using 'nm' isn't a bad one. You can run:

nm <binary_or_lib> | c++filt

This generates a decent snapshot, though it will need a fair amount of post-processing.
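As a rough idea of what that post-processing could look like, here is a minimal Python sketch. It assumes the default `nm` line layout of address, one type letter, then the name, and treats uppercase type letters such as `T`, `D`, `B`, `R`, `W` and `V` as globally visible definitions. The function name `exported_symbols` and the sample symbols are made up for illustration.

```python
import re

# Type letters nm prints for defined, globally visible symbols (text, data,
# BSS, read-only data, weak). Lowercase letters mark local symbols.
EXPORTED_TYPES = set("TDBRWV")

# "<hex address> <type> <name>"; undefined symbols have blanks instead of an
# address, so the address part is optional. Demangled names may contain spaces.
NM_LINE = re.compile(r"^\s*(?:[0-9A-Fa-f]+)?\s+([A-Za-z?-])\s+(.*\S)\s*$")


def exported_symbols(nm_output):
    """Return the set of names that are defined and globally visible."""
    symbols = set()
    for line in nm_output.splitlines():
        match = NM_LINE.match(line)
        if match and match.group(1) in EXPORTED_TYPES:
            symbols.add(match.group(2))
    return symbols


# Hypothetical output of `nm <lib> | c++filt`, for illustration only.
sample = """\
0000000000001139 T Widget::resize(int, int)
0000000000001150 t helper()
                 U operator new(unsigned long)
0000000000004010 B Widget::instanceCount
"""

if __name__ == "__main__":
    for name in sorted(exported_symbols(sample)):
        print(name)
```

The local `helper()` and the undefined `operator new` are dropped, leaving only what the library itself exports. Note that this says nothing about C++ access levels; it only reflects linker visibility.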

There are lots of ways to roll your own on this one:

  1. Doxygen can generate XML output that has all the class / member / method information, which you could then mine for building class trees. It would then be a matter of comparing trees. Some useful post-processing scripts / utilities can be found at http://www.doxygen.nl/helpers.html

  2. If you're compiling with gcc, egypt is a novel approach that uses the intermediate RTL to produce call-dependency graphs - it seems like it wouldn't be that difficult to use a similar method to generate basic API information.

  3. GCC-XML will generate an XML representation of compiled code. It is a bit more low-level than Doxygen, as it provides a mechanism for writing wrapper code.

  4. CppHeaderParser, a Python module, will generate nice Python object representations of headers, giving an easy way to generate the API maps.

  5. ctags generates a tag database which could probably be processed. It has problems with C++ namespaces though.
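To make the Doxygen XML option a little more concrete, here is a minimal sketch in Python. It assumes Doxygen was run with `GENERATE_XML = YES` and that you have the text of the resulting `index.xml` for two builds. The helper names `api_names` and `diff_api` and the `Widget` class are hypothetical. The index lists member names only, so overloads collapse into one entry; access levels live in the per-class XML files (the `prot` attribute on `memberdef`), which this sketch does not read.

```python
import xml.etree.ElementTree as ET


def api_names(index_xml_text):
    """Collect 'Class::member' names for functions listed in a Doxygen index.xml."""
    root = ET.fromstring(index_xml_text)
    names = set()
    for compound in root.iter("compound"):
        if compound.get("kind") not in ("class", "struct"):
            continue
        owner = compound.findtext("name")
        for member in compound.iter("member"):
            if member.get("kind") == "function":
                names.add(f"{owner}::{member.findtext('name')}")
    return names


def diff_api(old_xml, new_xml):
    """Return (added, removed) member names between two index.xml texts."""
    old, new = api_names(old_xml), api_names(new_xml)
    return sorted(new - old), sorted(old - new)


# Tiny hand-written stand-ins for two builds' index.xml files.
OLD = """<doxygenindex>
  <compound refid="classWidget" kind="class"><name>Widget</name>
    <member refid="a1" kind="function"><name>resize</name></member>
    <member refid="a2" kind="function"><name>show</name></member>
  </compound>
</doxygenindex>"""

NEW = """<doxygenindex>
  <compound refid="classWidget" kind="class"><name>Widget</name>
    <member refid="a1" kind="function"><name>resize</name></member>
    <member refid="a3" kind="function"><name>hide</name></member>
  </compound>
</doxygenindex>"""

if __name__ == "__main__":
    added, removed = diff_api(OLD, NEW)
    print("added:", added)
    print("removed:", removed)
```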

Some commercial solutions

  1. SciTools' Understand does a great job of mapping software out and has a Perl API for querying its database.

  2. MagicDraw is kind of a heavy-weight tool centered around UML, but it can reverse-engineer an existing C++ code-base and generate meta-information.

  3. apidiff seems to be a pretty affordable tool and given the criteria (cross-platform, C++) is likely the closest match.

synthesizerpatel
2

Instead of allowing all the visible symbols to export from your library automatically, you can use an explicit list of exported symbols. For larger libraries this is even recommended.

In Windows you use a .DEF file to export symbols from a DLL.

In Unix-likes you use a linker script to do it.
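As an illustration, the sketch below generates both kinds of export list from one Python list of names. It assumes the Windows side uses the usual `LIBRARY` / `EXPORTS` layout of a .DEF file and that the Unix side uses a GNU ld version script (passed to the linker with `--version-script`), which is one common way of doing what is described above. The library name and the C-linkage symbol names are hypothetical; C++ symbols would need their platform-specific mangled names, which differ between the two toolchains.

```python
def def_file(library, symbols):
    """Render a Windows module-definition (.DEF) file exporting the given names."""
    lines = [f"LIBRARY {library}", "EXPORTS"]
    lines += [f"    {name}" for name in sorted(symbols)]
    return "\n".join(lines) + "\n"


def version_script(symbols):
    """Render a GNU ld version script: listed names are global, the rest local."""
    lines = ["{", "  global:"]
    lines += [f"    {name};" for name in sorted(symbols)]
    lines += ["  local:", "    *;", "};"]
    return "\n".join(lines) + "\n"


# Hypothetical extern "C" entry points of a library called "mylib".
PUBLIC_API = ["mylib_open", "mylib_close", "mylib_version"]

if __name__ == "__main__":
    print(def_file("mylib", PUBLIC_API))
    print(version_script(PUBLIC_API))
```

Keeping the list itself in version control means the export list doubles as a record of the API: a diff of that one file shows which entry points were added or removed.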

Zan Lynx
  • Exported symbols don't advertise whether they are private or public. – spraff Jul 11 '11 at 16:49
  • @spraff: Yes but how is this relevant? Because the private symbols aren't part of the public API? – Zan Lynx Jul 11 '11 at 16:51
  • @Spraff: Actually, using the explicit list of exported symbols you would exclude all private symbols. No one outside the DLL or SO needs them. – Zan Lynx Jul 11 '11 at 17:09
  • I don't think that's necessarily true. A class's definition could be across multiple object files (if, say, it inherits from a library parent class, then `protected` methods have to be exported), or it could have a `friend` function whose definition is in another object. Perhaps you and I mean slightly different things by 'exported'? – spraff Jul 12 '11 at 07:46
  • @spraff: Protected methods are part of the external interface because of inheritance. – Zan Lynx Jul 12 '11 at 13:42
  • ... and we want to *exclude* them from the results. Sorry to be picky but a symbol dump is simply not descriptive enough. – spraff Jul 12 '11 at 14:04
  • @spraff: We who? You aren't the question asker. The symbol dump *is* the public interface. Any symbols which are in there run the risk of being linked to, and future library changes that remove them will break code. – Zan Lynx Jul 12 '11 at 16:35
  • "We" as in everybody, including you. Protected methods are in the symbol dump but not the public interface. Sorry to shoot down your pet idea, but it's wrong. – spraff Jul 12 '11 at 16:40
  • @spraff: The public interface of a C++ library most definitely *does* include the protected symbols if there's any chance that library users will inherit from a library class and access those symbols. – Zan Lynx Jul 13 '11 at 13:41
  • No because those protected symbols are still not in the derived class's public interface. – spraff Jul 13 '11 at 13:49
  • @spraff: Ok, you go build a library and exclude the protected symbols from the export list. Let us know how it works out. – Zan Lynx Jul 13 '11 at 14:12
  • I'm not saying you don't need them, I'm saying they're not *public*. Protected is not public. How did this get so hard? – spraff Jul 13 '11 at 14:17
1

Add an automatic build step that uses nm on Unix-likes, or the equivalent Windows tool (dumpbin?), to dump a list of exported functions. Use some scripting language to strip off unimportant bits that change from build to build, like addresses.

After each build, commit this file to version control. Then you can see the differences between builds.

Because it is a C++ application, the mangled names encode parameter types, so parameter type changes will show up as well.
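A minimal Python sketch of the stripping and comparing steps, assuming demangled `nm` output with a leading hexadecimal address column. The names `stable_lines` and `changes` and the sample symbols are invented for illustration.

```python
import re

# Leading hex address column, which changes from build to build.
ADDRESS = re.compile(r"^[0-9A-Fa-f]+\s+")


def stable_lines(nm_output):
    """Drop the address column and sort, so the file only changes when symbols do."""
    lines = (line.strip() for line in nm_output.splitlines())
    return sorted({ADDRESS.sub("", line) for line in lines if line})


def changes(old_nm, new_nm):
    """Return (added, removed) symbol lines between two builds."""
    old, new = set(stable_lines(old_nm)), set(stable_lines(new_nm))
    return sorted(new - old), sorted(old - new)


# Hypothetical dumps from two builds; only the parameter types of resize differ.
BUILD_1 = """\
0000000000001139 T Widget::resize(int, int)
0000000000001150 T Widget::show()
"""
BUILD_2 = """\
0000000000002000 T Widget::resize(long, long)
0000000000002040 T Widget::show()
"""

if __name__ == "__main__":
    added, removed = changes(BUILD_1, BUILD_2)
    print("added:", added)
    print("removed:", removed)
```

A changed parameter list appears as one removed and one added symbol, while `Widget::show()` is unaffected even though its address moved. With GCC-style mangling the return type of an ordinary function is not part of the name, so a return-type change alone would not appear in such a dump.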

Zan Lynx
1

Try abi-compliance-checker. This tool shows added/removed symbols in your API, changes in parameters/data types and other changes from the binary compatibility point of view. It's cross-platform. The best performance is on Linux, but it's able to run on Windows and Mac too.

Usage:

abi-compliance-checker -lib NAME -old OLD.abidump -new NEW.abidump

*.abidump files are ABI dumps generated by the abi-dumper tool.
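For a build script, the dump and compare steps could be wrapped roughly as follows. This is only a sketch: it assumes both tools are on the `PATH`, that the libraries were built with debug information (abi-dumper works from it), and that `abi-dumper` accepts `-o` for the output file and `-lver` for a version label. The function names and file-naming scheme are made up here, and a non-zero exit status from the checker is simply passed back to the caller rather than interpreted.

```python
import subprocess


def abi_check_commands(name, old_lib, new_lib, old_ver="1", new_ver="2"):
    """Build the three command lines: dump old, dump new, compare the dumps."""
    old_dump = f"{name}-{old_ver}.abidump"
    new_dump = f"{name}-{new_ver}.abidump"
    return [
        ["abi-dumper", old_lib, "-o", old_dump, "-lver", old_ver],
        ["abi-dumper", new_lib, "-o", new_dump, "-lver", new_ver],
        ["abi-compliance-checker", "-lib", name, "-old", old_dump, "-new", new_dump],
    ]


def run_abi_check(name, old_lib, new_lib):
    """Run the dumps (failing loudly), then return the checker's exit status."""
    *dumps, compare = abi_check_commands(name, old_lib, new_lib)
    for command in dumps:
        subprocess.run(command, check=True)
    return subprocess.run(compare).returncode


if __name__ == "__main__":
    # Hypothetical library paths; print the commands instead of running them.
    for command in abi_check_commands("mylib", "old/libmylib.so", "new/libmylib.so"):
        print(" ".join(command))
```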

This compatibility table for the Qt library was created using this tool:

[image: Qt compatibility table, not reproduced]

Feel free to ask any usage questions in the comments below.

linuxbuild
0

If you use git, you should create a new branch and use a shell script to compare all header files that define the API between the branches. If you have not yet done so, you should use the pimpl (d-pointer) pattern for your API header files to make the library binary compatible and more stable across future versions. See the entry for d_pointer in the Qt developer wiki or the part about D-Pointers in the KDE TechBase.

hmuelner
-1

In addition to the option of using Doxygen to roll your own analysis tool, I would also suggest looking into the BSC Toolkit. This allows you to access the code/class browser information generated by MS compilers and is available for free. The toolkit provides programmatic access to all definitions, usage references, source and line numbers, parameters, access modifiers, etc. Names are provided in their mangled form, and facilities are included for translating them to human-readable format if necessary.

Here is some very basic output from a project I am working on:

IXConnection (struct_name)
IXConnection::STATE (enum_name)
IXConnection::setState(enum STATE) (mem_func public)
IXConnection::setAccount(struct IXAccount *) (mem_func public)
IXConnection::setDisplayName(class String *) (mem_func public)
IXConnection::setProtocolData(void *) (mem_func public)
IXConnection::getState(enum STATE *) (mem_func public)
IXConnection::getAccount(struct IXAccount * *) (mem_func public)
IXConnection::getProtocol(struct IXProtocol * *) (mem_func public)
IXConnection::getPassword(class String * *) (mem_func public)
IXConnection::getDisplayName(class String * *) (mem_func public)
IXConnection::getProtocolData(void * *) (mem_func public)
IXConnection::setProgress(class String *,int,int) (mem_func public)
IXConnection::notice(class String *) (mem_func public)
IXConnection::error(enum REASON,class String *) (mem_func public)
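A listing in this shape is easy to filter by access level before comparing builds. The Python sketch below assumes each line is a name followed by a parenthesised kind and an optional access modifier, exactly as printed above; that layout comes from this particular output, not from the toolkit itself. The function name `public_members` and the `protected` sample line are hypothetical.

```python
import re

# "<name> (<kind>)" or "<name> (<kind> <access>)", as in the listing above.
ENTRY = re.compile(r"^(?P<name>.+) \((?P<kind>\w+)(?: (?P<access>\w+))?\)$")


def public_members(listing):
    """Return the names of entries marked as public member functions."""
    names = set()
    for line in listing.splitlines():
        match = ENTRY.match(line.strip())
        if match and match["kind"] == "mem_func" and match["access"] == "public":
            names.add(match["name"])
    return names


# Two lines from the listing above plus one invented protected member.
SAMPLE = """\
IXConnection (struct_name)
IXConnection::setState(enum STATE) (mem_func public)
IXConnection::reset() (mem_func protected)
"""

if __name__ == "__main__":
    for name in sorted(public_members(SAMPLE)):
        print(name)
```

Comparing the resulting sets from two builds gives the added and removed public methods while leaving protected and private members out of the report.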
Captain Obvlious