Suppose there is an extensive C++ framework made up of multiple modules. I need to develop a tool that can scan all *.cpp, *.h, and *.hpp files in the framework, isolate the files relevant to one executable, and then remove from those files all source code that is not needed to compile that executable.

For example, the framework has modules for querying relational databases, document databases, key-value stores, column-oriented databases, and graph databases.

I want to isolate only the files relevant to relational databases, and then remove all code that the relational-database module does not use.

Say the framework has a file format_string.hpp. This file contains 73 functions that format strings in various ways. However, the relational-database module uses only 15 functions from that file, so I want to remove the remaining 58 functions from it.

Note: I often need to study C++ source code and translate it into other programming languages. This tool will help me understand the algorithms.

The deletion/purge is not permanent. It is only for learning purposes. This is about reducing the amount of source code for further manual inspection.

Question:

I have implemented the file-isolation part using C#.

However, I don't know how to implement the second part.

Do I need to use regex? Or do I need a lexer and a parser?

user366312

1 Answer

This is possible in C#, with some limitations, using reflection, but for C++ you basically need to write a compiler. When a compiler is run with optimizations enabled, it already strips unused code for you, but that happens in the very last step of the build (strictly speaking, during linking). Doing this by purely textual analysis is almost impossible, because there are too many ways code can be referenced without being mentioned literally (e.g. through macros or virtual functions).
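To make the macro and virtual-function point concrete, here is a minimal sketch. All names in it (`format_upper`, `format_quoted`, `FORMAT_WITH`, `Formatter`, `QuotedFormatter`, `make_formatter`, `render_column`) are hypothetical and only stand in for the kind of code a file like format_string.hpp might contain; they are not taken from any real framework.

```cpp
#include <memory>
#include <string>

// Hypothetical helpers standing in for two of the functions in format_string.hpp.
std::string format_upper(const std::string& s) {
    std::string out = s;
    for (char& c : out) {
        if (c >= 'a' && c <= 'z') c = static_cast<char>(c - 'a' + 'A');
    }
    return out;
}

std::string format_quoted(const std::string& s) { return "'" + s + "'"; }

// 1) Reference through a macro: the callee's name is assembled by token
//    pasting, so the text "format_upper" never appears at the call site.
#define FORMAT_WITH(kind, text) format_##kind(text)

// 2) Reference through virtual dispatch: callers only ever name the base
//    class, never the override that actually runs.
struct Formatter {
    virtual ~Formatter() = default;
    virtual std::string apply(const std::string& s) const = 0;
};

struct QuotedFormatter : Formatter {
    std::string apply(const std::string& s) const override {
        return format_quoted(s);
    }
};

// Hypothetical factory; in a real framework this might live in another file.
std::unique_ptr<Formatter> make_formatter() {
    return std::unique_ptr<Formatter>(new QuotedFormatter());
}

// Hypothetical "relational database" code path. Searching this function for
// "format_upper" or "QuotedFormatter" finds nothing, yet removing either
// one breaks the build or changes the behavior.
std::string render_column(const std::string& name) {
    const std::string upper = FORMAT_WITH(upper, name);
    return make_formatter()->apply(upper);
}
```

A regex-based tool that keeps only the functions whose names appear in the relational-database sources would delete `format_upper` and `QuotedFormatter::apply` here. Resolving such references requires at least the preprocessor and the type information that a real compiler front end produces.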

Fareanor
PMF