4

I'm programming in C on Linux. I have to read a file, find the functions defined in it, and print each function's name. So far, I identify functions by tracking the nesting depth of '{'. I know that __FUNCTION__ can be used to print the name of the current function. Is there a similar preprocessor facility for finding the function names in a file that my program reads? I'm not interested in existing tools; I want to program this myself. Kindly guide me. Thanks in advance.

I have tried to implement this with the code below. The function takes the line that comes before '{' as its argument.

void ffname(char line[100])
{
    int i,j,m,n,f=0;
    char dt[10],fname[28];
    char s[5][10]={"int","void","struct","char","float"};
    dt = strtok(line," ");
    for(i=0;i<5;i++)
    {
        m=strcmp(dt,s[i]);
        if(m==0)
        {
            f=1;
            n=strlen(dt);
        }
    }
    if(f)
    {
        for(i=n+2,j=0;i<strlen(line);i++,j++)
        {
            if(line[i] == '*')
                i++;
            while(line[i] != '(')
            {
                fname[j]=line[i];
            }  
        }
    }
}

I don't know whether this code is correct. Should I do it this way? Is there any other way to find the function name?

Dhasneem
  • The C preprocessor makes things considerably more complicated. You might want to call the preprocessor, and then examine the file produced after the preprocessor directives are removed. – ChuckCottrill Feb 12 '15 at 19:46
  • Coding standards can make this a much simpler task. merely require the braces for functions to be left justified, require indentation for everything else, and you have an easy way to detect functions -- the stuff that comes before the left-justified open-brace :-) Should you also require an empty line before the function type and signature, you can collect everything between function declaration and open-brace -- that is if you can assert coding standards. A pretty-printer can convert your existing code into standards conforming code. – ChuckCottrill Feb 12 '15 at 19:50

9 Answers

3

I assume that the file you are reading is a C source file.

This is not a trivial task if you want to do it properly (that is, if you want to recognize all functions reliably). See Listing C/C++ functions (Code analysis in Unix) for some additional information.

I'm not concerned about any specific tools. I want to get it programmed.

That is certainly possible, but you will basically end up with a scanner/parser frontend for C, similar to what is already implemented in tools like Doxygen or Synopsis. You can probably simplify it a bit and use some heuristics, for example you do not need to parse the complete code (e.g. you can skip anything between { and }).

If you still want to implement your own approach, I would follow these steps:

  • In any case, you should run your C file through a C preprocessor first to resolve any macros and to have the raw C code available.
  • Then get familiar with basic Compiler Construction techniques, especially Scanning and Parsing your source file, and the C grammar. Note that there are different grammars, depending on the C version you are using. ISO/IEC 9899:TC2, Annex A1 contains a grammar for C99, for example. Looking at the source code of the mentioned tools should also help.
  • Implement a scanner to tokenize your input, and implement a parser which recognizes function names. From the grammar I mentioned before, (6.9.1) function-definition is the production term you should start with.
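
As a rough illustration of the scanner idea in the steps above, here is a minimal sketch. It is a heuristic, not a parser for the (6.9.1) function-definition production. It assumes the input has already been preprocessed and contains no comments, and it will miss K&R-style definitions and misreport functions that return function pointers. The names find_function_names and IDENT_MAX are hypothetical, chosen only for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define IDENT_MAX 64   /* assumed upper bound on identifier length */

/* Hypothetical helper: collect the names of functions defined at file scope
 * in `src` (assumed: preprocessed C text without comments).
 * Heuristic: at brace depth 0, remember the identifier that precedes a '('
 * opened at parenthesis depth 0. If a '{' at depth 0 follows before a ';',
 * treat the remembered identifier as a function name. */
size_t find_function_names(const char *src, char names[][IDENT_MAX],
                           size_t max_names)
{
    size_t count = 0;
    int brace_depth = 0, paren_depth = 0;
    char last_ident[IDENT_MAX] = "";   /* most recent identifier token */
    char candidate[IDENT_MAX] = "";    /* identifier seen just before '(' */
    const char *p = src;

    while (*p != '\0') {
        if (*p == '"' || *p == '\'') {             /* skip a literal */
            char quote = *p++;
            while (*p != '\0' && *p != quote) {
                if (*p == '\\' && p[1] != '\0')
                    p++;                           /* skip escaped char */
                p++;
            }
            if (*p != '\0')
                p++;
        } else if (isalpha((unsigned char)*p) || *p == '_') {
            size_t len = 0;                        /* identifier token */
            while (isalnum((unsigned char)*p) || *p == '_') {
                if (len < IDENT_MAX - 1)
                    last_ident[len++] = *p;
                p++;
            }
            last_ident[len] = '\0';
        } else {
            if (*p == '(') {
                if (brace_depth == 0 && paren_depth == 0)
                    strcpy(candidate, last_ident);
                paren_depth++;
            } else if (*p == ')') {
                if (paren_depth > 0)
                    paren_depth--;
            } else if (*p == '{') {
                if (brace_depth == 0 && paren_depth == 0 &&
                    candidate[0] != '\0' && count < max_names)
                    strcpy(names[count++], candidate);
                brace_depth++;
                candidate[0] = '\0';
            } else if (*p == '}') {
                if (brace_depth > 0)
                    brace_depth--;
            } else if (*p == ';' && brace_depth == 0 && paren_depth == 0) {
                candidate[0] = '\0';   /* prototype or declaration */
            }
            p++;
        }
    }
    return count;
}
```

A caller would read the preprocessed file into one buffer, pass it in together with an array such as char names[32][IDENT_MAX], and print the first count entries. Everything between a top-level { and its matching } is skipped by the depth counter, which is the simplification mentioned earlier in this answer.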
albert
Andreas Fester
3

I used the following simple C code to find the function names.

#include <stdio.h>
#include <string.h>

#define SIZE 1024
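/* Print the function name found in `line`, the line preceding the opening '{'. */
void ffname(char *line)
{
    int i=1,j=0;
    char *dt; 
    char name[SIZE];
    strtok(line,"(");          /* cut the line at the first '(' */
    dt = strchr(line,' ');     /* the name is assumed to follow the first space */
    if(dt == NULL)
        return;                /* no space found: not a "type name(" line */
    if(dt[i] == '*')           /* skip a '*' of a pointer return type */
        i++;
    while(dt[i] != '\0')
    {
        name[j]=dt[i];
        i++;
        j++;
    }
    name[j] ='\0';
    printf("Function name is: %s\n", name);
}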

int main(int argc, char **argv)
{
    if(argc < 2)
    {
        printf("Give the filename \n");
        printf("Usage: %s filename\n", argv[0]);
        return -1;
    }
    int i, lines =0, funlines =0,count =0, fn =0, flag =0;
    char c[SIZE],b[SIZE] = "";   /* b holds the previous line */
    FILE *fd;
    fd = fopen(argv[1],"r");
    if(fd == NULL)
    {
        perror(argv[1]);
        return -1;
    }
    while(fgets(c,SIZE,fd))
    {   
        lines++;
        i=0;
        for(i=0;i<strlen(c);i++)
        {
            while( c[i] =='\t' || c[i] == ' ')
            {
                i++;
            }
            if( c[i] == '{')
            {
                count++;
                if(flag)
                {
                    funlines++;
                }
                if(count == 1)
                {
                    fn++;
                    printf("Function %d is Started..............\n", fn); 
                    flag = 1;
                    ffname(b);
                }
                break;
            }
            else if( c[i] == '}')
            {
                count--;
                if(!count)
                { 
                    flag = 0;
                    printf("No of lines in the function %d is: %d\n", fn, funlines);
                    printf("Function %d is finished..........\n", fn);
                    funlines = 0;
                }
                else
                {
                    funlines++;
                }
                break;
            }
            else if(flag)
            {
                funlines++;
                break;
            }
        }
        strcpy(b,c);
    }
    printf("Total no of functions: %d\n",fn);
    printf("Total no of lines: %d\n",lines);
    fclose(fd);
    return 0;
}
Jens
Dhasneem
1

This is very difficult to do correctly. Basically, you need to implement a C compiler to do it. This is exactly what a C compiler does, and it requires a proper grammar definition and a preprocessor.

xaxxon
1

It's difficult (not impossible, difficult) to write a parser for C, simply because C supports so many syntaxes.

You can define a function using

  1. Standard C style, with standard return types
  2. Standard C style, with typedef/enum etc. return types (which cannot easily be identified with a simple parser; you will need to build a database of the user-defined data types in a file)
  3. C macro (refer to Basile's answer for an example)
  4. Assembly (parse a very simple test.c through gcc -S to know the syntax) I have used this method to create some placeholder functions.

Hence, instead of parsing C file, you can more easily parse an assembly file.

E.g. gcc -S translates a C function definition as below:

    .globl  someFnName
    .type   someFnName, @function
someFnName:
    ...function-body related code...

If you ONLY want the list of function names (i.e. you do not need arguments, return types, etc.), the three lines above are much easier to parse in the assembly output than the corresponding definition in the C file.
If you also add the -g switch along with -S, you also get some line number information.
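
A minimal sketch of that secondary parser, assuming the ".type NAME, @function" form shown above (this is what GCC emits for ELF targets such as x86 Linux; other targets may spell the directive differently, for example with %function). The helper name asm_function_name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if `line` looks like "\t.type\tNAME, @function",
 * copy NAME into `name` and return 1; otherwise return 0.
 * Assumption: symbol names are shorter than 128 characters. */
int asm_function_name(const char *line, char name[128])
{
    char kind[32];

    /* " .type" skips leading blanks; the scanset stops the name at the
     * comma or at whitespace; "@%31s" reads the symbol kind. */
    if (sscanf(line, " .type %127[^, \t\n] , @%31s", name, kind) != 2)
        return 0;
    return strcmp(kind, "function") == 0;
}
```

To use it, generate the assembly with gcc -S test.c, read test.s line by line with fgets, and print name whenever the helper returns 1. Checking for a matching ".globl NAME" line in the same way would separate static functions from global ones.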

Advantages:

  1. Easier to parse than C file
  2. Takes care of most (if not all) methods to define a function.
  3. Based on ".globl someFnName" line present or not, you can isolate static functions.

Disadvantage:

  1. Requires external parser - gcc or some other
  2. compiler (gcc) dependent secondary parser required
  3. May give some false positives
Community
anishsane
1

I think flex and bison will help you solve your problem. Here are some links: C grammar (lex), C grammar (bison).

MYMNeo
  • Building a lexical analyzer and parser is the best way, but requires a fair bit of knowledge. More details inserted here to explain how -- most specifically grammar annotations to emit the function name, return type, and signature, would be useful. – ChuckCottrill Feb 12 '15 at 19:54
1

A simple way, if you are willing to make some assumptions: read in the source code, then:

  • Remove any preprocessor directives (assuming you do not want functions from include files, and do not want to handle any wonky #define macros possibly related to functions; be careful about multi-line #defines continued with \ at the end of a line).

  • Remove any comments (be careful about /* appearing inside a comment: C comments do not nest, so the first */ ends the comment).

  • Convert any strings to "" (be careful about escaped \" and multi-line strings).

  • Convert any chars to ' ' or something (to get rid of '{' etc, be careful about escaped \' and also other escapes).

  • Convert all (nested, multiline) code blocks to "top level" {} pair.

  • Reformat the text to have line breaks only after ; and }, except join a lone ; in a line to previous line, in case it's actually part of }; which are not function definitions.

  • Remove any lines which end in ;

Unless I missed something, now you should be left with all the function definitions, one per line, with function body replaced with {}.
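
The comment, string and character steps can be sketched as one small state machine. This is only an illustration under stated assumptions: preprocessor directives are already removed, trigraphs and backslash-newline continuations outside literals are ignored, and instead of shrinking strings to "" it overwrites their contents with spaces so that offsets and line numbers stay the same. The name blank_comments_and_literals is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: overwrite comments and the contents of string and
 * character literals in `s` with spaces, in place. Newlines are kept. */
void blank_comments_and_literals(char *s)
{
    enum { CODE, BLOCK_COMMENT, LINE_COMMENT, STRING, CHARLIT } state = CODE;

    for (size_t i = 0; s[i] != '\0'; i++) {
        char c = s[i];
        switch (state) {
        case CODE:
            if (c == '/' && s[i + 1] == '*') {
                state = BLOCK_COMMENT;
                s[i] = s[i + 1] = ' ';
                i++;
            } else if (c == '/' && s[i + 1] == '/') {
                state = LINE_COMMENT;
                s[i] = s[i + 1] = ' ';
                i++;
            } else if (c == '"') {
                state = STRING;          /* keep the opening quote */
            } else if (c == '\'') {
                state = CHARLIT;
            }
            break;
        case BLOCK_COMMENT:
            if (c == '*' && s[i + 1] == '/') {   /* comments do not nest */
                state = CODE;
                s[i] = s[i + 1] = ' ';
                i++;
            } else if (c != '\n') {
                s[i] = ' ';
            }
            break;
        case LINE_COMMENT:
            if (c == '\n')
                state = CODE;
            else
                s[i] = ' ';
            break;
        case STRING:
        case CHARLIT:
            if (c == '\\' && s[i + 1] != '\0') {
                s[i] = ' ';              /* blank the escape sequence */
                if (s[i + 1] != '\n')
                    s[i + 1] = ' ';
                i++;
            } else if (c == (state == STRING ? '"' : '\'')) {
                state = CODE;            /* keep the closing quote */
            } else if (c != '\n') {
                s[i] = ' ';
            }
            break;
        }
    }
}
```

After this pass, every remaining '{' and '}' in the buffer belongs to real code, so the later steps (collapsing blocks and dropping lines that end in ;) can rely on plain brace counting.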

hyde
0

I think you can try a regular expression to find whether the target function name exists.

You can find more about regular expressions in this post: Regular expressions in C: examples?
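
For illustration only, a sketch using the POSIX regcomp/regexec functions from <regex.h>. It is a line-based heuristic: it assumes the whole function header sits on one line that starts in column 0, it cannot match nested braces, and it will misfire on some inputs (for example an "else if (...)" written in column 0). The name match_function_header and the pattern itself are assumptions made for this example.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 and copy the identifier into `name` if
 * `line` looks like a function header such as "static char *f(int x) {". */
int match_function_header(const char *line, char *name, size_t size)
{
    /* type words, optional '*', the name (group 1), a parenthesised
     * parameter list without ';', '{' or '}', then an optional '{'. */
    static const char *pattern =
        "^[A-Za-z_][A-Za-z0-9_ \t]*[ \t*]+([A-Za-z_][A-Za-z0-9_]*)"
        "[ \t]*[(][^;{}]*[)][ \t]*[{]?[ \t\r\n]*$";
    regex_t re;
    regmatch_t m[2];
    int found = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;
    if (regexec(&re, line, 2, m, 0) == 0) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len < size) {
            memcpy(name, line + m[1].rm_so, len);
            name[len] = '\0';
            found = 1;
        }
    }
    regfree(&re);
    return found;
}
```

Prototypes are rejected because of the trailing ';', and indented statements are rejected because the pattern must start with a letter in column 0. Compiling the pattern once outside the per-line loop would be the obvious improvement for real use.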

Community
Yuanhang Guo
  • 1
    No, you cannot use regular expressions to find functions in an arbitrary c file. Regex's aren't good at finding matching open/close sigils. You'll need a grammar of some sort.. like a c compiler. – xaxxon Dec 17 '12 at 06:37
0

What kind of file do you read? Is it some arbitrary C source file? If it is, it could define functions in many different ways, e.g. by preprocessor macros. For example with

#define DF(Nam) void Nam##print(void) {puts(#Nam);}

a C file could have DF(foo) and have defined the function fooprint (without any occurrence of fooprint in the source code).
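
To make this concrete, here is a small sketch built around the macro above. The second use, DF(bar), and the caller call_generated are hypothetical additions for illustration.

```c
#include <stdio.h>

/* The macro from above: DF(x) defines a function named xprint. */
#define DF(Nam) void Nam##print(void) {puts(#Nam);}

DF(foo)   /* expands to: void fooprint(void) {puts("foo");} */
DF(bar)   /* expands to: void barprint(void) {puts("bar");} */

/* Hypothetical caller: both functions exist, although neither name
 * appears literally in a definition in this file. */
void call_generated(void)
{
    fooprint();
    barprint();
}
```

A tool that scans the unpreprocessed text for "type name(" would find call_generated here but not fooprint or barprint; running the file through the preprocessor first (gcc -E) exposes the generated definitions.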

If you want to handle the set of functions names as seen by the compiler, better develop a compiler extension or plugin. With GCC, you could use MELT (a domain specific language to extend GCC) for that purpose.

If you want to find the [global] functions defined by some object file *.o, you could use the nm command on Linux. Perhaps also consider dlopen(3)-ing a shared object file *.so

Of course, all this may be compiler and system specific.

Basile Starynkevitch
  • In my case, I will be reading any kind of C files, which may or may not be defined as preprocessor macros. Is there any pseudo code for finding the function names? – Dhasneem Dec 17 '12 at 06:32
  • No, because preprocessor tricks can do weird things, as I show. If you want to read any C file, you'd better extend the compiler that processes it. – Basile Starynkevitch Dec 17 '12 at 06:33
0

If you can make use of gcc:

gcc -nostdinc -aux-info output demo.c

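This outputs only the functions declared in the file itself (excluding the standard library headers).

NOTE: -nostdinc causes a compile error if the file includes standard headers.

You can avoid the compile error by dropping -nostdinc and filtering the output with sed instead: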

gcc -aux-info output demo.c
sed '/include/d' output
David Ranieri