
Hope you all are doing well. I need a simple solution to my problem. I want to write Python code that takes C code as a string. A regex like r"#([^}]+)>|#([^}]+)\.h" should then detect the header part (from #include to '>' or '.h'). I also want to extract both the group 1 and group 2 parts of the regex. The Python code should then extract that header part, remove it from the C code, and save it in one variable. The remaining code should be saved in another variable.

For example str1=

#include <iostream>
#include <string>
#include conio.h

and str2=

void main (void)
{
int b = 32;
int a=34;
int wao= 35;
}

The input source code is:

#include <iostream>
#include <string>
#include conio.h

void main (void)
{
int b = 32;
int a=34;
int wao= 35;
}
Wiktor Stribiżew

1 Answer


First of all, every #include directive in C sits on a single line (check this), so you can use #(.*) instead (I've found it works better for this) and validate the matches later. If you use the regex from your post, you will run into trouble in the next step (try it yourself).
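To make the "trouble" concrete, here is a small sketch that runs both patterns on the sample input from the question. The variable names (`question_pattern`, `merged`, `per_line`) are made up for this illustration. Because `[^}]+` also matches newlines and is greedy, the pattern from the question swallows the first two includes in a single match, while `#(.*)` stops at the end of each line.

```python
import re

# Sample input copied from the question (it starts with a newline).
src_code = """
#include <iostream>
#include <string>
#include conio.h

void main (void)
{
int b = 32;
int a=34;
int wao= 35;
}
"""

# Pattern from the question: [^}]+ happily runs across line breaks.
question_pattern = r"#([^}]+)>|#([^}]+)\.h"
merged = [m.group() for m in re.finditer(question_pattern, src_code)]

# Line-based pattern: "." does not match a newline by default.
per_line = [m.group() for m in re.finditer(r"#(.*)", src_code)]

print(len(merged), len(per_line))  # prints: 2 3
print(repr(merged[0]))             # the first match spans two lines
```

With the merged match, groups 1 and 2 no longer correspond to individual headers, which is why the per-line pattern is easier to work with.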

If src_code is your input source code, then:

import re
headers = list(re.finditer(r"#(.*)", src_code)) # Extract all the header matches

This gives:

[<re.Match object; span=(1, 20), match='#include <iostream>'>,
 <re.Match object; span=(21, 38), match='#include <string>'>,
 <re.Match object; span=(39, 55), match='#include conio.h'>]

You can now collect all the header strings in a list (and check whether they are valid, if you want):

headers = [i.group() for i in headers] # get the match from above
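The optional validation step could look roughly like the sketch below. It assumes that "valid" means `#include <name>` or `#include "name"`; the helper `is_valid_include` and the pattern `INCLUDE_RE` are hypothetical names introduced here, not part of any library.

```python
import re

# Assumption: a valid header is '#include <name>' or '#include "name"'.
INCLUDE_RE = re.compile(r'\s*#\s*include\s*(<[^<>\n]+>|"[^"\n]+")\s*')

def is_valid_include(line):
    """Return True if the whole line looks like a well-formed #include."""
    return INCLUDE_RE.fullmatch(line) is not None

headers = ['#include <iostream>', '#include <string>', '#include conio.h']

valid = [h for h in headers if is_valid_include(h)]
invalid = [h for h in headers if not is_valid_include(h)]

print(valid)
print(invalid)
```

Under this assumption, `#include conio.h` from the question is reported as invalid because the file name has neither angle brackets nor quotes.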

You can then remove all the #include lines from the source code using re.sub:

src_code = re.sub("#(.*)(\n+)", "", src_code) # Also remove any `\n` coming after

And there you have it

  • src_code
void main (void)
{
int b = 32;
int a=34;
int wao= 35;
}
  • for headers you can use "\n".join(headers), but note that includes may also be declared inside scopes, i.e. within braces {} (functions, structs, etc.)
['#include <iostream>', '#include <string>', '#include conio.h']
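Putting the steps together, one possible sketch is a single function that returns both strings the question asks for. Note that this is a variation rather than the exact pattern above: it matches only lines starting with `#include` (using `re.MULTILINE`), so other directives such as `#define` stay in the body. The function name `split_headers` is made up for this example.

```python
import re

def split_headers(source):
    """Return (headers, body): the #include lines and the remaining code."""
    include_line = re.compile(r"^[ \t]*#[ \t]*include\b.*$", re.MULTILINE)
    headers = [m.group().strip() for m in include_line.finditer(source)]
    # Remove each include line together with its trailing line break.
    body = re.sub(r"^[ \t]*#[ \t]*include\b.*\n?", "", source,
                  flags=re.MULTILINE)
    return "\n".join(headers), body.strip("\n")

src_code = """
#include <iostream>
#include <string>
#include conio.h

void main (void)
{
int b = 32;
int a=34;
int wao= 35;
}
"""

str1, str2 = split_headers(src_code)
print(str1)
print(str2)
```

Here `str1` holds the three include lines joined by newlines and `str2` holds the rest of the code, matching the variables named in the question.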
Countour-Integral