I have a version file I need to parse to get certain versions in C99. For example purposes, say one of the strings looks like this:

FILE: EXAMPLE ABC123459876-001 REV 1.IMG

The "12345" part can be any arbitrary run of digits, but it is always followed by 4 digits, a hyphen, a revision, and an extension. I just want to return the middle of this string, that is, the file name plus the main version: "EXAMPLE 9876-001 REV 1". I got it to work in the online regex101 tester with something like:

"(?<=EXAMPLE ABC.....)(....-... REV .)(?=.IMG)"

... but C99 regex does not support positive lookahead / lookbehind operators so this does not work for me. Should I be using strstr() or strtok() instead? Just looking for some ideas as to the best way to be doing this in C, thanks.

Jack
  • The number could be shorter or longer, but I always want to pull its last 4 digits plus everything up to the ".", plus the file name itself ("EXAMPLE"). Trying to do this in a semi-clean manner. – Jack Nov 14 '17 at 23:48

3 Answers

1

Do you really need regex for this? Could you not just split this string into substrings and work with that?

  1. Remove the extension by finding the dot with strchr
  2. Take the file name as a substring
  3. Use a regex to get the rest with ([0-9]{4}.*$)
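
The three steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: `extract_version` is a hypothetical helper name, the line is assumed to start with a literal "FILE: " prefix as in the question, the file name is assumed to end at the first space, and the POSIX `<regex.h>` API is assumed to be available. The sketch anchors the pattern on the hyphen (`[0-9]{4}-`) because POSIX matching is leftmost, so a bare `[0-9]{4}` would start at the first digit of the number rather than at its last four.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turns "FILE: EXAMPLE ABC123459876-001 REV 1.IMG"
 * into "EXAMPLE 9876-001 REV 1". Returns 0 on success, -1 on failure. */
int extract_version(const char *line, char *out, size_t out_size)
{
    char work[256];            /* scratch copy we are allowed to modify */
    regex_t re;
    regmatch_t m[1];
    int rc = -1;

    /* Skip the assumed "FILE: " prefix if it is present. */
    if (strncmp(line, "FILE: ", 6) == 0)
        line += 6;
    if (strlen(line) >= sizeof work)
        return -1;
    strcpy(work, line);

    /* Step 1: cut the extension off at the first dot. */
    char *dot = strchr(work, '.');
    if (dot == NULL)
        return -1;
    *dot = '\0';

    /* Step 2: the file name is everything up to the first space. */
    char *space = strchr(work, ' ');
    if (space == NULL)
        return -1;
    *space = '\0';
    const char *rest = space + 1;

    /* Step 3: find "4 digits + hyphen + everything after" in the rest. */
    if (regcomp(&re, "[0-9]{4}-.*$", REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, rest, 1, m, 0) == 0) {
        int n = snprintf(out, out_size, "%s %s", work, rest + m[0].rm_so);
        if (n >= 0 && (size_t)n < out_size)
            rc = 0;
    }
    regfree(&re);
    return rc;
}
```

Calling `extract_version(line, buf, sizeof buf)` on the example string from the question should leave "EXAMPLE 9876-001 REV 1" in `buf`.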
deiga
  • If that's the best way to go about it, I'll do that. Just looking for suggestions. Obviously string parsing in C is not the prettiest so was looking at a simple regex instead but doesn't look like it'll work with this old regex lib, especially just wanting to extract part of a match. – Jack Nov 14 '17 at 23:51
  • This seems to be the cleanest solution that gets it done in the easiest to read code that doesn't rely on a multiple group regex or memmoves / sscans. Thanks. – Jack Nov 15 '17 at 13:26
1

So you want everything except the "FILE: " prefix and the file extension? Since the prefix looks static, this regex should work:

FILE: ([^.]*)\..*

You can then get that group from the match array filled in by regexec.
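
As a rough sketch of reading capture groups with POSIX `regcomp`/`regexec`, here is a two-group variant that also drops the "ABC12345" part. The helper name `extract_with_groups` and the pattern are assumptions modeled on the question's example string (an uppercase file name, the literal "ABC", and a literal "FILE: " prefix), not something taken from the answer itself.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: joins capture groups 1 and 2 with a space.
 * Returns 0 on success, -1 if the line does not match or out is too small. */
int extract_with_groups(const char *line, char *out, size_t out_size)
{
    /* Group 1: file name. Group 2: last four digits, the hyphen, and
     * everything up to (not including) the dot. Assumed layout only. */
    static const char *pattern =
        "FILE: ([A-Z]+) ABC[0-9]*([0-9]{4}-[^.]*)\\.";
    regex_t re;
    regmatch_t m[3];           /* m[0] = whole match, m[1], m[2] = groups */
    int rc = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, line, 3, m, 0) == 0) {
        /* rm_so / rm_eo are byte offsets into line for each group. */
        int n = snprintf(out, out_size, "%.*s %.*s",
                         (int)(m[1].rm_eo - m[1].rm_so), line + m[1].rm_so,
                         (int)(m[2].rm_eo - m[2].rm_so), line + m[2].rm_so);
        if (n >= 0 && (size_t)n < out_size)
            rc = 0;
    }
    regfree(&re);
    return rc;
}
```

The extra code compared with using the full match is just the offset arithmetic on `rm_so` and `rm_eo`, which avoids needing lookahead or lookbehind at all.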

SourceOverflow
  • Looks like this would be part of group 1 and not the full match. I guess that works but just requires additional code to grab that group. – Jack Nov 14 '17 at 23:56
  • Yes, the part that you want would then be group 1 – SourceOverflow Nov 15 '17 at 00:03
  • Seems this would return the entire middle of the string, that's not what I'm looking for. This returns "EXAMPLE ABC123459876-001 REV 1". I'm looking for something that would return "EXAMPLE 9876-001 REV 1". – Jack Nov 15 '17 at 00:40
  • Oh I'm sorry, I didn't notice the missing ABC. You could still make two groups. – SourceOverflow Nov 15 '17 at 08:01
0

The simplest way would probably be to use sscanf, but it does risk a buffer overflow (make sure your buffers are longer than the maximum file path length on the system and you should be fine).

Try something like this (code not tested):

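/* Needs <stdio.h> and <string.h>; MAX_PATH_LEN is assumed to be defined elsewhere. */
int ret;
size_t prefix_len;
const char *last4;
char sequence_num_prefix[ MAX_PATH_LEN + 1 ] = {0};
char sequence_num_postfix[ MAX_PATH_LEN + 1 ] = {0};
char version_num[ MAX_PATH_LEN + 1 ] = {0};
char my_name[ MAX_PATH_LEN + 1 ] = {0};

/* Assumes input_path_buf already points at "EXAMPLE ..." (past any "FILE: " prefix). */
ret = sscanf( input_path_buf, "EXAMPLE ABC%[0-9]-%[0-9] REV %[0-9]",
              sequence_num_prefix, sequence_num_postfix, version_num );

if( ret != 3 )
{
    // error: the input did not match the expected layout
}

/* %[0-9] reads the whole digit run before the hyphen, so keep only its last 4 digits. */
prefix_len = strlen( sequence_num_prefix );
last4 = sequence_num_prefix + ( prefix_len > 4 ? prefix_len - 4 : 0 );

snprintf( my_name, sizeof( my_name ), "EXAMPLE %s-%s REV %s",
          last4, sequence_num_postfix, version_num );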

Of course, a safer way would be to walk the string by hand with while loops or, for cleanliness, to use a parser generator such as Bison.
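
For the while-loop route, a hand-written scan might look something like the sketch below. It is untested against real version files and rests on assumptions taken from the question's example: an optional "FILE: " prefix, a file name that ends at the first space, a digit run of at least four digits immediately before the hyphen, and a dot before the extension. `extract_by_scanning` is a hypothetical name.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scans the line with plain loops, no regex or sscanf.
 * Returns 0 on success, -1 if the layout is not what we assumed. */
int extract_by_scanning(const char *line, char *out, size_t out_size)
{
    const char *name, *name_end, *digits, *digits_end, *dot;

    /* Skip the assumed "FILE: " prefix if it is present. */
    if (strncmp(line, "FILE: ", 6) == 0)
        line += 6;

    /* The file name runs up to the first space. */
    name = line;
    name_end = name;
    while (*name_end != '\0' && *name_end != ' ')
        name_end++;
    if (*name_end == '\0' || name_end == name)
        return -1;

    /* Advance to the first digit after the name (skips " ABC"). */
    digits = name_end;
    while (*digits != '\0' && !isdigit((unsigned char)*digits))
        digits++;

    /* Walk to the end of the digit run; it must end at a hyphen. */
    digits_end = digits;
    while (isdigit((unsigned char)*digits_end))
        digits_end++;
    if (digits_end - digits < 4 || *digits_end != '-')
        return -1;

    /* Everything up to the dot belongs to the version. */
    dot = digits_end;
    while (*dot != '\0' && *dot != '.')
        dot++;
    if (*dot == '\0')
        return -1;

    /* Output: name, a space, then the last 4 digits through to the dot. */
    int n = snprintf(out, out_size, "%.*s %.*s",
                     (int)(name_end - name), name,
                     (int)(dot - (digits_end - 4)), digits_end - 4);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Every copy here goes through a length-limited snprintf, which is what makes this variant safer than unbounded `%[0-9]` conversions.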

Nathan Owen