Using this solution with dirent.h, I'm trying to iterate over specific files in the current folder (those that have a .wav extension and begin with 3 digits) with the following code:

(Important note: I use MSVC++ 2010, so it seems that I cannot use #include <regex>, and I cannot use this either because there is no C++11 support.)

DIR *dir;
struct dirent *ent;
if ((dir = opendir (".")) != NULL) {
  while ((ent = readdir (dir)) != NULL) {
    printf ("%s\n", ent->d_name);
    //if substr(ent->d_name, 0, 3) ... // what to do here to
                                      //  check if those 3 first char are digits?
    // int i = substr(ent->d_name, 0, 3).strtoi();        //  error here! how to parse
                                                         // the 3 first char (digits) as int?

    // if substr(ent->d_name, strlen(ent->d_name)-3) != "wav" // ...

  }
  closedir (dir);
} else {
  perror ("");
  return EXIT_FAILURE;
}

How can I perform these tests with MSVC++ 2010, in which C++11 support is not fully present?

Basj
  • what is `import `? and is this c++ because it looks like c to me. – Iharob Al Asimi Mar 01 '15 at 00:25
  • Sorry @iharob, I corrected, I meant : `#include ` (I wrote import probably because of python...) – Basj Mar 01 '15 at 00:28
  • If you are good at python then, why don't you use python? And also, a c solution would be very different from a c++ solution, and you don't need a regular expression for that it's like killing a fly with a bazooka. – Iharob Al Asimi Mar 01 '15 at 00:30
  • @iharob I have to do it on c/c++ and not python because I use a SDK only available in c/c++ and moreover I want a very small output executable... C or C++ solution would be fine for me, as long as it works with "old" MSVC++2010 – Basj Mar 01 '15 at 00:32
  • @iharob, why do you think this is not c++ like if you cannot help with c++. C++ was designed to be compatible and to allow C code to be compiled as c++ code. Your comments are clearly off-topic here. – Luis Colorado Mar 03 '15 at 12:03
  • @LuisColorado c++ has the stl and most c++ programmers would suggest stl based solutions or some kind of external library like boost, c solutions are simple and beautiful. – Iharob Al Asimi Mar 03 '15 at 12:42
  • c++ also has solutions that are simple and beautiful. Not knowing a language doesn't make it unattractive. Not using stl doesn't mean not using c++. – Luis Colorado Mar 04 '15 at 06:20

2 Answers

Strictly speaking, your code would not check for a .wav extension, only that the filename ends with those 3 letters...

There is no such function as substr in the C library to extract a slice from a string.

You should check that the filename length is at least 7: strlen(ent->d_name) >= 7. Then check that the first 3 characters are digits and the fourth is not, using the isdigit function from <ctype.h>. Finally, compare the last 4 characters of the filename to ".wav" using strcmp or, better, strcasecmp. The latter may be called _stricmp in the Microsoft world. If neither of these is available, use tolower to compare the last 3 characters to 'w', 'a' and 'v'.
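The strict test described above could be sketched as follows. This is only an illustration, not code from the original answer: the helper name match_3digit_wav is made up, and it assumes plain ASCII filenames. It sticks to C89 so that it should compile with MSVC++ 2010.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name invented for this sketch).
   Returns 1 if name starts with exactly three digits and ends in ".wav"
   (any letter case), 0 otherwise. On success the three-digit value is
   stored in *number. */
int match_3digit_wav(const char *name, int *number)
{
    size_t length = strlen(name);
    const unsigned char *p = (const unsigned char *)name;

    if (length < 7)                 /* three digits plus ".wav" is 7 chars */
        return 0;
    /* first three characters must be digits, the fourth must not be */
    if (!isdigit(p[0]) || !isdigit(p[1]) || !isdigit(p[2]) || isdigit(p[3]))
        return 0;
    /* case-insensitive comparison of the last four characters */
    if (p[length - 4] != '.' ||
        tolower(p[length - 3]) != 'w' ||
        tolower(p[length - 2]) != 'a' ||
        tolower(p[length - 1]) != 'v')
        return 0;

    *number = (p[0] - '0') * 100 + (p[1] - '0') * 10 + (p[2] - '0');
    return 1;
}
```

Inside the readdir loop you would call it as match_3digit_wav(ent->d_name, &num). The characters are read through an unsigned char pointer because passing a negative char value to isdigit or tolower is undefined behavior.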

Here is an implementation of the relaxed requirement (any number of digits):

#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

...

DIR *dir;
struct dirent *ent;
if ((dir = opendir(".")) != NULL) {
    while ((ent = readdir(dir)) != NULL) {
        char *name = ent->d_name;
        size_t length = strlen(name);
        if (length >= 5 &&
            isdigit((unsigned char)name[0]) &&
            name[length-4] == '.' &&
            tolower((unsigned char)name[length-3]) == 'w' &&
            tolower((unsigned char)name[length-2]) == 'a' &&
            tolower((unsigned char)name[length-1]) == 'v') {
               int num = atoi(name);
               printf("%s -> %d\n", name, num);
               /* do your thing */
        }
    }
    closedir(dir);
} else {
    perror ("");
    return EXIT_FAILURE;
}
chqrlie
  • Nice solution! A last thing that was needed: how to extract the number (the 3 digits) as an `int` ? – Basj Mar 01 '15 at 01:16
  • `strtol(name, NULL, 10)` or simply `atoi(name)` will do. I'll edit the answer. – chqrlie Mar 01 '15 at 01:21
  • Just to understand fully how it works: how would you use your method, @chqrlie, to test for 1) `any number (1 digit or 2 digits or 3 or 4 or ...)` + 2) `anystring empty or not` + 3) `.wav`? [Example: now `123blah.wav`, `123235762blah.wav` or `1blah.wav` should be accepted. It's just to understand how it would work with your method.] – Basj Mar 01 '15 at 02:02
  • You modify the code above to check for a minimum length of `5` and only test the first character with `isdigit`. Response edited. – chqrlie Mar 01 '15 at 02:06
  • If you do not care about case, you could further simplify using `memcmp(name + length - 4, ".wav", 4)` but be aware that filesystems are usually case insensitive in Windows. – chqrlie Mar 01 '15 at 02:09
  • Thanks! I would like to accept `.wav` but also `.WAV` but also `.Wav` and `.wAV`. Should I use `memcmp(name + length - 4, ".wav", 4)` ? – Basj Mar 01 '15 at 10:10
  • So you do want case insensitivity! Keep the version with `tolower` – chqrlie Mar 01 '15 at 10:15
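
As a side note to the `strtol` suggestion in the comments above, here is a minimal sketch of extracting the leading number while also learning how many digits it has. The helper name leading_number is hypothetical. The explicit isdigit test is there because strtol on its own skips leading white space and accepts a sign.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper (name invented for this sketch).
   Parses the run of decimal digits at the start of name.
   Returns the number of digits consumed, or 0 if name does not start
   with a digit. The parsed value is stored in *value. */
size_t leading_number(const char *name, long *value)
{
    char *end;

    /* strtol would accept " 12" or "-12", so insist on a digit first */
    if (!isdigit((unsigned char)name[0]))
        return 0;
    *value = strtol(name, &end, 10);
    return (size_t)(end - name);
}
```

A result of exactly 3 corresponds to the original "three digits" rule, and any nonzero result corresponds to the relaxed rule. Very long digit runs overflow long, in which case strtol returns LONG_MAX, so the value is only meaningful for reasonably short numbers.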

OK, here is a solution:

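#include <stdio.h>
#include <string.h>
#include <strings.h> /* strncasecmp() on POSIX; omit on Windows, where _strnicmp() is in <string.h> */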

int
main(void)
{
    const char *strings[] = {"123.wav", "1234 not-good.wav", "456.wav", "789.wav", "12 fail.wav"};
    size_t      i;
    int         number;

    for (i = 0 ; i < sizeof(strings) / sizeof(*strings) ; ++i)
    {
        size_t      length;
        int         count;
        const char *pointer;

        if ((sscanf(strings[i], "%d%n", &number, &count) != 1) || (count != 3))
            continue;
        length = strlen(strings[i]);
        if (length < 4)
            continue;
        pointer = strings[i] + length - 4;
        if (strncasecmp(pointer, ".wav", 4) != 0)
            continue;
        printf("%s matches and number is %d\n", strings[i], number);
    }
    return 0;
}

As you can see, sscanf() checks whether there is an integer at the beginning of the string (skipping any leading white space) and, through %n, records the number of characters scanned so far. If that count equals 3, the code proceeds to check the extension.

If you are using Windows replace strncasecmp() with _strnicmp().
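
If you would rather not depend on either strncasecmp() or _strnicmp(), a small portable replacement is possible. The following is only a sketch with a made-up name (ends_with_nocase), and it handles ASCII letters only:

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (name invented for this sketch).
   Returns 1 if name ends with suffix, ignoring ASCII letter case,
   and 0 otherwise. */
int ends_with_nocase(const char *name, const char *suffix)
{
    size_t n = strlen(name);
    size_t m = strlen(suffix);
    size_t i;

    if (n < m)
        return 0;
    name += n - m;                  /* point at the last m characters */
    for (i = 0; i < m; i++) {
        if (tolower((unsigned char)name[i]) !=
            tolower((unsigned char)suffix[i]))
            return 0;
    }
    return 1;
}
```

With this helper, the extension test in the loop above would become ends_with_nocase(strings[i], ".wav"), with no separate length check needed.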

Iharob Al Asimi
  • Wow, that's great! Last thing: I'd like to accept `123 bonjour.wav` (because it is a .wav that begins with 3 digits) but not `12blah.wav` (2 digits only) nor `1234hey.wav` (4 digits)... Do you know how to do that? – Basj Mar 01 '15 at 00:40
  • `sscanf` is inappropriate for this. I am venturing a conjecture: *where there is a `scanf`, there is a bug* – chqrlie Mar 01 '15 at 00:48
  • @chqrlie you sure can misuse it. – Iharob Al Asimi Mar 01 '15 at 00:51
  • From my experience, most programmers misuse it. @Basj: what did you try and how does it *"work"*? – chqrlie Mar 01 '15 at 00:53
  • Why not simply `strtol(strings[i], &endptr, 10); if (endptr != strings[i] + 3) ...` – chqrlie Mar 01 '15 at 00:57
  • Just as a conclusion about `scanf`, how to test if it has the form : `number(any string, could be empty).wav` ? Is `%d%*.wav` correct ? (if I forget about the number of digit condition) – Basj Mar 01 '15 at 01:06
  • `sscanf` just does not do the job. `sscanf` is full of quirks and ends up being misused, especially by beginners who forget to check the return value, do not specify array sizes, use the wrong types, forget the ampersands, are confused about spaces and linefeeds... – chqrlie Mar 01 '15 at 01:13
  • Sorry, the specification did not say filenames couldn't have multiple extensions. `123.a.b.c.wav` should match the criteria. – chqrlie Mar 01 '15 at 01:15
  • @Basj: show how you use `sscanf`. It does not work the way you think nor does this format string make any sense. – chqrlie Mar 01 '15 at 01:19
  • @chqrlie what is a multiple extension? – Iharob Al Asimi Mar 01 '15 at 01:20
  • @Basj it matches because it always matches the integer in the first part, use the other approach I posted. – Iharob Al Asimi Mar 01 '15 at 01:21
  • @chqrlie Here it is : https://github.com/josephernest/EasyVolcaSample/blob/master/EasyVolcaSample.c#L386 As it seemed complicated, I removed the number of digits condition, and only tested : `number+anystring+.wav` – Basj Mar 01 '15 at 01:22
  • @Basj I was thinking about this, and there is a much better way, check the updated answer. – Iharob Al Asimi Mar 01 '15 at 01:29
  • @Basj: your format string is bogus. It does not check for .wav extension, `%*.wav` is a syntax error. All it does is check for an initial non empty string of digits. `123` will match, and so will `0.bak`. Even if you bend the rules, `sscanf` cannot do the job. – chqrlie Mar 01 '15 at 01:30
  • @chqrlie I think that this `scanf()` method that I posted is robust and simple, I just didn't remember the `"%n"` specifier when suddenly it came to my mind. – Iharob Al Asimi Mar 01 '15 at 01:34
  • It works but is less efficient than `number = strtol(strings[i], &pointer, 10); if (pointer != strings[i] + 3) continue;`. Encouraging the use of `sscanf` is not good advice. Furthermore, the `%n` specifier is considered risky and disabled in `sscanf_s`. – chqrlie Mar 01 '15 at 01:37
  • @chqrlie ok I'll modify with your answer (`strtol` etc.), but just for my pure knowledge of `scanf` / `printf`, what would be a correct *format string* for any `"nonempty number"+"anystring empty or not"+".wav"` ... How to correct the `%d%*.wav` ? – Basj Mar 01 '15 at 01:39
  • @Basj when you asked I was thinking of it and realized that there is no proper format string for `scanf()` to work because it's not for regular expression matching, so there is none. – Iharob Al Asimi Mar 01 '15 at 01:41
  • I cannot think of a format string for `sscanf` that would achieve this. `"%d%*[^.].wav"` would definitely not work: it would not even match the .wav extension. `sscanf` is useless for most careful parsing tasks and highly error prone. – chqrlie Mar 01 '15 at 01:43
  • @chqrlie I can see that you have some special feelings towards `sscanf()`. I think it's not good to be irrational about this kind of thing, it's like those who think that `goto` should never be used. Your code is evidently correct but that doesn't mean that you can't write correct code using `sscanf()`, I have seen people complain about a lot of standard functions but never `scanf()`, it does work if it's used correctly, using it correctly seems hard, but not impossible. – Iharob Al Asimi Mar 01 '15 at 01:48
  • @iharob: I beg to differ: I have special feelings for C programmers, especially for beginners that take the hard path to programming by learning C instead of more "modern" "safer" languages. I am just trying to warn them about some of the pitfalls of the standard library. `scanf` and friends are really no friends at all! The same goes of course for `gets`, but for `strncpy` as well. – chqrlie Mar 01 '15 at 01:55
  • @chqrlie When a beginner learns a modern language like you call it, then they fail to understand the most basic and important concepts of programming, I encourage new programmers to learn c first, if they'd like then c++ and then some scripting languages and Java, also you should consider that in the scientific world normally one ends up needing c for performance reasons. But we should not discuss this here of course. And it's just my opinion and might very well be wrong. – Iharob Al Asimi Mar 01 '15 at 02:58
  • @iharob: to close this debate, I quite agree with you, except maybe for the c++ part. – chqrlie Mar 01 '15 at 03:03
  • @iharob: It took me a while, but even for the simple 3 digit test, `scanf` does not do the job: `(sscanf(strings[i], "%d%n", &number, &count) == 1) && (count == 3)` will accept " 12.wav" and "-12.wav" and similar patterns that fall outside of the specification. `scanf` is definitely not the right tool for pattern matching. – chqrlie Mar 04 '15 at 08:48
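
The point made in the last comment can be reproduced with a small sketch. It wraps the test from the answer in a function whose name (sscanf_three_chars) is invented here for illustration:

```c
#include <stdio.h>

/* Hypothetical wrapper around the test used in the answer above.
   Returns 1 when sscanf reads an integer and has consumed exactly
   three characters at that point, 0 otherwise. */
int sscanf_three_chars(const char *name, int *number)
{
    int count = 0;

    /* %d skips leading white space and accepts a sign; %n then stores
       the total number of characters consumed, including those. */
    return sscanf(name, "%d%n", number, &count) == 1 && count == 3;
}
```

Called with " 12.wav" it returns 1 and stores 12, and with "-12.wav" it returns 1 and stores -12, because the leading space or sign counts toward the three characters. Neither name starts with three digits, so a digit-by-digit isdigit test is the safer choice for this specification.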