program in C which reads data of sound and checks if the given data fulfills the prerequisites of a wav file

Question

The assignment is to write a program in C which reads sound data from standard input, using only getchar, according to the wav format/template.
The program must check whether the given data is correct. When the first mistake is detected, an appropriate message is printed and the program stops. It must also detect insufficient data or a bad file size. The use of arrays, functions, pointers and mathematical libraries is forbidden.

The wave file format:
http://tiny.systems/software/soundProgrammer/WavFormatDocs.pdf
http://www.topherlee.com/software/pcm-tut-wavformat.html

For example, I thought of this code for reading numbers:

while ( (ch = getchar()) != '\n') 
{
    if (ch >= '0' && ch <= '9')
    {
        bytes_sec = 10 * bytes_sec  + (ch - '0');
    }
}
fprintf(stderr, "bytes/sec:%d\n", bytes_sec);

but it isn't correct because we want the number to be saved as one byte.

For characters, I thought of this code:

flag = 0;
fores = 0;

while ( ( (ch=getchar()) !='\n') && (flag==0)  ) {
    fores = fores + 1;
    if  (fores == 1) {
        if (ch != 'R')
        {
            flag = 1;
        }
    }
    else if (fores == 2) {
        if (ch != 'I')
        {
            flag = 1;
        }
    }
    else if (fores == 3) {
        if (ch != 'F') {
            flag = 1;
        }
    }
    else {
        if ((fores != 4) || (ch != 'F')) {
            flag = 1;
        }
    }
}

if (flag == 1) {
    fprintf(stderr, "Error! \"RIFF\" not found\n");
    return 0;
}

Also, I don't understand in which form (binary, hex, decimal) the data is given. I know that the data in a wav file is binary, but I still can't understand what exactly is given as input and how: in which form, and whether it arrives as a single piece or separately.
Finally, the fact that we cannot type data in the terminal that does not correspond to printable characters really confuses me.

Let's say we have the contents of a legal wav file (hex):

0000000 52 49 46 46 85 00 00 00 57 41 56 45 66 6d 74 20
0000020 10 00 00 00 01 00 01 00 44 ac 00 00 88 58 01 00
0000040 02 00 10 00 64 61 74 61 58 00 00 00 00 00 d4 07
0000060 a1 0f 5e 17 04 1f 8a 26 ea 2d 1c 35 18 3c d7 42
0000100 54 49 86 4f 69 55 f6 5a 27 60 f8 64 63 69 64 6d
0000120 f7 70 18 74 c5 76 fa 78 b6 7a f6 7b b9 7c ff 7c
0000140 c8 7c 12 7c e0 7a 33 79 0b 77 6c 74 58 71 d1 6d
0000160 dd 69 7e 65 b8 60 92 5b 0f 56 36 50 0c 4a 97 43
0000200 df 3c ea 35 45 78 74 72 61 44 61 74 61
0000215

I thought of converting it from hex to decimal and then using Notepad to create a data file, so that I could use fopen(fname, "rb"), but then I would have to use a pointer, which is forbidden. So I still haven't understood how the program will get its values from the input.

Also, after the suggestions of @AndreasWenzel, I came up with this code for little-endian (note: the condition in the while loop is incorrect; it was chosen only to make the code easy to test in a C compiler):

#include <stdio.h>

int ch, sample_rate, help, fores, sum, i;
int main()
{
    fores = 0;
    sample_rate = 0;
    
    while ( (ch = getchar()) != '\n' )
    {
        help = (ch - '0');
        fprintf(stderr, "help:%d\n", help);
        if (help == 1)
        {
            sum = 1;
            for (i=1 ; i<=fores ; i++)
            {
                sum = sum*2;
            }
            sample_rate = sample_rate + (sum);
            fprintf(stderr, "sample:%d\n", sample_rate);
        }
        fores = fores + 1;
    }
    fprintf(stderr, "sample rate:%d\n", sample_rate);
}

If the input is 10100100 (binary) = 164 (decimal), it prints 37 (decimal). Is that correct?

Andreas Wenzel · Accepted Answer · 2021-01-19T05:06:23.590


See this link for an explanation on how integers are represented in memory when stored in binary. Note that for this task, you do not have to read about how signed integers are represented in memory (you only need to know about unsigned), and you also don't have to read about how floating-point numbers are represented in memory.

In a RIFF file (such as a .wav file), most data is not stored in ASCII, but in binary. Only very few things in a RIFF file are stored in ASCII. For example, the (sub-)chunk headers each contain a 4-byte ID, such as RIFF, fmt , which is stored in ASCII. But numbers will almost never be stored in the ASCII encoding of the characters '0' to '9', as your code seems to assume.

When reading a binary file, the while condition

(ch=getchar()) !='\n'

does not make sense. Such an expression only makes sense with line-delimited text input. In a binary file, the value '\n', which is the value of the ASCII encoding of the newline character, has no special meaning. It is just a byte value like any other, which may coincidentally occur in the binary file, for example in the binary representation of a number.
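To make this concrete, here is a minimal sketch showing that a perfectly ordinary number can contain a byte equal to '\n'. The helper name le32_byte is made up for illustration, and ASCII is assumed (where '\n' has the value 10).

```c
#include <stdint.h>

/* Hypothetical helper (not part of any library): returns byte number
   `index` (0 = first byte in the file) of a 32-bit value, as that value
   would be laid out in little-endian order. */
unsigned le32_byte(uint_least32_t value, int index)
{
    return (unsigned)((value >> (8 * index)) & 0xFFu);
}
```

With ASCII, le32_byte(10, 0) equals '\n', so a ChunkSize of 10 would begin with a byte that a line-oriented loop would mistake for the end of the input.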

Therefore, in order to check the RIFF ChunkID, your while loop should instead always read exactly 4 bytes from the file. It should not continue reading until it finds a '\n', as it does now.
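A minimal sketch of such a fixed-length check might look like the following. It is only an illustration: the helper name check_tag and its FILE * parameter are assumptions, and the task forbids functions and pointers, so under those restrictions the loop body would have to be written directly in main with getchar in place of fgetc.

```c
#include <stdio.h>

/* Hypothetical helper: reads exactly four bytes from `in` and compares them
   with the four expected characters.
   Returns 1 if all four match, 0 on a mismatch, -1 if the input ends early. */
int check_tag(FILE *in, int e0, int e1, int e2, int e3)
{
    int matches = 1;

    for (int i = 0; i < 4; i++)
    {
        int c = fgetc(in);   /* fgetc(stdin) behaves like getchar() */
        int expected = (i == 0) ? e0 : (i == 1) ? e1 : (i == 2) ? e2 : e3;

        if (c == EOF)
            return -1;

        if (c != expected)
            matches = 0;     /* keep reading, so that exactly 4 bytes are consumed */
    }

    return matches;
}
```

A call such as check_tag(stdin, 'R', 'I', 'F', 'F') would then consume the ChunkID regardless of whether any of its bytes happens to be '\n'.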

In order to read the subsequent ChunkSize value, which is 4 bytes long, you should read exactly 4 bytes again. You should also have a variable of type uint_least32_t, which is an unsigned integer and is guaranteed to be at least 32 bits in length. This ensures that the variable is sufficient in size to store the ChunkSize. You can then read one byte at a time using getchar and then calculate the whole number using the values of the individual bytes. See the above link about how integers are represented in computer memory, in order to understand how to calculate a larger number from individual byte values. In accordance with these rules on homework questions, I will not provide a solution to this problem, unless you specifically ask for it. Until then, I will only provide the following hints:

You must take into account that the number is stored in little-endian byte ordering.
When reading with getchar, you should store the result in an int or an unsigned char, not in a signed char or a plain char (which is signed on most platforms). Otherwise, the value of the variable will be negative if the byte has a value larger than 127. Such negative values are harder to work with. Note that only an int can also hold the special value EOF.
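As a rough sketch of what these two hints amount to, the following combines four byte values, already read in file order, into one number. The function name le32 is hypothetical, and a function is used only to keep the example short; the same expression works inline in main.

```c
#include <stdint.h>

/* Hypothetical helper: b0 is the first byte read from the file, b3 the last.
   In little-endian order the first byte is the least significant one. */
uint_least32_t le32(unsigned char b0, unsigned char b1,
                    unsigned char b2, unsigned char b3)
{
    /* Each byte is widened to an unsigned 32-bit type before shifting,
       so that no bits are lost and no signed overflow can occur. */
    return  (uint_least32_t)b0
         | ((uint_least32_t)b1 << 8)
         | ((uint_least32_t)b2 << 16)
         | ((uint_least32_t)b3 << 24);
}
```

For the bytes 44 ac 00 00 from the hex dump in the question, le32(0x44, 0xAC, 0x00, 0x00) yields 44100, which is the sample rate.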

Note that if you are using #include <windows.h> (and you are on a platform which has this header), you can use the typedef DWORD instead of uint_least32_t. If you are using uint_least32_t, you will have to #include <stdint.h>.

EDIT: Now that you have solved the problem yourself, I will provide my own alternative solution. The task forbids the use of arrays, pointers and user-defined functions. However, in my opinion, all solutions to the problem which respect these restrictions will be messy and contain a large amount of unnecessary code duplication. Therefore, my solution uses them and does not respect these restrictions:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>
#include <limits.h>

//This will print an error message if you attempt to compile this 
//on a platform on which there are not 8 bits per byte.
#if CHAR_BIT != 8
#error "This program assumes 8 bits per byte."
#endif

void VerifyLabel( const char *label );
uint_fast32_t ReadNumber( int num_bytes, const char *fieldname );

int main( void )
{
    uint_fast32_t temp;

    VerifyLabel( "RIFF" );

    ReadNumber( 4, "ChunkSize" );

    VerifyLabel( "WAVE" );

    VerifyLabel( "fmt " );

    temp =
        ReadNumber( 4, "Subchunk1Size" );

    if ( temp != 16 )
    {
        fprintf( stderr, "Error: Expected Subchunk1Size to be 16!\n" );
        exit( EXIT_FAILURE );
    }

    temp =
        ReadNumber( 2, "AudioFormat" );

    if ( temp != 1 )
    {
        fprintf( stderr, "Error: Expected AudioFormat to be 1 (PCM)!\n" );
        exit( EXIT_FAILURE );
    }

    temp =
        ReadNumber( 2, "NumChannels" );

    if ( temp != 1 && temp != 2 )
    {
        fprintf( stderr, "Error: Expected NumChannels to be 1 (mono) or 2 (stereo)!\n" );
        exit( EXIT_FAILURE );
    }

    ReadNumber( 4, "SampleRate" );

    ReadNumber( 4, "ByteRate" );

    ReadNumber( 2, "BlockAlign" );

    temp =
        ReadNumber( 2, "BitePerSample" );

    if ( temp != 8 && temp != 16 )
    {
        fprintf( stderr, "Error: Expected BitsPerSample to be 8 or 16!\n" );
        exit( EXIT_FAILURE );
    }

    VerifyLabel( "data" );

    ReadNumber( 4, "Subchunk2Size" );

    return 0;
}

void VerifyLabel( const char *label )
{
    for ( const char *p = label; *p != '\0'; p++ )
    {
        int c;

        if ( (c = getchar()) == EOF )
        {
            fprintf( stderr, "input error while verifying \"%s\" label!\n", label );
            exit( EXIT_FAILURE );
        }

        if ( (uint_fast8_t)*p != c )
        {
            fprintf( stderr, "did not find \"%s\" label in expected position!\n", label );
            exit( EXIT_FAILURE );
        }
    }

    fprintf( stderr, "\"%s\" label OK\n", label );
}

uint_fast32_t ReadNumber( int num_bytes, const char *fieldname )
{
    int c;

    uint_fast32_t sum = 0;

    if ( num_bytes < 1 || num_bytes > 4 )
    {
        fprintf( stderr, "this function only supports reading between 1 and 4 bytes\n" );
        exit( EXIT_FAILURE );
    }

    for ( int i = 0; i < num_bytes; i++ )
    {
        if ( (c = getchar()) == EOF )
        {
            fprintf( stderr, "input error while reading field \"%s\"!\n", fieldname );
            exit( EXIT_FAILURE );
        }

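        //Cast to uint_fast32_t before shifting, so that shifting a byte value
        //of 128 or more by 24 bits cannot overflow a signed int.
        sum += (uint_fast32_t)c << 8 * i;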
    }

    //On most platforms, unsigned int is 32-bits. On those platforms, the following
    //line is equivalent to:
    //fprintf( stderr, "%s: %u\n", fieldname, sum );
    fprintf( stderr, "%s: %" PRIuFAST32 "\n", fieldname, sum );

    return sum;
}

This is a sample output of my program:

"RIFF" label OK
ChunkSize: 88236
"WAVE" label OK
"fmt " label OK
Subchunk1Size: 16
AudioFormat: 1
NumChannels: 1
SampleRate: 44100
ByteRate: 44100
BlockAlign: 1
BitePerSample: 8
"data" label OK
Subchunk2Size: 88200
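The values in this output can also be cross-checked against each other. In the canonical PCM layout described in the format documents linked in the question, BlockAlign is NumChannels * BitsPerSample / 8 and ByteRate is SampleRate * BlockAlign. The program above does not perform this check; the following is only a sketch of how it could be done, with a hypothetical function name.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the program above. Returns 1 if
   BlockAlign and ByteRate agree with the values derived from the other
   fmt fields, 0 otherwise. Assumes the products fit in 32 bits. */
int fmt_fields_consistent(uint_least32_t num_channels,
                          uint_least32_t sample_rate,
                          uint_least32_t bits_per_sample,
                          uint_least32_t byte_rate,
                          uint_least32_t block_align)
{
    /* Size of one sample frame (all channels) in bytes. */
    uint_least32_t bytes_per_frame = num_channels * bits_per_sample / 8;

    return block_align == bytes_per_frame
        && byte_rate == sample_rate * bytes_per_frame;
}
```

For the output shown above, fmt_fields_consistent(1, 44100, 8, 44100, 1) returns 1, since one 8-bit mono frame is 1 byte and 44100 frames per second give 44100 bytes per second.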

edited Jan 19 '21 at 05:06

answered Jan 10 '21 at 16:42

Andreas Wenzel


Thank you for your time and the assistance you offered me! If you want to take a look, I have updated/edited the original post. If you feel that something is better explained with code, from my point of view that is acceptable/welcome. The part I have described is only 1/7 of the whole project, so I don't think a small explanation with code would do any harm. Feel free to do whatever you think is more appropriate or productive. – W44 Jan 11 '21 at 10:04
@W44: In your update, you wrote: "note:the condition in while is incorrect it's been chosen in order to be easy to check in a c compiler" -- I don't understand what you intend to accomplish by continuing to use `'\n'`, because, as I have already explained in my answer, that value is meaningless in a binary file. – Andreas Wenzel Jan 12 '21 at 03:17
@W44: What is the point of `help=(ch - '0');`? That expression would only be meaningful when evaluating digits that are encoded in ASCII, where you have one digit per byte. However, in a RIFF file, most numbers are not represented in ASCII, but are stored in binary. In binary representation, one byte can represent a number between `0` and `255` (which is `0x00` to `0xFF` in hexadecimal representation). And 4 bytes, when used together to store a number, can represent a number between `0` and `4,294,967,295`. All of this is explained in the link that I posted. – Andreas Wenzel Jan 12 '21 at 03:27
@W44: Note that the hexadecimal view of the file that you posted is text, not binary. It shows the values of the individual bytes, so such a hexadecimal view is good for showing the contents of binary files. But you should not confuse such a hexadecimal text representation of a binary file with actual binary data. Parsing such a hexadecimal text representation is not the same thing as parsing actual binary data. – Andreas Wenzel Jan 12 '21 at 03:35
@W44: Also note that the program `notepad.exe` is unable to create binary files. It is intended for creating text files. In order to create binary files, you will have to use a so-called "hex editor" program (or create them with your own program). Some text editors have a "binary" or "hex" mode which allows you to perform edits in binary mode, but most don't (including notepad). – Andreas Wenzel Jan 12 '21 at 03:51
@W44: If you are using Microsoft Visual Studio (not to be confused with Visual Studio Code, which is a completely different product), then see [this link](https://stackoverflow.com/questions/1724586/can-i-hex-edit-a-file-in-visual-studio) on how to edit a file in binary mode. If you are on Linux, then I suggest you search for "hex editor linux" in Google. [This link](https://stackoverflow.com/questions/5498197/need-a-good-hex-editor-for-linux) may also be useful. – Andreas Wenzel Jan 12 '21 at 18:02
If a wav file contains the number 37 decimal (25 00 00 00), saved as 00100101 00000000 00000000 00000000 (because of little-endian), will the char % then be added to the input buffer? I am asking because if we have the statement ch=getchar() and use fprintf(stderr,"ch:%d\n",ch), I will get the number 37 only if I type the char % in the terminal. "You can then read one byte at a time using getchar and then calculate the whole number using the values of the individual bytes": could you please explain this further? By the way, I am using Visual Studio Code (and rarely CLion). – W44 Jan 14 '21 at 17:18
@W44: `getchar` is equivalent to the function call `fgetc(stdin)`. It will read one character from standard input. By default, standard input is attached to the console/terminal, or whatever it is called in the operating system you are using. In this mode, it will accept text input from the user. It is not intended for entering binary data. However, it is possible to redirect standard input to take input from a binary file, for example by calling your program like this: `./myprogram – Andreas Wenzel Jan 14 '21 at 17:34
@W44: If you are looking for a Hex Editor (Binary Editor) for Visual Studio Code, you may want to read this: https://stackoverflow.com/questions/38905181/how-do-i-see-a-bin-file-in-a-hex-editor-in-visual-studio-code – Andreas Wenzel Jan 14 '21 at 17:37
@W44: In your example of the number 37 stored in 4 bytes little-endian, the first call to `getchar` will return an `int` with the value `37`, and the second, third and fourth calls will return the value `0`. This is because it always reads one byte at a time, even though the return type is `int` (which is normally 4 bytes). The reason why it returns `int` instead of `char` or `unsigned char` is so that it can also return the special value `EOF` to indicate end of file or an error condition, which is not representable as an `unsigned char`. – Andreas Wenzel Jan 14 '21 at 17:53
@W44: The ASCII code of the character `'%'` is 37. However, you should not be working with ASCII and should also not be using a console/terminal for standard input. You should be using a hex editor (binary editor) and binary files instead for this task. – Andreas Wenzel Jan 14 '21 at 19:56
I have finished the program. Could you please take a quick look? – W44 Jan 18 '21 at 20:13
@W44: Sure. You haven't posted it yet, have you? When you post it, please don't overwrite your question (which would invalidate my answer), but add it to the bottom of your question as an edit. Or better, you can post an answer to your own question (if it really is a proper answer). You may want to read this: [Can I answer my own question?](https://stackoverflow.com/help/self-answer) – Andreas Wenzel Jan 18 '21 at 20:20
@W44: I'm not sure if you have sufficient reputation to post an answer to your own question. According to the link I just posted, you may need 15 reputation for this. The information is unclear whether this limit only applies to posting the question and the corresponding answer together, or whether it is a general limit for posting answers to your own question. – Andreas Wenzel Jan 18 '21 at 20:31
I am sorry for asking but is it possible to send it to you as a private message? If not I will edit the main post. – W44 Jan 18 '21 at 20:43
@W44: You can also post the code on an external site such as [pastebin](https://pastebin.com/), and send me the link as a comment, if you prefer. – Andreas Wenzel Jan 18 '21 at 20:44
@W44: However, posting it as an update to your question has the advantage that I can also update my answer. This may be easier. – Andreas Wenzel Jan 18 '21 at 20:52
@W44: You seem to have password protected the paste? It is asking me for a password. Note: For security reasons, you should not post a password on this public site if you use it for any other accounts of yours. However, if you used this password only for this paste, then posting it should not be a problem. – Andreas Wenzel Jan 18 '21 at 20:55
@W44: Ok, it worked, I am looking at it. Since you set the expiration to "burn after read", the link you posted is no longer valid, so you may want to delete the comment, as it is of no interest to other people. EDIT: It seems you already did that. :-) – Andreas Wenzel Jan 18 '21 at 21:01
@W44: When you say that you have "finished" the program, are you saying that it works (or appears to work)? Have you tested it? – Andreas Wenzel Jan 18 '21 at 21:09
I think it works... I couldn't figure out how to redirect standard input, so I haven't checked it. Since getchar returns the numeric value of the byte it reads, or a special code represented by EOF, I believe that it should work properly. – W44 Jan 18 '21 at 21:12
I only have 40 minutes. Do you think I should make changes? – W44 Jan 18 '21 at 21:16
@W44: Oh, that is not long. I mainly wanted to improve your code style, but in that time, I guess we must concentrate on testing your program and fixing bugs. I will try to find a real `.wav` file and test your program on it. – Andreas Wenzel Jan 18 '21 at 21:19
@W44: Ok, thanks, I will test your program now. – Andreas Wenzel Jan 18 '21 at 21:23
@W44: I am still testing, I had to figure out how to redirect input on my debugger. – Andreas Wenzel Jan 18 '21 at 21:34
@W44: When I run your program on the file `8bitstereo4.wav` from the link you gave me, this is the output I get. It does not seem to be correct, but also not completely wrong: `size of file: 88236 size of format chunck: 16 WAVE type format: 1 mono/stereo:1 sample rate: 44100 bytes/sec:44100 bytes/sec:1 bits/sample: 8 Error! "DATA" not found` – Andreas Wenzel Jan 18 '21 at 21:44
If you could find the error it would be great. I only have 10 minutes... – W44 Jan 18 '21 at 21:48
@W44: Hmmm, the filename implies that it is stereo, not mono, but, as far as I can tell by looking at it with a hex editor, the header says it is mono. So your program seems to be correct. So the only problem seems to be that `DATA` is not being found. – Andreas Wenzel Jan 18 '21 at 21:50
I used the same pattern as for "RIFF" and "WAVE". Shouldn't it work? – W44 Jan 18 '21 at 21:53
@W44: I found the bug: You are looking for `DATA`, but you should look for `data` (case sensitive). – Andreas Wenzel Jan 18 '21 at 21:53
@W44: In other words, you should check for `'d'`, `'a'`, `'t'`, `'a'` instead of `'D'`, `'A'`, `'T'`, `'A'`. – Andreas Wenzel Jan 18 '21 at 21:56
I was just reading again the info for wav file and changing the algorithm – W44 Jan 18 '21 at 21:57
@W44: I believe your program could still be improved, as it is a bit messy. However, we don't have time for that now. Your program seems to run correctly. However, with the restriction of not being allowed to use arrays, pointers and user-defined functions, I'm afraid that we can't improve the program much. Actually, I see no way of solving the problem in a non-messy way with the restrictions mentioned above. – Andreas Wenzel Jan 18 '21 at 21:59
@W44: If you want, I will later rewrite the program for you, to show you how the problem can be solved in a non-messy way, without these restrictions. – Andreas Wenzel Jan 18 '21 at 22:05
@W44: By the way, I believe I have found another mistake: This line is correct: `fprintf( stderr, "bytes/sec:%d\n", bytes_sec );`. But this line is wrong: `fprintf( stderr, "bytes/sec:%d\n", block_al );`. It seems you forgot to change the output string. – Andreas Wenzel Jan 19 '21 at 00:49
@W44: I have now added my own solution to the problem. However, I was unable to create an elegant solution to the problem which respected all of the restrictions mentioned in your question. Therefore, I did not respect the restrictions on user-defined functions, arrays and pointers. This solution has a lot less code duplication than your solution. – Andreas Wenzel Jan 19 '21 at 04:45
@W44: Since you have significant trouble with redirecting input in your compiler environment, you may want to take a look at [this link](https://github.com/microsoft/vscode/issues/20890). Unfortunately, I cannot help you much, because I have no experience with Visual Studio Code. – Andreas Wenzel Jan 19 '21 at 06:39
@W44: Instead of redirecting input, you could also open the file directly using `fopen`, and then use `fgetc` on that file instead of `getchar`. – Andreas Wenzel Jan 19 '21 at 06:41
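A minimal sketch of this `fopen`/`fgetc` variant, assuming a helper named read_le (hypothetical) and ignoring the task's ban on pointers and functions, could look like this. The same function works on redirected input if it is passed `stdin`.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper: reads `num_bytes` (1 to 4) little-endian bytes from
   `in` and stores the resulting value in *out.
   Returns 1 on success, 0 if the input ends early or num_bytes is out of range. */
int read_le(FILE *in, int num_bytes, uint_least32_t *out)
{
    uint_least32_t value = 0;

    if (num_bytes < 1 || num_bytes > 4)
        return 0;

    for (int i = 0; i < num_bytes; i++)
    {
        int c = fgetc(in);   /* kept as int, so EOF stays distinguishable */

        if (c == EOF)
            return 0;

        /* Byte number i contributes bits 8*i .. 8*i+7 of the result. */
        value |= (uint_least32_t)c << (8 * i);
    }

    *out = value;
    return 1;
}
```

A caller would open the file in binary mode, for example with `fopen("sound.wav", "rb")` (the file name is a placeholder), check the result for NULL, and then call read_le once per numeric header field.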
