
I have the following HTML code:

<html>
<head><title>OPTIONS</title></head>
<body>
    <p>Choose schedule to generate:</p>
    <form action='cgi-bin/mp1b.cgi' method="get">
    <input type=checkbox value='tfield' name=on />Teacher<input type=text name="teacher" value=""/><br>
    <input type=checkbox value='sfield' name=on />Subject<input type=text name="subject" value=""/><br>
    <input type=checkbox value='rfield' name=on />Room<input type=text name="room" value=""/><br>
    <input type=submit value="Generate Schedule"/>
    </form>
</body>
</html>

And I have this CGI script written in C:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void)
{
    char *data = malloc(1024);
    char teacher[1024];
    char subject[1024];
    char room[1024];
    printf("Content-type:text/html\n\n");
    printf("<html><body>");
    data = getenv("QUERY_STRING");
    if(data){
        sscanf(data,"teacher=%s&subject=%s&room=%s",teacher,subject,room);
        printf("%s,%s,%s",teacher,subject,room);
    }
    printf("</body></html>");
    return 0;
}

Whenever I click the Submit button, it outputs

(null),Ã…,œí

What's wrong with my code? Thanks!

EDIT: I updated the code, but now it outputs:

Smith&subject=Physics&room=Room,Xøm·l,

  • Try initializing your variables (especially the lead-char to `0` in the string buffers). Then try checking the results of the `sscanf()` you're invoking. It will tell you the number of formats *correctly* parsed. I think you may find that enlightening. Note where the commas are located in your output string. – WhozCraig Aug 18 '13 at 09:49
  • Yes. The result of `sscanf` is 1 instead of 3. Is there anything wrong with the code which `sscanf` treats it as a single string ? –  Aug 18 '13 at 09:56
  • Print the actual query string in your output, then update the question. There is no substitute for matching a format string against the *real* data it is supposed to tear apart than to have both viewable side-by-side. Or use a different parsing mechanic (such as `strtok()`). – WhozCraig Aug 18 '13 at 09:59
  • Yes, I can use `strtok()` and it works out well. But I still do want to find why this won't work using `sscanf`. –  Aug 18 '13 at 10:06

1 Answer


You initially did not allocate memory for teacher, subject, and room. You corrected that with fixed-size buffers; this is a solution only when you're 100% sure your strings can never be longer than the buffers (including in weird error situations).

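You don't need to allocate memory for data: getenv() returns a pointer to an existing string, so the assignment overwrites the pointer you got from malloc(1024) and that block is leaked.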

A char* is nothing more than a pointer to a string; it does not have any storage space for the actual chars.

You can do:

...
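data = getenv("QUERY_STRING");
if (data) {                          /* getenv() may return NULL */
    size_t size = strlen(data) + 1;  /* no field can be longer than the whole string */
    teacher = malloc(size);          /* teacher, subject, room declared as char * */
    subject = malloc(size);
    room    = malloc(size);
}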

to get blocks of memory that will always be large enough. Don't forget to free() them when you're done.

Also, check the return value of sscanf(); it should be 3 in your case. Here it returns 1 because the first %s reads Smith&subject=Physics&room=Room as a single string. So subject and room are never written and contain random stack garbage, and that's what you're seeing after your first comma. The reason is that %s is a very simple conversion: it only stops when it sees white space. Since your input does not contain white space, it simply consumes the whole remainder of the URL parameter list as the first string. See below for a more advanced format that does work.

To parse your string, have a look at strtok(). When using strtok() you must first make a copy of the string returned by getenv(), because strtok() modifies the string it parses, and you're not allowed to modify the string returned by getenv() (the C standard says the program shall not modify it).

Or while-loop over the string searching for &.

But you can also parse the URL parameters using sscanf() with the scanset conversion %[^&], which reads everything up to (but not including) the next &.

On the other hand, parsing URLs correctly is a lot of work because there may be escape sequences (such as %20, or + for a space) that you need to convert back to the characters they represent. So it would be better to use an existing library than to code this yourself.

meaning-matters
  • I already allocated memory but still gives me the same output. Random characters, I think. –  Aug 18 '13 at 09:34
  • Yes, it returns `1` instead of `3`. No, we're not allowed to use existing libraries. We need to code this by ourselves. By the way, thanks for your answer. Really appreciate the help. –  Aug 18 '13 at 10:02
  • Yes, `strtok` works fine. I guess I'll just have to use it. But I still want to find out why `sscanf` won't work on this one. Thanks! –  Aug 18 '13 at 10:07