There are many ways to approach the problem. You can use scanf with character classes as shown, or you can use any other method to read the input (e.g. getchar, fgets, POSIX getline) and then simply check the characters entered for anything other than "ACGT".
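As a rough illustration of the character-class (scanset) approach, the sketch below uses sscanf on a line that has already been read, rather than scanf on stdin, so it is easy to try in isolation. The helper name leading_valid is hypothetical, and the 250 in the format string is assumed to match the limit from the question.

```c
#include <stdio.h>

#define MAXC 250   /* assumed limit; must match the width in the format */

/* Hypothetical helper: copy the leading run of A/C/G/T characters from
 * line into out (which must hold MAXC + 1 bytes).  Returns the number
 * of characters copied, or 0 if line does not start with a valid one. */
int leading_valid (const char *line, char *out)
{
    int n = 0;

    out[0] = '\0';
    /* %250[ACGT] stores at most 250 matching characters plus the nul;
       %n records how many characters have been consumed so far */
    if (sscanf (line, "%250[ACGT]%n", out, &n) != 1)
        return 0;

    return n;
}
```

If the returned count is smaller than the length of the line, the character at that index is the first one the scanset rejected.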
Additionally, in your problem statement you state "The user just keeps entering the sequence not more than 250 characters". Obviously you will need to loop to handle entry of multiple strings, but beyond that, you will also need to protect against, and handle the remainder of, any string longer than 250 characters. It is unclear what you want to happen in that case; it seems logical to keep the first 250 characters entered and discard anything over the 250-character limit.
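One possible reading of that limit, sketched with character-oriented input: keep the first max characters of each line and silently discard the rest. The helper name read_limited and its return convention are assumptions made for illustration, not something taken from the question.

```c
#include <stdio.h>

/* Hypothetical helper: read one line from fp into buf, keeping at most
 * max characters and discarding the remainder of the line.  buf must
 * hold max + 1 bytes.  Returns the number of characters kept, or -1
 * when EOF is reached before any character is read. */
long read_limited (FILE *fp, char *buf, size_t max)
{
    size_t n = 0;
    int c = getc (fp);

    if (c == EOF)
        return -1;

    for (; c != '\n' && c != EOF; c = getc (fp))
        if (n < max)                 /* keep only the first max chars */
            buf[n++] = (char)c;

    buf[n] = '\0';                   /* nul-terminate what was kept */
    return (long)n;
}
```

Called as read_limited (stdin, str, 250) in a loop, this would handle both the repeated entry and the over-length case in one place; the newline is consumed but never stored.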
Your tools for validating the input depend on how you read it. With character-oriented input (e.g. getchar), you simply check each character read against ACGT. With line-oriented input (e.g. fgets or getline), the C library provides a number of tools to check for characters within a string, or substrings within a string (strchr is applicable here), or you can simply walk a pointer down the entire string of input checking each character. (You also need to check for, and remove, the '\n' that the line-oriented functions read and include in the buffer.)
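As an alternative to checking each character with strchr, the standard strspn and strcspn functions can do both jobs described above in one call each. This is only a sketch; the helper names trim_newline and first_invalid are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove the trailing newline left in the buffer
 * by fgets or getline.  Returns the length of the trimmed string. */
size_t trim_newline (char *s)
{
    size_t len = strcspn (s, "\n");   /* index of first '\n', or strlen (s) */

    s[len] = '\0';
    return len;
}

/* Hypothetical helper: return the zero-based index of the first
 * character that is not one of A, C, G, T, or -1 if all are valid. */
long first_invalid (const char *s)
{
    size_t n = strspn (s, "ACGT");    /* length of the leading valid run */

    return s[n] ? (long)n : -1;
}
```

Note that an empty string counts as valid under this sketch; whether that is acceptable depends on what you want a blank line to mean.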
Putting the pieces together into a short example using fgets for input and strchr to check whether each character is one of ACGT, you could do something like the following. Here the user can enter as many strings as desired. The program terminates when EOF is read (manually generated with Ctrl + D on Linux, Ctrl + Z on Windows). If an invalid string is entered, the code reports the zero-based position of the first invalid character in the entry:
    #include <stdio.h>
    #include <string.h>

    #define MAXC 250

    int main (void) {

        char str[MAXC+1] = "", *valid = "ACGT";

        printf ("Enter sequences [ctrl+d] to quit:\n");

        while (fgets (str, MAXC+1, stdin))      /* read input */
        {
            size_t len = strlen (str);          /* get length */
            int good = 1, c;                    /* valid flag, discard char */
            char *p = str;                      /* pointer to str */

            if (len && str[len-1] == '\n')      /* trim '\n' char */
                str[--len] = 0;                 /* overwrite with nul */
            else                                /* line > 250, discard extra */
                while ((c = getchar()) != '\n' && c != EOF) {}

            for (; *p; p++)                     /* for each char in str */
                if (!strchr (valid, *p)) {      /* check against valid */
                    good = 0;                   /* not found - invalid */
                    break;
                }

            if (good)
                printf ("VALID\n");
            else    /* p - str is a ptrdiff_t, printed with %td (C99) */
                fprintf (stderr, "INVALID ('%c' at character '%td')\n",
                         *p, p - str);
        }

        return 0;
    }
Example Use/Output
    $ ./bin/acgtvalid
    Enter sequences [ctrl+d] to quit:
    ACGTTTGGCCCATTAGGC
    VALID
    ACCGGTTCCGGAITT
    INVALID ('I' at character '12')