Create a spell checker in C

Question

So I need to create a spell checker that takes an input file and checks it with a given dictionary file, and outputs the misspelled words. I have an idea of how to do it, but I get stuck where I need to compare the words in each file. I do not know how to compare one word of one file to all of the words in the other file. I was thinking I would use the strstr() function to do it, but again I'm stuck on how to actually implement it. Here is my code so far:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

int main(int argc, char* argv[]) {
    FILE *inp = NULL;
    FILE *dic = NULL;
    inp = fopen(argv[1], "r");
    dic = fopen("american", "r");
    char *wordsString;
    char *dictionary;
    int inputStatus1, inputStatus2, i;
    inputStatus1 = fscanf(inp, %s, wordsString);
    inputStatus2 = fscanf(dic, %s, dictionary);
}

I would recommend against `strstr()`. Not only is it a nonstandard extension, but it would be very inefficient for the purposes of spell checking (especially with large dictionaries). You're probably going to want to either use a sorted list of dict words & binary search, or a hash table containing the dictionary. The sorted list is easier, but slower. — brenns10, Apr 07 '15 at 14:09
@brenns10 `strstr` is very much a C standard function and has always been so. — Lundin, Apr 07 '15 at 14:13
@brenns10 *Linux Programmer's Manual*: The `strstr()` function conforms to C89 and C99. — user12205, Apr 07 '15 at 14:13
@Lundin @ace My bad, I misread the GNU manpage, in which `strcasestr()` is a nonstandard extension. My point about efficiency still stands. — brenns10, Apr 07 '15 at 14:16
@bikerguy Note that you need double-quotes around `%s` (as in `"%s"`) in the `fscanf()` call. — user12205, Apr 07 '15 at 14:18
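
The sorted-list-plus-binary-search idea from the comments above could look roughly like this. It is a minimal sketch, assuming the dictionary words are already in memory as an array of strings; `compare_words`, `sort_dictionary`, and `in_dictionary` are made-up names for illustration, while `qsort()` and `bsearch()` are the standard library functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback shared by qsort() and bsearch(). Both pass pointers
   to array elements, and each element here is a `const char *`. */
static int compare_words(const void *a, const void *b)
{
    const char *const *wa = a;
    const char *const *wb = b;
    return strcmp(*wa, *wb);
}

/* Sort the dictionary once, before the first look-up. */
void sort_dictionary(const char **words, size_t count)
{
    assert(words != NULL);
    qsort(words, count, sizeof words[0], compare_words);
}

/* Return 1 if `word` is in the (already sorted) dictionary, 0 otherwise.
   Note the exact match via strcmp(); strstr() would also accept substrings. */
int in_dictionary(const char *word, const char *const *words, size_t count)
{
    assert(word != NULL && words != NULL);
    return bsearch(&word, words, count, sizeof words[0], compare_words) != NULL;
}
```

Each look-up then needs about log2(n) string comparisons instead of up to n. The comparison is case-sensitive, so whether input words should be lowercased first is a separate decision.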

score 1 · Answer 1 · answered Apr 07 '15 at 14:13

You will need to load the dictionary file into your program (if its size is not too big) and store it however you like, for example as an array of strings where each string is one dictionary word. Then check each given word in a loop against every item in the array until you find a match. This solution is very slow, but it is the most basic one I can think of. Once you have implemented that, look for a proper data structure to hold the dictionary that allows much faster searches, and build that structure from the data in the file. I would love to implement that for you, but this is a good learning exercise that requires a lot of basic skills. Try searching for solutions in online courses, and if you cannot find any, come back to us!
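
For illustration only, the basic approach might be sketched like this. It assumes one whitespace-separated word per entry and words shorter than 64 characters (longer ones would be split); `load_words`, `is_known`, and `MAX_WORD` are hypothetical names, not part of the question's code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD 64  /* assumed upper bound on word length, including '\0' */

/* Read whitespace-separated words from `fp` into a growing heap array.
   Returns the array and stores its length in *count. Returns NULL on
   allocation failure, or if the file contains no words. */
char **load_words(FILE *fp, size_t *count)
{
    char buf[MAX_WORD];
    char **words = NULL;
    size_t used = 0, capacity = 0;

    assert(fp != NULL && count != NULL);
    *count = 0;
    while (fscanf(fp, "%63s", buf) == 1) {   /* note the quoted format */
        if (used == capacity) {
            size_t new_capacity = capacity ? capacity * 2 : 1024;
            char **grown = realloc(words, new_capacity * sizeof *grown);
            if (grown == NULL)
                goto fail;
            words = grown;
            capacity = new_capacity;
        }
        words[used] = malloc(strlen(buf) + 1);
        if (words[used] == NULL)
            goto fail;
        strcpy(words[used], buf);
        used++;
    }
    *count = used;
    return words;

fail:
    while (used > 0)
        free(words[--used]);
    free(words);
    return NULL;
}

/* Linear search: compare `word` with every dictionary entry in turn. */
int is_known(const char *word, char *const *words, size_t count)
{
    assert(word != NULL);
    for (size_t i = 0; i < count; i++)
        if (strcmp(word, words[i]) == 0)
            return 1;
    return 0;
}
```

A checker built on this would open the dictionary with `fopen()`, call `load_words()` once, then read the input file word by word and print every word for which `is_known()` returns 0. Stripping punctuation and normalizing case are left out of the sketch.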

riodoro1

With a proper dictionary of any given language, such a crude search algorithm would take ages to finish. Your solution assumes that the OP is doing some artificial school work assignment with a limited amount of data (most likely the case, but anyway). – Lundin Apr 07 '15 at 14:16
I assumed that from the level of OP's progress and the way the question was asked. It would be a nice example for using a hash table or a trie (which takes much longer to build but would allow auto-completion later). – riodoro1 Apr 07 '15 at 14:20

score 0 · Answer 2 · edited May 23 '17 at 12:22

Assuming you want to use a real dictionary containing all the words of a given language, the typical solution is to read the whole dictionary file and store it in a hash table. This has the advantage of fast look-ups whose cost stays nearly constant on average, even with huge amounts of data.

You have to come up with a good hash function based on the ASCII codes of the letters in each word. I bet there is code for such functions out there on the web.

Most likely you need to implement the hash table using dynamic memory allocation: large amounts of data should be placed on the heap and not the stack, because your program's process has limited stack space.
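
As a rough sketch of what this could look like, here is a heap-allocated hash table with separate chaining. The hash is 32-bit FNV-1a, which is just one commonly used choice picked for the example, not something this answer prescribes; the bucket count and all names (`table_new`, `table_add`, `table_contains`, `table_free`) are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 65536u  /* assumed table size; tune it to the dictionary */

struct node {
    struct node *next;  /* next word that hashed to the same bucket */
    char word[];        /* flexible array member holding the string */
};

struct table {
    struct node *bucket[BUCKETS];
};

/* 32-bit FNV-1a hash over the bytes of the word. */
static uint32_t hash_word(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Allocate an empty table on the heap; returns NULL on failure. */
struct table *table_new(void)
{
    struct table *t = malloc(sizeof *t);
    if (t != NULL)
        for (size_t i = 0; i < BUCKETS; i++)
            t->bucket[i] = NULL;
    return t;
}

/* Insert a copy of `word`; returns 1 on success, 0 if out of memory. */
int table_add(struct table *t, const char *word)
{
    assert(t != NULL && word != NULL);
    size_t len = strlen(word);
    struct node *n = malloc(sizeof *n + len + 1);
    if (n == NULL)
        return 0;
    memcpy(n->word, word, len + 1);
    uint32_t i = hash_word(word) % BUCKETS;
    n->next = t->bucket[i];   /* push onto the front of the chain */
    t->bucket[i] = n;
    return 1;
}

/* Return 1 if `word` is stored in the table, 0 otherwise. */
int table_contains(const struct table *t, const char *word)
{
    assert(t != NULL && word != NULL);
    const struct node *n = t->bucket[hash_word(word) % BUCKETS];
    for (; n != NULL; n = n->next)
        if (strcmp(n->word, word) == 0)
            return 1;
    return 0;
}

/* Release every node and the table itself. */
void table_free(struct table *t)
{
    if (t == NULL)
        return;
    for (size_t i = 0; i < BUCKETS; i++) {
        struct node *n = t->bucket[i];
        while (n != NULL) {
            struct node *next = n->next;
            free(n);
            n = next;
        }
    }
    free(t);
}
```

The dictionary file would be read once, calling `table_add()` for each word; every word of the input file is then checked with `table_contains()`. With chains kept short by a large enough `BUCKETS`, each check costs one hash computation plus a handful of `strcmp()` calls.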
