I am working on a project for a basic programming class and am not entirely familiar with strings and all. The goal of the program is to take one simple text file with a few random, repeated words, and return a file with each separate word and the number of times it appears in the file. The input file looks something like

Class
Text
Class
fall
mark
mark
Text

and the output should read,

Class 2
Text 2
fall 1
mark 2

I am having trouble reading and setting up an array for the input data. Not sure exactly how to set it up. Any suggestions would be great.

int main(void)
{
int k=0, p=0, words=0, match=0, ch=0;
double xx;
char I[WORD][LETTER];

FILE *file;
file = fopen("C:\\Users\\Andrew\\Documents\\te2.txt", "r");

if(file != NULL){

// Count the amount of words in file set up array   
int i=0, j=0;
for(i=0; i<WORD; i++){
        for(j=0; j<LETTER; j++){
    fscanf(file, "%s", &I[i][j]);
    ch++;
     }}
  • Could you post the code you've got so far? – suroh Apr 27 '16 at 23:51
  • There are several ways one could solve this problem, but without knowing what your instructor expects you to be able to do, it's impossible to provide an answer that will not have a chance of getting you in trouble for cheating. It seems like the right answer will not include using an array at all, but rather a `std::map`. – mah Apr 27 '16 at 23:52
  • That is what I have been trying to figure out: a way to do it without a std::map, since that was not taught. It was not taught in the course, but it is in the book, and if I can demonstrate knowledge of it then so be it – Andrew Thomas Apr 27 '16 at 23:55
  • Sorry about that, new to this website. The file is laid out as lines with one word on each. My thought process was to create an array of characters or strings and compare letter by letter; I believe I have the comparison portion down – Andrew Thomas Apr 28 '16 at 00:14
  • Several problems here, but what should be done is not all that obvious. One problem is you seem to not understand what `fscanf(..., "%s", ...)` does -- it does not scan in a letter, it scans in the full word. The actual solution to your problem will be more difficult than that though... and still I feel that how you go about it needs to conform to your instructor's expectations -- and nobody here knows what you are / what s/he feels you should know. I expect though that you need two arrays, one of words and another of counters (ints). – mah Apr 28 '16 at 00:16
  • Yes, the original loop I had would scan in the full word; the way I thought about it, I wanted more of a character-by-character array. I tried %c but that was too unorganized. I either need a better way to store characters, which I'm not sure of, or my program for comparing the words should change. I am leaning towards changing the comparison now; I just thought I'd see if this was possible first. Your thoughts on this are appreciated, by the way – Andrew Thomas Apr 28 '16 at 00:29
  • If I were to save the array of several words, is there a simple way to look at the individual characters inside those words, or do you think I should go another route? – Andrew Thomas Apr 28 '16 at 00:30
  • Implement in steps. First, read the input file, and output the words as you read them. That will prove you can read the input. Then you need to figure out the data structure you want to use to track the words. Only you (and your instructor) can say whether library data structures such as `std::map` or `std::string` are acceptable for use. Once you have the data structure(s) decided on (and implemented if you need to do that), it's more or less a simple matter of reading the input, updating the data structure, and outputting the results based on that data structure. – Michael Burr Apr 28 '16 at 00:37
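
The two-array idea raised in the comments (one array of words, one array of counters) might be sketched as follows, without `std::map`. This is only a rough illustration: the name `addWord` and the limits `MAX_WORDS` and `MAX_LEN` are assumptions made for the example, not something from the assignment.

```cpp
#include <cstring>

const int MAX_WORDS = 100; // assumed upper bound on distinct words
const int MAX_LEN = 20;    // assumed maximum word length, including '\0'

// Hypothetical helper: record one occurrence of `word`.
// words[i] holds the i-th distinct word and counts[i] how often it was seen;
// `unique` is the number of slots in use. Returns the updated slot count.
int addWord(char words[][MAX_LEN], int counts[], int unique, const char* word) {
    for (int i = 0; i < unique; i++) {
        if (std::strcmp(words[i], word) == 0) { // compare whole words at once
            counts[i]++;
            return unique;
        }
    }
    if (unique < MAX_WORDS) {                   // new word: take the next slot
        std::strncpy(words[unique], word, MAX_LEN - 1);
        words[unique][MAX_LEN - 1] = '\0';      // always terminate the copy
        counts[unique] = 1;
        unique++;
    }
    return unique;
}
```

A reading loop could then look like `char buf[MAX_LEN]; while (fscanf(file, "%19s", buf) == 1) n = addWord(words, counts, n, buf);`, where the width 19 matches `MAX_LEN - 1`. Individual characters of a stored word stay reachable as `words[i][j]`.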

2 Answers

What you are trying to do is really quite simple. Let's break it down:

  1. Read in each word in a file.
  2. Store it in something that recognises individual words and can increment a number (a std::map).
  3. Print out the result.

Part 1 - Read each word

Let's read in a file word by word. Searching for "read in a file word by word c++" would have sufficed, but here is a short example:

#include <fstream>
#include <string>

std::ifstream file("example.txt"); // An input file stream, the C++ way of reading files.
std::string word; // Holds each word as it is read.

// Read each whitespace-separated word.
while (file >> word) {
    // Do something with your word. (Part 2)
}

It's as easy as that!


Part 2 - Store them in a map

Now we want a map, because a map stores each unique word only once as a key, and we can use the associated value to count how many times that word appears. Read about maps here. We need a string as the key and an int to count the words, so:

std::map<std::string, int> myWords;

To put a word in, we check whether it already exists: if it does, we increment its count; otherwise we set the count to one:

// See if the map already contains the word (not only is this easy, it is also efficient:
// std::map lookups take logarithmic time).
// If the map does not contain the word, find() returns the past-the-end iterator.
if (myWords.find(word) == myWords.end()) {
    myWords[word] = 1; // The map doesn't have this word yet:
                       // this is its first occurrence.
} else {
    myWords[word]++; // The map already has the word, so just increment its count.
}
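
As a shorter alternative, the explicit check can be dropped. `std::map::operator[]` inserts a value-initialized element when the key is missing, and for `int` that means 0, so a single increment covers both cases. A minimal sketch, where `countWord` is a hypothetical helper name used only for illustration:

```cpp
#include <map>
#include <string>

// Hypothetical helper: record one occurrence of `word`.
// operator[] inserts the key with a value-initialized int (0) if it is
// absent, so one increment handles both the new-word and repeat-word case.
void countWord(std::map<std::string, int>& counts, const std::string& word) {
    ++counts[word];
}
```

This also performs one lookup per word instead of two; the version with `find` remains valid and makes the two cases explicit.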

Part 3 - Print them out

The easy part: simply iterate over the map and print each word with its count:

for (const auto& wordPair : myWords) { // const reference avoids copying each pair
    std::cout << wordPair.first << " " << wordPair.second << std::endl;
}

Put it all together

And you get this:

http://ideone.com/TCMOKh

Fantastic Mr Fox
  • Why not just `while (in >> word) myWords[word]++;`? If `word` doesn't exist, `map` will create it, init it to 0, and the `++` will set it to 1 lickety-split. I can't think of a good reason to perform the look-up twice. – user4581301 Apr 28 '16 at 01:07
  • @user4581301 does map initialize the int to 0? Doesn't it just call the constructor? Ints are not initialized to 0 by default. – Fantastic Mr Fox Apr 28 '16 at 03:15
  • By default, no, and there isn't really a constructor for `int` but constructor-like behaviour has to exist or all the standard containers would require custom template specialization for every basic datatype or crap out as soon as they had to expand `TYPE()`. Playing around, in `int test = int();`, `test` is 0. For `int test = int(1);` `test` is 1. Can't cite the standard--have to buy me a copy some day--but too many things would break (or be a real pain to implement) if this bit of syntactic sugar was left out. – user4581301 Apr 28 '16 at 18:29
  • @user4581301 Well lets call mine the safe version for now. I will have a look around and see if i can find an answer for this. – Fantastic Mr Fox Apr 28 '16 at 18:36
  • Best I found while looking for some authority to back me, or cite the section of the standard is : http://stackoverflow.com/questions/5113365/do-built-in-types-have-default-constructors . Doesn't cover the inner workings of `map`, though. – user4581301 Apr 28 '16 at 18:43

So I have been working on it, and I came up with the idea of reading the data in character by character with fscanf(file, "%c", &array[]); and I have been working on an algorithm that copies the characters before each '\n' into a smaller array, which is then compared against the full string. In the file I have padded the shorter words with spaces so that all the "word sizes" are the same. The arrays are filled successfully; I am just having some issues with the algorithm itself.

#include <stdio.h>
#include <cctype>
#define Words 9
#define Letters 6
#define CHARACT Words*Letters
#define CORRECT 3

int main(void){

double xx;
int match=0, indx=0, i=0, k=0, j=0, b, read=0, r, t, track=0;
char shrt[Letters],  Long[CHARACT], c;


FILE *file;
file =fopen("C:\\Users\\Andrew\\Documents\\te2.txt", "r");
if(file != NULL)
{
    // initialize long string (full file by characters)
    for(k=0; k<CHARACT; k++)
        {
        fscanf(file, "%c", &Long[k]);
        }

    // set up shrt[] to compare to rest of array
    while(track <= CHARACT)
    {
    indx+=track;
        //STEP 1: take shrt[] out of Long[] for comparisons
        for(r=0; r<Letters; r++)
            {
                shrt[r] = Long[indx];
                indx++;
                if(shrt[r] == '\n')
                    break;
            }

        for(t=0; t<CHARACT; t++)
        {
            match=0;
            int VV = track*Letters;
                // STEP 2: keep shrt[] constant, compare to full string one by one 
                for(int p=0; p<Letters-2; p++)
                {
                    if(shrt[p] == Long[VV])
                        {
                            match++;
                        }
                }
            if(match >= CORRECT)
            {
                for(int ee=0; ee<Letters; ee++)
                {
                    printf("%c", shrt[ee]);
                }
                printf("  %i", match);
            }
            track+=Letters;
        }

    }// big while loop




    // TESTING TESTING
    printf("\n\n\nTEST\n\n");
    for(k=0; k<CHARACT; k++)                
        printf("%c", Long[k]);              // delete when ready
    printf("%i", read);

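    fclose(file);        // close only when the file was opened successfully
}// big if
else{
    printf("error");
}
    scanf("%lf", &xx);   // %lf matches double; waits for input before exiting
return 0;
}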