
I am trying to write a program that reads a text file, counts each unique word, and then sorts the list of unique words and lists the number of occurrences of each word. However, I cannot seem to read a single word out of a string without also picking up letters, numbers, and symbols that don't belong to it. I've read other topics, but my logic is severely flawed in some way that I don't see.

int main()
{
 fstream fp;
 string line;

fp.open("syllabus.txt", ios::in);

getline(fp, line);

    string word = findWords(line);
    cout << word << endl;
}

string findWords(string &line)
{
int j = 0;
string word;

for(int i = 0; i < line.size(); i++)
{
    while(isalpha((unsigned char)line[j]) != 0 && isdigit((unsigned char)line[j]) != 1)
        j++;
    word += line.substr(0, j) + " + ";
    line  = line.substr(j, (line.size() - j));
}
return word;
}
skylerWithAnE

3 Answers

1
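  1. Your main reads only one line, but in the question you say you want to read the whole file.

  2. Why is findWords declared to take a reference to the string (string &line) when you simply pass it a string? As written, the function modifies the caller's line.

  3. The for condition i < line.size() is wrong: line shrinks inside the loop while j is never reset, so it is quite possible to run past the end of the string and crash (for example with a segmentation fault).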

Berke Cagkan Toptas
  • 1. If I could get one line functional, I'd read the file line-by-line after that. It's just some weird parameter to the assignment. C++ actually has been allowing me to read by checking index of the string which is likely never useful in anything else ever again. – skylerWithAnE Nov 13 '13 at 01:39
1

There are lots of things wrong with your chunk of code. For one, you don't want to change line while you iterate through it. As a rule, you shouldn't change what you're iterating over. Instead, you want a start index and an end index (which you find by searching).

Here's a trick for you: you can read a single whitespace-delimited word with the >> operator.

ifstream fp( "syllabus.txt" );
string word; 
vector<string> words;  

while (fp>> word)
    words.push_back(word);
Andreina
nykwil
1

This loop looks rather strange:

for(int i = 0; i < line.size(); i++)
{
    while(isalpha((unsigned char)line[j]) != 0 && isdigit((unsigned char)line[j]) != 1)
        j++;
    word += line.substr(0, j) + " + ";
    line  = line.substr(j, (line.size() - j));
}

Your "line" is being modified inside the loop but your "i" does not reset to the start of your new string when that happens. "i" is irrelevant in your loop anyway, it doesn't appear anywhere in it.

So why this loop?

As for the solution, there are multiple ways of doing it.

  • The simplest, if you want to loop, is to load the line into a string and then use string::find_first_not_of with a string containing all the alphabetic characters. That might not be the most efficient or even the most elegant approach. It returns a position: either std::string::npos if the rest of the string is entirely alphabetic, or the position of the first non-alphabetic character.

  • The next simplest is the standard std::find_if algorithm, which takes iterators and lets you supply your own predicate, which you can base on a character not being alphabetic. Using C++11 it is easy enough to write a lambda based on isalpha (either the old C version or an enhanced C++ version using locale if your strings may contain characters outside the regular character set). This returns an iterator: either the end() of the string or the position of the first non-alphabetic character.

CashCow