8

I am working on an assignment where I am supposed to read a file and count its lines and its words at the same time. I tried a combination of getline and strtok inside a while loop, but it did not work.

file:example.txt (the file to be read).

Hi, hello what a pleasant surprise.
Welcome to this place.
May you have a pleasant stay here.
(3 lines, and some words).

Readfile.cpp

#include <iostream>
#include <fstream>
#include<string>
using namespace std;
int main()
{
  ifstream in("example.txt");
  int count = 0;

  if(!in)
  {
    cout << "Cannot open input file.\n";
    return 1;
  }

  char str[255];
  string tok;
  char * t2;

  while(in)
  {
    in.getline(str, 255);
    in>>tok;
    char *dup = strdup(tok.c_str());
    do 
    {
        t2 = strtok(dup," ");
    }while(t2 != NULL);
    cout<<t2<<endl;
    free (dup);
    count++;
  }
  in.close();
  cout<<count;
  return 0;
}
Erick Robertson
Rocco Lampone
  • You need to say more than "did not work". Tell us what error you get, or the SPECIFIC thing that your program does differently than you expect, then ask a specific question. We will not debug or rewrite your homework for you. – Blorgbeard Mar 16 '09 at 06:29
  • How about some of the examples from the following: http://www.codeproject.com/KB/recipes/Tokenizer.aspx They are very efficient and somewhat elegant. The String Toolkit Library makes complex string processing in C++ simple and easy. –  Dec 08 '10 at 05:26

6 Answers

5

Just got this right!! Just removed all unnecessary code.

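int main()
{
    ifstream in("example.txt");
    int LineCount = 0;
    const int BufSize = 500;
    char* str = new char[BufSize];

    // Testing getline itself avoids counting an extra line at end of file.
    while(in.getline(str, BufSize))
    {
        LineCount++;
        // Use the same delimiter set for the first and the following calls.
        char * tempPtr = strtok(str," ,.");
        while(tempPtr)
        {
            AddWord(tempPtr, LineCount); // helper not shown in this post
            tempPtr = strtok(NULL," ,.");
        }
    }
    in.close();
    delete [] str;
    cout<<"Total No of lines:"<<LineCount<<endl;
    showData(); // helper not shown in this post

    return 0;
}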

BTW the original problem statement was to create an index program that would accept a user file and create a line index of all words.

Yu Hao
Rocco Lampone
  • Please don't use strtok. It'll come back to bite you as soon as you need to write multi-threaded code. A good replacement with standard C++ is std::istringstream. – Tom Mar 18 '09 at 03:17
4

I have not tried compiling this, but here's an alternative that is nearly as simple as using Boost, but without the extra dependency.

#include <iostream>
#include <sstream>
#include <string>

int main() {
  std::string line;
  while (std::getline(std::cin, line)) {
    std::istringstream linestream(line);
    std::string word;
    while (linestream >> word) {
      std::cout << word << "\n";
    }
  }
  return 0;
}
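
As a rough sketch of how the same getline plus istringstream loop could count lines and words, as the question asks: the `Counts` struct and the `count_lines_and_words` function are hypothetical names made up for this illustration, and "word" is assumed to mean any whitespace-separated token.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper type holding the two totals.
struct Counts {
  std::size_t lines = 0;
  std::size_t words = 0;
};

// Each successful getline is one line; each whitespace-separated
// token extracted from that line is one word.
Counts count_lines_and_words(std::istream& in) {
  Counts c;
  std::string line;
  while (std::getline(in, line)) {
    ++c.lines;
    std::istringstream linestream(line);
    std::string word;
    while (linestream >> word) {
      ++c.words;
    }
  }
  return c;
}
```

Passing an `std::ifstream` opened on example.txt to this function should work the same way as passing `std::cin`. Punctuation stays attached to the words because the split is on whitespace only.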
Tom
0
ifstream is {"my_file_path"}; 
vector<string> b {istream_iterator<string>{is},istream_iterator<string>{}};

Don't forget to include this:

<iterator>
0

Try moving your cout<<t2<<endl; statement into your while loop.

That should make your code basically functional.

You may want to see this similar post for other approaches.

Community
Reed Copsey
0

There are examples like this posted all over the internet. Here is a count-words program I wrote back when I was in high school. Use it as a starting point. Other things I would like to point out are:

std::stringstream: read the entire line into a std::string with std::getline, then pass that string to a std::stringstream to split it into tokens.

Once again, this is only an example and won't do exactly what you want; you will need to modify it yourself.

#include <iostream>
#include <map>
#include <string>
#include <cmath>
#include <fstream>

// Global variables
        std::map<std::string, int> wordcount;
        unsigned int numcount;

void addEntry (std::string &entry) {
        wordcount[entry]++;
        numcount++;
        return;
}


void returnCount () {
        double percentage = numcount * 0.01;
        percentage = floor(percentage + 0.5f);

        std::map<std::string, int>::iterator Iter;

        for (Iter = wordcount.begin(); Iter != wordcount.end(); ++Iter) {
                if ((*Iter).second > percentage) {
                        std::cout << (*Iter).first << " used " << (*Iter).second << " times" << std::endl;
                }
        }

}

int main(int argc, char *argv[]) {
        if (argc != 2) {
                std::cerr << "Please call the program like follows: \n\t" << argv[0] 
                        << " <file name>" << std::endl;
                return 1;
        }

        std::string data;

        std::ifstream fileRead;
        fileRead.open(argv[1]);
        while (fileRead >> data) {
                addEntry(data);
        }
        std::cout << "Total words in this file: " << numcount << std::endl;
        std::cout << "Words that are 1% of the file: " << std::endl;
        returnCount();
}
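
The std::getline plus std::stringstream approach described above could be sketched as follows for a word-to-line-number index. The names `LineIndex` and `build_line_index` are hypothetical, and the choice of a `std::set<int>` per word (so a word repeated on one line is listed once) is an assumption, not something the thread specifies.

```cpp
#include <istream>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Maps each word to the sorted, unique line numbers it appears on.
using LineIndex = std::map<std::string, std::set<int>>;

// Hypothetical helper: reads the stream line by line, tokenises each
// line on whitespace, and records the 1-based line number per word.
LineIndex build_line_index(std::istream& in) {
  LineIndex index;
  std::string line;
  int line_no = 0;
  while (std::getline(in, line)) {
    ++line_no;
    std::stringstream tokens(line);
    std::string word;
    while (tokens >> word) {
      index[word].insert(line_no);
    }
  }
  return index;
}
```

Because tokens are split on whitespace only, "place." and "place" would be indexed as different words; stripping punctuation is left out of this sketch.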
X-Istence
  • Hello, Thanks, Blorgbeard, Reed and X-Istence for the prompt replies. I need to not just parse the line, but also need to keep track of the lineNos. The problem statement is to make a list of words with the line-nos they appear on. – Rocco Lampone Mar 16 '09 at 06:47
  • Ravi: In which case the code I just gave you will get you halfway there. We are not here to do your homework for you! – X-Istence Mar 16 '09 at 06:53
  • Oh no! That was not my intention. I am having trouble with just the first part. Once that is fixed I intend to do the rest on my own. – Rocco Lampone Mar 16 '09 at 06:59
0

If you can use the Boost libraries, I would suggest using boost::tokenizer:

The boost Tokenizer package provides a flexible and easy to use way to break a string or other character sequence into a series of tokens. Below is a simple example that will break up a phrase into words.

// simple_example_1.cpp
#include<iostream>
#include<boost/tokenizer.hpp>
#include<string>

int main(){
   using namespace std;
   using namespace boost;
   string s = "This is,  a test";
   tokenizer<> tok(s);
   for(tokenizer<>::iterator beg=tok.begin();beg!=tok.end();++beg){
       cout << *beg << "\n";
   }
}
Klaim