0

I have to show each word with its count, but repeated words show up again in my program's output.

How do I remove the duplicates using loops, or should I use 2D arrays to store both the word and its count?

#include <iostream>
#include <stdio.h>
#include <iomanip>
#include <cstring>
#include <conio.h>
#include <time.h>
using namespace std;

char* getstring();
void xyz(char*);
void tokenizing(char*);

int main()
{
    char* pa = getstring();
    xyz(pa);
    tokenizing(pa);

    _getch();
}

char* getstring()
{
    static char pa[1000]; // must be at least as large as the limit passed to getline below
    cout << "Enter a paragraph: " << endl;
    cin.getline(pa, 1000, '#');

    return pa;
}
void xyz(char* pa)
{
    cout << pa << endl;
}
void tokenizing(char* pa)
{
    char sepa[] = " ,.\n\t";
    char* token;
    char* nexttoken;
    int size = strlen(pa);
    token = strtok_s(pa, sepa, &nexttoken);
    while (token != NULL) {
        int wordcount = 0;
        if (token != NULL) {
            int sizex = strlen(token);
            //char** fin;
            int j;
            for (int i = 0; i <= size; i++) {
                for (j = 0; j < sizex; j++) {
                    if (pa[i + j] != token[j]) {
                        break;
                    }
                }
                if (j == sizex) {
                    wordcount++;
                }
            }
            //for (int w = 0; w < size; w++)
            //fin[w] =  token;
            //cout << fin[w];

            cout << token;
            cout << " " << wordcount << "\n";
        }
        token = strtok_s(NULL, sepa, &nexttoken);
    }
}

This is the output I get:

[screenshot of the program output, not included]

I want to show, for example, the word "i" once with its count of 5, and then not show it again.

dspencer
  • Are you sure you mean to be writing a C++ program? If so, why are you using all those C string functions? This would be a couple of lines in C++. – cigien Jan 05 '21 at 14:04
  • Right. First, you have to make sure that you're actually writing C++ code, where this becomes mostly a nothing-burger that uses an associative container and a few lines of parsing code. – Sam Varshavchik Jan 05 '21 at 14:06
  • @cigien I don't know; we are learning it this way at university. We have only reached the topic of strings so far, and have done pointers and functions. – Salman Qurban Jan 05 '21 at 14:20
  • I'm very sorry to hear that that's the order you're being taught things. `std::string` is a really easy and very useful tool to learn. Pointers are relatively advanced and are used far less often in modern code. – cigien Jan 05 '21 at 14:27
  • Your comparison will be wrong for an input text like 'aa aaaa aaaaaa aaa aaa#'. It will count 'aa' 13 times. – ytlu Jan 05 '21 at 15:10
  • Therefore, you might want to save all tokens into an array of strings, then compare the entries of this array with 'strcmp'. – ytlu Jan 05 '21 at 15:49

3 Answers

1

First of all, since you are using C++, I would recommend splitting the text the C++ way (some examples are here) and storing every word in a map or unordered_map. An example of my implementation can be found here.
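
As a rough illustration of the map-based approach suggested above, the sketch below splits the text on the same separators the question uses and counts each word in a std::map. The function name countWords and its separators parameter are assumptions made for this example; they are not taken from the linked code.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper (name and signature are assumptions for this sketch):
// split `text` on the given separators and count how often each word occurs.
std::map<std::string, int> countWords(const std::string& text,
                                      const std::string& separators = " ,.\n\t")
{
    std::map<std::string, int> counts;
    // Skip leading separators to find the start of the first word.
    std::size_t begin = text.find_first_not_of(separators);
    while (begin != std::string::npos) {
        // The word ends at the next separator, or at the end of the text.
        std::size_t end = text.find_first_of(separators, begin);
        ++counts[text.substr(begin, end - begin)];
        // Move on to the start of the next word, if there is one.
        begin = text.find_first_not_of(separators, end);
    }
    return counts;
}
```

Iterating over the returned map then prints every word exactly once, for example with `for (const auto& entry : countWords(text)) std::cout << entry.first << ' ' << entry.second << '\n';`. A std::map lists the words in sorted order; std::unordered_map would work the same way without any guaranteed order.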

But if you don't want to rewrite your code, you can simply add a variable that records whether a copy of the word appears earlier in the text, before the current word's position. If no copy was found in front of it, print the word.

Deumaudit
0

I read your last comment, but I am very sorry, I do not know C. So I will answer with the standard C++ approach. That is usually only 10 lines of code:

#include <iostream>
#include <algorithm>
#include <map>
#include <string>
#include <regex>

// Regex Helpers
// Regex to find a word
static const std::regex reWord{ R"(\w+)" };
// Result of search for one word in the string
static std::smatch smWord;

int main() {
    std::cout << "\nPlease enter text: \n";
    if (std::string line; std::getline(std::cin, line)) {

        // Words and its appearance count
        std::map<std::string, int> words{};

        // Count the words
        for (std::string s{ line }; std::regex_search(s, smWord, reWord); s = smWord.suffix())
            words[smWord[0]]++;

        // Show result
        for (const auto& [word, count] : words) std::cout << word << "\t\t--> " << count << '\n';
    }
    return 0;
}
cigien
A M
  • Your answer looks useful to me. Importantly, answers are meant to be useful to future visitors as well, not just the OP. – cigien Jan 05 '21 at 14:37
0

This post gives an example that saves each word from your 'strtok' function into a vector of strings. Then string.compare is used to compare each word with word[0]. The indexes of the words that match word[0] are marked in an int array 'used', so the count for that word equals the number of marks in the array ('nused'). The marked words are then removed from the vector, and the remaining words carry on to the next round of comparison. The program ends when no words remain.

If you prefer not to use std::vector and std::string, you may write your own word-comparing function to replace 'str.compare(str2)'.

#include <iostream>
#include <string>
#include <vector>
#include<iomanip>
#include<cstring>
 using namespace std;
      
 char* getstring();
 void xyz(char*);
 void tokenizing(char*);
 
 int main()
 {
    char* pa = getstring();
    xyz(pa);
    tokenizing(pa);
 }

 
char* getstring()
{
   static char pa[100] = "this is a test and is a test and is test.";
   return pa;
}
void xyz(char* pa)
{
  cout << pa << endl;
}
void tokenizing(char* pa)
{
   char sepa[] = " ,.\n\t";
   char* token;
   char* nexttoken;
   std::vector<std::string> word;
   int used[64];   // indexes of the words equal to word[0] (assumes at most 64 matches)
   std::string tok;
   int nword = 0, nsize, nused;

   // Save every token into the vector.
   token = strtok_s(pa, sepa, &nexttoken);
   while (token)
   {
      word.push_back(token);
      ++nword;
      token = strtok_s(NULL, sepa, &nexttoken);
   }
   for (int i = 0; i<nword; i++) std::cout << word[i] << std::endl;
   std::cout << "total " << nword << " words.\n" << std::endl;

   // Count word[0], remove all of its copies, and repeat until no word remains.
   nsize = nword;
   while (nsize > 0)
   {
       nused = 0;
       tok = word[0];
       used[nused++] = 0;
       for (int i=1; i<nsize; i++)
       {
           if ( tok.compare(word[i]) == 0 )
           {
              used[nused++] = i;
           }
       }
       std::cout  << tok << " : " << nused << std::endl;
       // Remove the marked words, last index first, by shifting the rest down.
       for (int i=nused-1; i>=0; --i)
       {
          for (int j=used[i]; j<(nsize+i-nused); j++) word[j] = word[j+1];
       }
       nsize -= nused;
   }
}

Notice that the used words have to be removed in backward order. If you remove them in sequential order, the marked indexes in the 'used' array would need to be adjusted after each removal. A test run:

$ ./a.out
this is a test and is a test and is test.
this
is
a
test
and
is
a
test
and
is
test
total 11 words.

this : 1
is : 3
a : 2
test : 3
and : 2
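
For the option mentioned above of replacing 'str.compare(str2)' with your own function, a minimal sketch might look like the following. The name sameWord is an assumption made for this example; it compares two C strings character by character and only reports a match when both end at the same position.

```cpp
// Hypothetical helper (the name is an assumption for this sketch):
// returns true when the two C strings hold exactly the same word.
bool sameWord(const char* a, const char* b)
{
    // Advance while the first string continues and the characters agree.
    while (*a != '\0' && *a == *b) {
        ++a;
        ++b;
    }
    // Equal only if both strings ended at the same position.
    return *a == *b;
}
```

With such a function, `tok.compare(word[i]) == 0` would become `sameWord(tok, word[i])` once the words are stored as C strings. Since it requires both strings to end together, 'aa' does not match 'aaaa'.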
 
ytlu