I need to remove all occurrences of the most common word from a string in C.
If several words in the text occur the same number of times, the function should remove the one that appears closest to the beginning of the string. When removing words, the surrounding spaces and other characters must be kept. If the string does not contain any words, the function does not need to do anything.
A word is defined as a sequence of uppercase and lowercase letters. The function does not need to distinguish between uppercase and lowercase letters.
My algorithm is the following:
- find how many times the most common word appears in the string
- then go through the string word by word
- check whether the number of occurrences of the current word equals that of the most common word
- if so, remove that word
Code:
#include <stdio.h>
#include <limits.h>
#include <ctype.h>

int number_of_word_occurrence(char *s, char *start, char *end) {
    int number = 0;
    while (*s != '\0') {
        char *p = start;
        char *q = s;
        while (p != end) {
            if (*p != *q) break;
            p++;
            q++;
        }
        if (p == end) number++;
        s++;
    }
    return number;
}

int length(char *s) {
    char *p = s;
    int number = 0;
    while (*p != '\0') {
        p++;
        number++;
    }
    return number;
}

char *remove_most_common(char *s) {
    int n, max = INT_MIN;
    char *p = s;
    // Find max occurrence
    while (*p != '\0') {
        char *start = p;
        int word_found = 0;
        while (toupper(*p) >= 'A' && toupper(*p) <= 'Z' && *p != '\0') {
            word_found = 1;
            p++;
        }
        if (word_found) {
            n = number_of_word_occurrence(s, start, p);
            if (n > max) max = n;
        }
        p++;
    }
    p = s;
    int len = length(s);
    char *end = s + len;
    int i;
    // Removing most common word
    while (p != end) {
        char *start = p, *pp = p;
        int word_found = 0;
        while (toupper(*pp) >= 'A' && toupper(*pp) <= 'Z' && pp != end) {
            word_found = 1;
            pp++;
        }
        if (word_found) {
            n = number_of_word_occurrence(s, start, pp);
            // If word has max, then remove it
            if (n == max) {
                while (pp != end) {
                    *start = *pp;
                    start++;
                    pp++;
                }
                end -= max; // resize end of string
                len -= max;
            }
        }
        p++;
    }
    s[len += 2] = '\0';
    return s;
}

int main() {
    char s[1000] = "Koristio sam auto-stop da dodjem do znaka stop ali prije stopa sam otvorio dekstop kompjutera stop";
    printf("%s ", remove_most_common(s) );
    return 0;
}
- the word that should be removed is given in parentheses after each example
EXAMPLE 1: "Koristio sam auto-stop da dodjem do znaka stop ali prije stopa sam otvorio dekstop kompjutera stop" (removed word: stop)
OUTPUT: "Koristio sam auto- da dodjem do znaka ali prije stopa sam otvorio dekstop kompjutera "
EXAMPLE 2: " This is string. " (removed word: This)
OUTPUT: " is string. "
EXAMPLE 3: "1PsT1 psT2 3Pst pstpst pst" (removed word: pst, in any letter case)
OUTPUT: "11 2 3 pstpst "
EXAMPLE 4: "oneeebigggwwooorrrddd" (removed word: the whole string)
OUTPUT: ""
Could you help me fix my code? I get errors when removing characters. Also, how can I remove the word closest to the beginning when several words share the highest number of occurrences?
- Note: Functions from the string.h and stdlib.h libraries, as well as the sprintf and sscanf functions from the stdio.h library, are not allowed when solving the task. It is not allowed to create auxiliary strings in the function or globally.