0

I am trying to count the number of sentences in a text, on the basis that each sentence ends with !, ? or ., but strcmp() doesn't work as I expected. If the text contains ! and I compare it with the character constant '!', the result is not 0 as I assumed it would be.

I tried printing the intermediate values to understand what led to this result, but I couldn't work it out. Can anyone help?

Thank you.

Here is my code:

#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>

int count_sentences(string text);

int main(void)
{
    string text = get_string("text: ");
    //printf("%s\n", text);
    count_sentences(text);
    //printf("%i\n", strcmp("!", "!"));
}

int count_sentences(string text)
{
    int string_length = strlen(text);
    int num_of_sentences = 0;
    const char sent_ind1 = '?';
    const char sent_ind2 = '!';
    const char sent_ind3 = '.';
    //printf("%c %c %c", sent_ind1, sent_ind2,
    //sent_ind3);
    for (int i = 0; i < string_length; i++)
    {
        int value1 = strcmp(&text[i], &sent_ind1);
        int value2 = strcmp(&text[i], &sent_ind2);
        int value3 = strcmp(&text[i], &sent_ind3);
        if (value1 == 0 || value2 == 0 || value3 == 0)
        {
            num_of_sentences += 1;
        }
        //printf("1- %i 2- %i 3- %i i- %c c- %c si0 %c si1 %c si2 %c\n",
        //value1, value2, value3, i, text[i], sent_ind1, sent_ind2,
        //sent_ind3);
        //printf("1- %i 2- %i 3- %i i- %i\n",
        //sent_ind1, sent_ind2, sent_ind3, text[i]);
    }
    //printf("string length equal %i and number of sentences equal %i.\n",
    //string_length, num_of_sentences);
    return num_of_sentences;
}
Jonathan Leffler

4 Answers

2

These statements

int value1 = strcmp(&text[i], &sent_ind1);
int value2 = strcmp(&text[i], &sent_ind2);
int value3 = strcmp(&text[i], &sent_ind3);

do not make sense. For starters, the second arguments of the calls to strcmp do not point to strings: each one points to a single char that is not followed by a terminating '\0'.

Secondly, even if they did point to strings, the calls would return 0 in only one case: when '!', '?' or '.' is the last character of the string text. strcmp compares everything from &text[i] up to the end of text, not just the one character at index i.

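Instead of strcmp, use the functions strcspn and strspn. strcspn skips ahead to the next terminator, and strspn skips over a run of consecutive terminators so that "!!!" is counted once.

For example, the function can be written as follows: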

#include <stdio.h>
#include <string.h>

size_t count_sentences( const char *text )
{
    size_t n = 0;
    const char *end_of_sentence = "!?.";

    while ( ( text += strcspn( text, end_of_sentence ) ), *text != '\0' )
    {
        ++n;

        text += strspn( text, end_of_sentence );
    } 

    return n;
}

int main( void ) 
{
    const char *text = "Do you know C string functions? "
                       "Learn them!!! "
                       "They are useful.";

    printf( "%zu\n", count_sentences( text ) );                     
}

The program output is

3
Vlad from Moscow
2

If you simply want to count the number of '!', '?' and '.' characters in the string, you need to compare characters, not strings.

size_t count_sentences(string text)
{
    size_t nos = 0;
    size_t pos = 0;
    while(text[pos])
    {
        if(text[pos] == '!' || text[pos] == '?' || text[pos] == '.') nos++;
        pos++;
    }
    return nos;
}

strcmp compares whole strings; it does not look for a substring inside a string. In your case, the second argument is not a string at all, only a pointer to a single char with no terminating '\0' after it, so it is not a valid string. Passing it to strcmp is undefined behaviour (UB).

0___________
1

In addition to comparing a char with a char of the string properly, as answered elsewhere, consider a different way to count sentences.

How many sentences are in these two strings?

No end punctuation
Screaming text!!!  What???

To get 1 and 2 rather than 0 and 6, use ".?!" only to enable an increment, and perform the increment the next time a letter is seen.


#include <ctype.h>   // isalpha()
#include <stddef.h>  // size_t
#include <string.h>  // strchr()

size_t count_sentences1(const char *text) {
  // Best to use unsigned char for is...()
  const unsigned char *utext = (const unsigned char *) text; 
  size_t num_of_sentences = 0;
  int start_of_sentence = 1;

  while (*utext) {
    if (isalpha(*utext)) {
      num_of_sentences += start_of_sentence;
      start_of_sentence = 0;
    } else if (strchr(".?!", *utext)) {
      start_of_sentence = 1;
    }
    utext++;
  } 
  return num_of_sentences;
}
chux - Reinstate Monica
0

There are some 'clever' answers regarding "counting punctuation marks." Sadly, simply counting each mark would give an incorrect count when a sentence ends with an ellipsis ("...") or what some refer to as an "interrobang" ("?!").

Without your CS50 library to test with, I've used my own get_string(), which returns the string complete with its trailing newline. Chopping off the newline is 'optional' and needs to be adapted for your version of get_string().

// #include <cs50.h> // Don't have
#include <stdio.h>
#include <string.h>

int main() {
    char *foo = get_string( "Enter several sentences: " );

    foo[strlen(foo)-1] = '\0'; // Chop '\n' off if required

    int count = 0;
    while( strtok( foo, "?.!" ) )
        count++, foo = NULL;

    printf( "Number of sentences: %i.\n", count );

    return 0;
}

Output:

Enter several sentences: Does. this! fulfill the? requirement????
Number of sentences: 4.
Fe2O3
  • 1
    `foo[strlen(foo)-1] = '\0'; // Chop '\n' off if required` [Ugh, no.](https://stackoverflow.com/questions/2693776/removing-trailing-newline-character-from-fgets-input) What if `foo` is `""` (empty, zero-length string)? – Andrew Henle Aug 22 '22 at 00:21
  • @AndrewHenle You are right, of course. AND, passing NULL to `strtok()` would also invoke UB... This code was for illustration purposes only and not meant to be the final solution. (I'm curious how the CS50 library handles allocation of its buffers without memory leaking... The world is an interesting place.) :-) – Fe2O3 Aug 22 '22 at 00:27
  • @AndrewHenle btw: my `get_string` implementation ALWAYS returns at least a "\n".) :-) – Fe2O3 Aug 22 '22 at 00:33