
I'm writing a program to count the length of each word in an array of characters. I was wondering if you guys could help me, because I've been struggling with it for at least two hours now and I don't know how to do it properly. The output should look like this:

(number of letters) - (number of words with this many letters)
2 - 1
3 - 4
5 - 1
etc.

char tab[1000];
int k = 0, x = 0;

printf("Enter text: ");
fgets(tab, 1000, stdin);

for (int i = 2; i < (int)strlen(tab); i++)
{


    for (int j = 0; j < (int)strlen(tab); j++)
    {
        if (tab[j] == '\0' || tab[j]=='\n')
            break;
        if (tab[j] == ' ')
            k = 0;
        else k++;

        if (k == i)
        {
            x++;
            k = 0;
        }
    }
    if (x != 0)
    {
        printf("%d - %d\n", i, x);
        x = 0;
        k = 0;
    }

}



return 0;
Swordfish
Kamil Lubecki
  • Are you trying to count the number of letters in each word or the average number of letters in all words? – Fiddling Bits Nov 21 '18 at 18:51
  • @FiddlingBits Judging by the desired output, it's supposed to be number of words with N letters, where N ascends from 1. – WhozCraig Nov 21 '18 at 18:58
  • @WhozCraig Thank you for deciphering it. :-D – Fiddling Bits Nov 21 '18 at 19:01
  • Hi, I am trying to find how many words in the text given have for example 2 letters and so on. I seriously don't know how to do it – Kamil Lubecki Nov 21 '18 at 19:02
  • This is an algorithm problem. It should require *one* forward scan of the string, broken into loops that skip white space, then scan non-white space to gather the length of the next word. When the length is known, update a count table (initially filled with 0's) that is indexed by that length, and repeat the entire process. When done, you'll have a count table where all non-zero values indicate the number of words of the index length. Anyway, that's the algorithm to shoot for. – WhozCraig Nov 21 '18 at 19:14
  • _Side note:_ `strlen` has to scan the entire string each time it is invoked. This is very slow. The length of `tab` doesn't change, so we can cache the length and call `strlen` only once. After the `fgets`, do: `int tablen = strlen(tab);` Then, change the `(int)strlen(tab)` calls in the `for` loops to be `tablen`. This will speed things up considerably. – Craig Estey Nov 21 '18 at 19:24

1 Answer

By using two nested for loops, you're doing roughly len**2 character scans. For example, for a buffer of length 1000, instead of 1000 character comparisons, you're doing 1,000,000 comparisons.

This can be done in a single for loop if we use a word length histogram array.

The basic algorithm is the same as your inner loop.

When we have a non-space character, we increment a current length value. When we see a space, we increment the histogram cell (indexed by the length value) by 1. We then set the length value to 0.

Here's some code that works:

#include <stdio.h>

int
main(void)
{
    // one cell per possible word length: buf holds at most 999 characters
    int hist[1000] = { 0 };
    char buf[1000];
    char *bp;
    int chr;
    int curlen = 0;

    printf("Enter text: ");
    fflush(stdout);

    // stop if nothing could be read -- buf would be uninitialized
    if (fgets(buf,sizeof(buf),stdin) == NULL)
        return 1;
    bp = buf;

    for (chr = *bp++;  chr != 0;  chr = *bp++) {
        if (chr == '\n')
            break;

        // end of word -- increment the histogram cell
        if (chr == ' ') {
            hist[curlen] += 1;
            curlen = 0;
        }

        // got a non-space char -- increment the length of the word
        else
            curlen += 1;
    }

    // catch the final word on the line
    hist[curlen] += 1;

    for (curlen = 1;  curlen < (int) (sizeof(hist) / sizeof(hist[0]));  ++curlen) {
        int count = hist[curlen];
        if (count > 0)
            printf("%d - %d\n",curlen,count);
    }

    return 0;
}
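
The program above treats a single space as the only word separator. As a rough sketch of the skip-white-space idea from the comments, the counting could be moved into a function that treats any white space (tabs, repeated spaces, the trailing newline) as a separator and ignores words that don't fit in the table. The name count_word_lengths and its parameters are made up for this illustration:

```c
#include <ctype.h>

/*
 * Hypothetical helper (not part of the program above): hist[n] receives
 * the number of words in text that have n characters. Words longer than
 * hist_size - 1 are ignored, so the table can never overflow.
 */
void
count_word_lengths(const char *text, int hist[], int hist_size)
{
    int curlen = 0;

    for (int idx = 0;  ;  ++idx) {
        // isspace needs a value in the unsigned char range
        unsigned char chr = (unsigned char) text[idx];

        // white space or end of string -- close the current word (if any)
        if (chr == 0 || isspace(chr)) {
            if (curlen > 0 && curlen < hist_size)
                hist[curlen] += 1;
            curlen = 0;
            if (chr == 0)
                break;
        }

        // any other char -- increment the length of the word
        else
            curlen += 1;
    }
}
```

Because the word is only counted when curlen is greater than 0, runs of white space never touch hist[0], and the final word is handled by the same branch as every other word.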

UPDATE:

"... and I don't really understand pointers. Is there any simpler method to do this?"

Pointers are an essential tool in the C arsenal, so I hope you get to them soon.

However, it is easy enough to convert the for loop to use an array index instead (and remove the char *bp; declaration and the bp = buf; assignment):

Change:

for (chr = *bp++;  chr != 0;  chr = *bp++) {

Into:

for (int bufidx = 0;  ;  ++bufidx) {
    chr = buf[bufidx];
    if (chr == 0)
        break;

The rest of the for loop remains the same.

Here's another version of the loop. Without optimization by the compiler, it fetches each char twice:

for (int bufidx = 0;  buf[bufidx] != 0;  ++bufidx) {
    chr = buf[bufidx];

Here is a single-line version. Note that this is not recommended practice because of the assignment to chr embedded in the loop condition; it is shown for illustration only:

for (int bufidx = 0;  (chr = buf[bufidx]) != 0;  ++bufidx) {
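
For reference, here is roughly how the pieces might fit together with no pointer arithmetic at all. This is only a sketch: the helper names fill_histogram and print_histogram are made up for this illustration, and the table size is passed in so that an overlong word cannot write past the end of hist:

```c
#include <stdio.h>

// Hypothetical helper: the counting loop from above, using an index only.
// hist must start out filled with 0's; hist_size is its number of cells.
void
fill_histogram(const char buf[], int hist[], int hist_size)
{
    int curlen = 0;

    for (int bufidx = 0;  ;  ++bufidx) {
        int chr = buf[bufidx];

        // end of text -- count the final word and stop
        if (chr == 0 || chr == '\n') {
            if (curlen < hist_size)
                hist[curlen] += 1;
            break;
        }

        // end of word -- increment the histogram cell
        if (chr == ' ') {
            if (curlen < hist_size)
                hist[curlen] += 1;
            curlen = 0;
        }

        // any other char -- increment the length of the word
        else
            curlen += 1;
    }
}

// Hypothetical helper: prints "length - count" for every non-empty cell.
// Cell 0 is skipped, as in the original program.
void
print_histogram(const int hist[], int hist_size)
{
    for (int len = 1;  len < hist_size;  ++len) {
        if (hist[len] > 0)
            printf("%d - %d\n", len, hist[len]);
    }
}
```

In main, after the fgets call, something like fill_histogram(buf, hist, 100); followed by print_histogram(hist, 100); would then replace the two loops, with hist declared as int hist[100] = { 0 };.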
Craig Estey
  • Thanks a lot, but I'm a complete beginner in C (and basically all of this), and I don't really understand pointers. Is there any simpler method to do this? I'm trying to learn topic by topic. Cheers :) – Kamil Lubecki Nov 21 '18 at 21:58
  • @KamilLubecki [What are the barriers to understanding pointers and what can be done to overcome them?](https://stackoverflow.com/q/5727/995714) – phuclv Nov 22 '18 at 01:40