1

I am writing a program that takes a text file as input, builds an index of its words, and prints the index both to a file and to the screen.

The input file may be huge, but we know that it contains at most 200 distinct words. We don't know the maximum number of lines or the maximum length of a word, so I have to reserve generous limits for them: I assumed at most 1000 lines and at most 100 characters per word.

I am programming in Turbo C (I am forced to use it). The compiler gives me only 64 KB of memory (with the size of the compiler included), so I have to use malloc.

My program is supposed to work like this: it reads the input file line by line with fgets, then splits the current line into words with strtok. At that point I have the xth word in the yth line. I want to store the words in an array of pointers, so I need a char * word[200]. I also want to record how many times each word occurs in each line, so I need an int index[200][1000]. If the xth word occurs in the yth line, I do index[x][y]++.

So now I need to allocate memory with malloc for char * word[200] and int index[200][1000]. Can anyone help? I tried all the answers to these questions and none of them helped.

Melika Barzegaran
  • As to the error, move the declaration out of the loop or move it inside a block. Also, [do **NOT** cast the return value of `malloc()`](http://stackoverflow.com/questions/605845/do-i-cast-the-result-of-malloc/605858#605858). –  Dec 02 '13 at 18:24
  • It's quite obvious that it doesn't work, and of course I know that. I just wrote it so that readers understand what I mean. – Melika Barzegaran Dec 02 '13 at 18:25

3 Answers

2

You don't quite have malloc right. Your malloc(100) is only allocating 100 bytes. You need

words[i] = malloc(sizeof(char *) * 100);

This allocates room for 100 pointers. With 8-byte pointers that is 800 bytes; under Turbo C a pointer is 2 or 4 bytes depending on the memory model, so the total is 200 or 400 bytes.

Similarly, in the second malloc, if you want two integers, you need

index[i] = malloc(sizeof(int) * 2);

You shouldn't cast the return value. malloc returns a void pointer, which is implicitly converted to whatever pointer type you need just by virtue of the assignment.

http://www.cplusplus.com/reference/cstdlib/malloc/
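A minimal sketch of the no-cast form, assuming `<stdlib.h>` is included so that malloc is properly declared. The helper name `alloc_word` is hypothetical and only serves the illustration. If the header is missing, an older compiler may assume malloc returns int, and a cast would merely hide the resulting warning.

```c
#include <assert.h>
#include <stdlib.h>   /* declares malloc and free */
#include <string.h>   /* memset */

/* Hypothetical helper: allocate a zero-filled buffer that can hold
   one word of up to max_chars characters plus the terminating '\0'. */
char *alloc_word(size_t max_chars)
{
    char *buf;

    assert(max_chars > 0);
    /* No cast: void * converts implicitly to char * in C.
       sizeof *buf keeps the element size tied to the pointer type. */
    buf = malloc((max_chars + 1) * sizeof *buf);
    if (buf != NULL)
        memset(buf, 0, max_chars + 1);
    return buf;   /* NULL if the allocation failed */
}
```

With the question's limit, `alloc_word(100)` requests 101 bytes, and the caller should check the result for NULL before using it.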

FURTHERMORE:

You're trying to stuff 2 bytes into an integer pointer, or 4 bytes (100 - 96 = 4; the 96 is 8 * 12) into a character pointer. I have no idea what that will do. The best you can hope for is that you just lose the memory somewhere and effectively end up with 12 character pointers and 2 memory leaks.

ciphermagi
1

If I understand you correctly:

> In the first loop, I want to define an array of 200 pointers that each pointer, points to an array of char blocks. I want each pointer, points to an array of maximum 100 bytes. Meaning 100 char blocks

char **words = NULL;
int i;
words = malloc(sizeof(char*) * 200);
for(i = 0; i < 200; i++) {
    words[i] = malloc(100);
}

This allocates 200 words of 100 bytes each.

> In the second loop, I want to define a 2D array of int blocks that each block is maximum 2 bytes. meaning 200 * 1000 int blocks.

int **index = NULL;
int i;

index = malloc(sizeof(int*) * 200);
for (i = 0; i < 200; i++) {
    index[i] = malloc(sizeof(int) * 1000);
}

Here you allocate a 200 x 1000 array of int.
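The row-by-row allocation can be wrapped with failure checks, which matter here: with the 2-byte int of Turbo C, each row of 1000 counters takes 2000 bytes and the whole table about 400,000 bytes, so some of the malloc calls may well return NULL depending on the memory model. The following is a sketch under those assumptions; the names `alloc_counts`, `free_counts`, `MAX_WORDS` and `MAX_LINES` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_WORDS 200    /* from the question: at most 200 distinct words */
#define MAX_LINES 1000   /* the question's assumed line limit */

/* Allocate a rows x cols table of int with every counter set to 0.
   Returns NULL, with nothing left allocated, if any request fails. */
int **alloc_counts(int rows, int cols)
{
    int **t;
    int i;

    assert(rows > 0 && cols > 0);
    t = malloc(rows * sizeof *t);
    if (t == NULL)
        return NULL;
    for (i = 0; i < rows; i++) {
        t[i] = calloc(cols, sizeof **t);   /* calloc zeroes the row */
        if (t[i] == NULL) {
            while (i-- > 0)                /* undo the rows already allocated */
                free(t[i]);
            free(t);
            return NULL;
        }
    }
    return t;
}

/* Release a table obtained from alloc_counts. */
void free_counts(int **t, int rows)
{
    int i;

    if (t == NULL)
        return;
    for (i = 0; i < rows; i++)
        free(t[i]);
    free(t);
}
```

A caller would write `int **index = alloc_counts(MAX_WORDS, MAX_LINES);`, check the result for NULL, use `index[x][y]++` as in the question, and finish with `free_counts(index, MAX_WORDS);`.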

Deck
  • Yes, `index` is a 2D array. You can access any element (of type `int`) with `index[i][j]`. – Deck Dec 02 '13 at 18:38
  • I want to use the word array and the 2D index array at the same time. Both of them are set to NULL in the first lines. Would that cause trouble? – Melika Barzegaran Dec 02 '13 at 18:49
  • @MelikaBarzegaranHosseini Do you want to use an array of "words" with size 200x100 (200 words of up to 100 characters each) and a 2D array of ints (200 x 1000 ints) so that each `int` points to a character of the first array? – Deck Dec 02 '13 at 19:09
  • No, they are not related. I am writing a program that takes a file as input, creates its index, and prints the output to the screen and to an output file. I need the words array to hold the words of the input file, and I need the 2D index array to hold how many times each word is repeated in each line: 200 words * 1000 sentences. Each cell of this 2D array shows how many times this word is repeated in this sentence. – Melika Barzegaran Dec 02 '13 at 19:14
  • You seem not to store sentences. Do you want to output "word XYZ is cited 2 times in sentence 42 and 1 time in sentence 512"? – manuell Dec 02 '13 at 19:20
  • What? No, I don't store sentences. I read line by line from the file with fgets, then read word by word in that line with strtok and create the index. – Melika Barzegaran Dec 02 '13 at 19:30
0
char * words = malloc(200*100);
int * index = malloc(200*1000*sizeof(int));

// words[i*100+j] : character j in word i (100 chars per word)
// index[i*1000+j] : int at index i,j (1000 ints per word)
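The flat layout can be hidden behind a small accessor so that the row stride is written in one place only. This is a sketch with hypothetical names (`count_at`, `MAX_WORDS`, `MAX_LINES`). Note that where size_t is 16 bits, as it likely is under Turbo C, a single 400,000-byte block cannot be requested through malloc at all, so the sketch illustrates the indexing arithmetic rather than a layout known to fit that compiler.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_WORDS 200    /* rows: one per distinct word */
#define MAX_LINES 1000   /* columns: one per line, also the row stride */

/* Return a pointer to element (word, line) of a flat
   MAX_WORDS x MAX_LINES table stored row by row. */
int *count_at(int *table, int word, int line)
{
    assert(table != NULL);
    assert(word >= 0 && word < MAX_WORDS);
    assert(line >= 0 && line < MAX_LINES);
    /* The cast to long keeps word * MAX_LINES from overflowing
       where int is only 16 bits wide. */
    return &table[(long)word * MAX_LINES + line];
}
```

Incrementing a counter then reads `(*count_at(index, x, y))++`, which plays the role of `index[x][y]++` in the 2D version.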

alternatives:

// mallocing an array for storing a maximum of 200 malloced words
char ** words = malloc(200*sizeof(char*));
// adding a new word, at index i, which is pointed to by pszNewWord (null terminated)
words[i] = strdup(pszNewWord); 
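strdup is not part of ISO C before C23, although many compilers provide it. Where it is unavailable, an equivalent can be sketched in a few lines; the name `dup_word` is hypothetical. It allocates exactly strlen + 1 bytes, so short words do not consume a fixed 100-byte slot.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup: copy the null-terminated string s
   into freshly allocated memory. Returns NULL if malloc fails.
   The caller owns the result and must free() it. */
char *dup_word(const char *s)
{
    char *copy;

    assert(s != NULL);
    copy = malloc(strlen(s) + 1);   /* +1 for the terminating '\0' */
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}
```

The line above would then become `words[i] = dup_word(pszNewWord);`, followed by a NULL check.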
manuell
  • So with code like the above, we can't access index as a 2D array? – Melika Barzegaran Dec 02 '13 at 18:53
  • You know, it doesn't help, as I don't know how many characters each word contains. I only know the maximum is 200. – Melika Barzegaran Dec 02 '13 at 18:56
  • No, we can't, but we can access a given int with i and j. Things will be much simpler without malloc. Why do you need malloc? – manuell Dec 02 '13 at 18:57
  • 200 is the word count, 100 is the maximum character count for one word. What is the problem? – manuell Dec 02 '13 at 18:59
  • I am writing a program that takes a file as input, creates its index, and prints it to a file as output. I know that there are at most 200 distinct words, but I don't know how long the words are or how long the sentences are. That's why I need to use malloc. – Melika Barzegaran Dec 02 '13 at 19:05
  • Note that my compiler is Turbo C. I am forced to use it, and the memory it provides is not enough to define even a 200 * 1000 array of char blocks. – Melika Barzegaran Dec 02 '13 at 19:07
  • @MelikaBarzegaranHosseini And what do you do with 1000 ints per word? – manuell Dec 02 '13 at 19:13
  • I am writing a program that takes a file as input, creates its index, and prints the output to the screen and to an output file. I need the words array to hold the words of the input file, and I need the 2D index array to hold how many times each word is repeated in each line: 200 words * 1000 sentences. Each cell of this 2D array shows how many times this word is repeated in this sentence. – Melika Barzegaran Dec 02 '13 at 19:17
  • Each of them is related to a sentence. If index[1][2] = 3, it means that word 1 has been repeated 3 times in sentence 2. – Melika Barzegaran Dec 02 '13 at 19:19
  • 3 times in THAT sentence or 3 times in the whole document? – manuell Dec 02 '13 at 19:21
  • Can you explain words[i] = strdup(pszNewWord); in more detail? – Melika Barzegaran Dec 02 '13 at 19:26
  • Last questions: 1) Why do you talk about "line" above? 2) Do you really need to remember a sentence only by its index? (And are you sure that your Turbo C will let you malloc 200*2 KB for index?) – manuell Dec 02 '13 at 19:29
  • strdup is like malloc, but it automatically calculates the necessary length and does the strcpy for you. – manuell Dec 02 '13 at 19:31
  • Sorry, I must go. See you tomorrow here. Leave a comment with some GMT time. Bye. – manuell Dec 02 '13 at 19:33
  • 1. See, there is a file. I read it line by line with fgets. For each line, I read word by word with strtok, so now I know the words. In the 2D index array I find the place of the word. If it already exists in the word array at position last, I just do index[last][sentence]++. If not, I shift the word array, do last++ and index[last][sentence]++. – Melika Barzegaran Dec 02 '13 at 19:36
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/42395/discussion-between-manuell-and-melika-barzegaran-hosseini) – manuell Dec 03 '13 at 10:03
  • I cast the values and it worked. Thanks, really. SO really needs people like you who are so eager to care about other people's problems. :) – Melika Barzegaran Dec 03 '13 at 14:11
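The indexing step described in the comments (split each line with strtok, look the word up in the word table, add it if it is new, then increment the counter for that word and line) can be sketched as follows. This is an illustration under stated assumptions, not the asker's final program: the names `find_or_add` and `index_line` are hypothetical, words are separated by whitespace only, and a new word is appended at the end of the table rather than shifted in.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 200   /* from the question: at most 200 distinct words */

/* Return the slot of `word` in `words`, adding a copy if it is absent.
   Returns -1 if the table is full or memory runs out. */
int find_or_add(char **words, int *nwords, const char *word)
{
    int i;
    char *copy;

    for (i = 0; i < *nwords; i++)
        if (strcmp(words[i], word) == 0)
            return i;                     /* already known */
    if (*nwords >= MAX_WORDS)
        return -1;
    copy = malloc(strlen(word) + 1);      /* exact size, as strdup would do */
    if (copy == NULL)
        return -1;
    strcpy(copy, word);
    words[*nwords] = copy;
    return (*nwords)++;                   /* slot of the new word */
}

/* Tokenise `line` in place (strtok modifies it) and increment
   counts[slot][lineno] once for every word found. */
void index_line(char *line, int lineno, char **words, int *nwords, int **counts)
{
    static const char delims[] = " \t\r\n";   /* assumption: whitespace only */
    char *tok;
    int slot;

    assert(line != NULL);
    for (tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims)) {
        slot = find_or_add(words, nwords, tok);
        if (slot >= 0)
            counts[slot][lineno]++;
    }
}
```

A driver would read each line with fgets into a buffer, call `index_line(buffer, y, words, &nwords, index)` with the current line number y, and stop or report an error once y reaches the assumed limit of 1000 lines.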