-2

I'm trying to practice by writing a small, short program to read in a text file and put its contents into an array. I wanted to write this program to read in any text file as a string and store it into an array. This way, you can read any file and regardless of the string length, it will build an array dynamically and fill it in with a file. I'm using this as an exercise to practice with C and hopefully extrapolate this to other types and structs.

However, for some reason, my first entry does not match resulting in unexpected behavior, but at least it doesn't run into any seg faults. I understand that with C, you need to essentially micro manage all of your memory, and working with the code, I tried to allocate memory for each entry, but is this the correct approach? I ran the code through my head, and it makes sense logically when starting with 0 entries, but I don't understand why the first entry fails while the remaining entries work.

Code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]){

    //Initialize variables and pointers
    //Create an array of chars to use when reading in
    //Create an array of strings to store
    //i : use to keep track of the number of strings in array
    //j : loop variable
    //size: size of string
    char *s = (char *) malloc(sizeof(char));
    int i=0,j=0;
    int size = 0;
    char **a = (char **) malloc(sizeof(char *));

    //Read in string, assign string to an address at a[]
    while( scanf("%79s",s) == 1){

        //Get the size of the input string
        size = (unsigned) strlen(s);

        //Print some notes here
        printf("\nString is \"%-14s\"\tSize is %-3d, i is currently %d\n",s,size,i);
        printf("Using MALLOC with %d bytes\n",size+1);

        //Allocate memory to copy string
        //
        //For some reason, the commented code works
        //a[i] = (char *) (malloc(sizeof(char)*(size+1)) + 'a');
        a[i] = (char *) (malloc(sizeof(char)*(size+1)) );

        //Go and allocate memory for each character in string to store
        for(j=0; j<(size+1); j++)   a[i][j] = (char) (malloc(sizeof(char)));

        //Print some more notes here
        printf("Size: a[%2d] is %3d bytes, *a[%2d] is %3d bytes, Length of a[%2d] is %d\n",i,(int) sizeof(a[i]),i,(int) sizeof(*a[i]),i,(unsigned) strlen(a[i]));

        //Copy over string and set last char to be end
        for(j=0; j<size; j++)       a[i][j] = (char) s[j];
        a[i][size] = '\0';

        //Print it out and increase i
        printf("a[%3d] is now %s\n",i,a[i]);

        i++;
    }
    printf("I is now %d\n\n\n",i);
    a[i] = NULL;

    //print out array
    for(j=0; j<i; j++)      printf("%3d. %-40s\n",j,a[j]);


    return 0;
}

Test Text File (numbers.txt):

1
22
333
4444
55555
666666
7777777
88888888
9999999
0000000000
11111111111
222222222

Command:

./a.out < numbers.txt

Results:

String is "1             "      Size is 1  , i is currently 0
Using MALLOC with 2 bytes
Size: a[ 0] is   8 bytes, *a[ 0] is   1 bytes, Length of a[ 0] is 2
a[  0] is now 1

String is "22            "      Size is 2  , i is currently 1
Using MALLOC with 3 bytes
Size: a[ 1] is   8 bytes, *a[ 1] is   1 bytes, Length of a[ 1] is 3
a[  1] is now 22

String is "333           "      Size is 3  , i is currently 2
Using MALLOC with 4 bytes
Size: a[ 2] is   8 bytes, *a[ 2] is   1 bytes, Length of a[ 2] is 4
a[  2] is now 333

String is "4444          "      Size is 4  , i is currently 3
Using MALLOC with 5 bytes
Size: a[ 3] is   8 bytes, *a[ 3] is   1 bytes, Length of a[ 3] is 5
a[  3] is now 4444

String is "55555         "      Size is 5  , i is currently 4
Using MALLOC with 6 bytes
Size: a[ 4] is   8 bytes, *a[ 4] is   1 bytes, Length of a[ 4] is 6
a[  4] is now 55555

String is "666666        "      Size is 6  , i is currently 5
Using MALLOC with 7 bytes
Size: a[ 5] is   8 bytes, *a[ 5] is   1 bytes, Length of a[ 5] is 7
a[  5] is now 666666

String is "7777777       "      Size is 7  , i is currently 6
Using MALLOC with 8 bytes
Size: a[ 6] is   8 bytes, *a[ 6] is   1 bytes, Length of a[ 6] is 8
a[  6] is now 7777777

String is "88888888      "      Size is 8  , i is currently 7
Using MALLOC with 9 bytes
Size: a[ 7] is   8 bytes, *a[ 7] is   1 bytes, Length of a[ 7] is 9
a[  7] is now 88888888

String is "9999999       "      Size is 7  , i is currently 8
Using MALLOC with 8 bytes
Size: a[ 8] is   8 bytes, *a[ 8] is   1 bytes, Length of a[ 8] is 8
a[  8] is now 9999999

String is "0000000000    "      Size is 10 , i is currently 9
Using MALLOC with 11 bytes
Size: a[ 9] is   8 bytes, *a[ 9] is   1 bytes, Length of a[ 9] is 11
a[  9] is now 0000000000

String is "11111111111   "      Size is 11 , i is currently 10
Using MALLOC with 12 bytes
Size: a[10] is   8 bytes, *a[10] is   1 bytes, Length of a[10] is 12
a[ 10] is now 11111111111

String is "222222222     "      Size is 9  , i is currently 11
Using MALLOC with 10 bytes
Size: a[11] is   8 bytes, *a[11] is   1 bytes, Length of a[11] is 10
a[ 11] is now 222222222
I is now 12


  0. ▒"▒
  1. 22
  2. 333
  3. 4444
  4. 55555
  5. 666666
  6. 7777777
  7. 88888888
  8. 9999999
  9. 0000000000
 10. 11111111111
 11. 222222222
  • 5
    Undefined behavior for using the value of an object with automatic storage duration while it is uninitialized. Undefined behavior for passing incorrect argument-type to a varargs-function. Undefined behavior for `fflush()`ing a non-output stream. – EOF Jul 08 '16 at 21:41
  • 3
    You need to do more than declare a pointer... you also need to allocate memory for the strings you read (and the array of pointers to them). You also may need to change the size of those allocations as you go if you don't know in advance how much space you'll need. – Dmitri Jul 08 '16 at 21:46
  • When you use `scanf` with strings, you don't need the `&`. Each element in your array `a` is a string, `a[0]` is the first `a[1]` the second, etc. When you print a string, you just use the _name_ of the string/char array (the pointer to the first element in the char array), so to print a string (`char *`) that is an element in array `a`, you would just do `printf("my 1st str is: %s\n", a[0]);`. Also, you'd never use `&` for printing a string (or a pointer-to-struct int, e.g.), you would instead use the `*` operator if necessary (but again, not with strings because it _expects_ the ptr itself). – RastaJedi Jul 08 '16 at 22:55
  • 1
    The basic Idea of what you implemented is correct. But, there are still some few issues like the main one being you didn't allocate memory to 'a'. Go throught this link [http://stackoverflow.com/questions/19068643/dynamic-memory-allocation-for-pointer-arrays](http://stackoverflow.com/questions/19068643/dynamic-memory-allocation-for-pointer-arrays) to learn how to allocate memory dynamically for array of pointers. Go through all the answers provided there. Also, try to use fgets() instead of scanf() for strings. – user4925 Jul 08 '16 at 23:44
  • "Is this because of how memory is being allocated?" Nowhere in this code is any memory being allocated. What did you think would happen when you declared a pointer, that C would just psychically know how much memory you wanted? – underscore_d Jul 09 '16 at 00:55

2 Answers2

1

For starters: your declaration char **a; declares a pointer; of the correct type, but pointing into nowhere.

Your current code is writing into some undefined region of the memory, and that sometimes works and sometimes dumps; but the effect of doing that is undefined and mostly unpredictable.

In C, you need to handle your own memory. So either, you declare the memory you want this to pointer to point to right with it: char a[255][10]; (for 255 chars in each string and 10 of them), or you allocate it on the heap (with malloc) and assign it to the pointer: a = (char * *) malloc(sizeof(char *) * 10); (again to get 10), and then for each of them: a[i] = (char *) malloc(sizeof(char) * 255);.

The latter allows you to use variables for the sizes, if you want, and also allows much larger memory chunks that the first way.

Once you understand this, apply it to the other variables you want to use. Remember, each pointer you declare is just that, a pointer. It is your job to make it point to something useful, and to provide memory at that location.

Aganju
  • 6,295
  • 1
  • 12
  • 23
  • `you declare the memory you want this to pointer to point to right with it: char a[10][255]` - no, you're not declaring a pointer here, you're declaring an array of stack (automatically) allocated arrays. worse, it's actually 255 arrays each of size 10, not what you said. – underscore_d Jul 09 '16 at 00:58
  • corrected the sequence issue. But `a` is equivalent to a pointer afterwards, and can be used as such. – Aganju Jul 09 '16 at 01:01
  • No, `a` is the name of a _value_. The fact that array names often implicitly convert to pointers is not directly relevant in the context of declaration, which is what you're describing. The former have some essential powers that pointers don't. Yes, you can then pass array names to functions expecting pointers, to which they'll silently "decay", but they're not pointers. – underscore_d Jul 09 '16 at 01:02
  • I know and agree. I am trying to skip some finer points for beginner answers. I was estimating the OP's level of expertise to be in a range where he would be only confused by such details, especially as they are not relevant for his current issue. You wouldn't explain integrals in middle school, would you? – Aganju Jul 09 '16 at 01:30
  • You seem to think that on Stack Exchange, answers are for the OP specifically, at this current moment in time. They aren't: they should serve all readers as the site aims to build a knowledgebase of good info. You wouldn't write a math textbook addressed to one student and consisting solely of introduction. Hence I commented to clarify, so the OP or any other/future readers can benefit from accurate info, not just easy-to-digest-right-now info. You're welcome not to incorporate such comments into your answer if you don't want to, but they're for everyone else to read too. – underscore_d Jul 09 '16 at 01:37
  • Of course they are for all. But anyone else having a similar problem would need help at a similar level. Someone who has more knowledge already wouldn't need the kind of explanantion I gave. To stay with your textbook example, you also don't only write college level textbooks, but have easier ones too. The collection of _all of them_ allows every level of student to progress, by looking at the book that is right for his level. Stackexchange is such a collection of 'books', so all levels of questions and answers are collected. – Aganju Jul 09 '16 at 02:06
  • "Someone who has more knowledge already wouldn't need the kind of explanantion I gave." Obviously, but they're not reading this thread looking to learn anything new - and not who I was talking about. Among people who are, someone who needed your simplified explanation just a second ago might need the clarification about array types very soon. And now we have both, for whoever needs them, whenever. Great! – underscore_d Jul 09 '16 at 02:16
1

My attempt at one of the best ways to implement this would be using linked lists; each line would be considered a "node" for the list.

I suggest a linked list because you seem to want to be able to read the information in regardless of the number of lines in the file. Therefore, a dynamically allocated array would be useful as you mention storing the numbers separately in an array.

No matter the number of lines, the information will be stored in the linked list (given your computer has enough memory available).

Code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FILENAME file.txt
//While the amount of chars in each line is limited, are you really going to use more than 1000 per line?
#define MAX_LINE_SIZE 1000
struct node {
  char data[MAX_LINE_SIZE];
  struct node* next;
};


void insert_node (struct node **head, char *nodeValue, size_t line_max_size);
void print_list (struct node *head);

int main(int argc, char const *argv[]) {
  //Open the file
  FILE *file_to_open;
  file_to_open = fopen("numbers.txt", "r");
  if(!file_to_open) {
    return 1;
  }
  char currentLine[MAX_LINE_SIZE] = {0};

  struct node* headNode;
  headNode = NULL; //Initialize our first node pointer to be NULL

  while (fgets(currentLine, sizeof(currentLine), file_to_open)) {
    insert_node(&headNode, currentLine, sizeof(currentLine));
  }

  printf("Array list from your file:\n");
  print_list(headNode);

  fclose(file_to_open);
  return 0;
}

void print_list (struct node *head) {
  //We define a new 'currentNode' instead of just using headNode so we don't modify the headNode
  struct node* currentNode = head;

  while (currentNode != NULL) {
      printf("Value: %s", currentNode->data);
      currentNode = currentNode -> next;
  }
}

void insert_node (struct node **head, char *nodeValue, size_t line_max_size) {
  struct node *currentNode = malloc(sizeof(struct node));

  strncpy(currentNode->data, nodeValue, line_max_size);
  currentNode->next = *head;

  /*
  * In order to understand how we add nodes, let's take a look at possible senarios:
  * 1. The list is empty, so we need to add a new node
  * In which case, our memory looks like this where HEAD is a pointer to the first node:
  *  -----
  * |HEAD | --> NULL
  *  -----
  * The line currentNode->next = headNode; will NOT do anything since headNode originally
  * starts out at a value of NULL
  *  Now, we want to set our current node to be the (new) head node
  *  -----      -------------
  * |HEAD | --> |CURRENTNODE| //The head node points to the current node
  *  -----      -------------
  * This is done with headNode = currentNode; (shown below)
  * So, headNode now points directly to the node structure we just created
  *
  * 2. The list is already populated; we need to add a new node to the beginning
  * For the sake of simplicity, let's start out with 1 node:
  * ------  --------------------
  * HEAD  -> FIRST NODE --> NULL
  * ------  --------------------
  * With currentNode->next = headNode, the data structure looks like this:
  * ---------        -----  ---------------------
  * currentNode -->  HEAD -> POINTER TO FIRST NODE
  * ---------        -----  ---------------------
  * Which, obviously needs to be altered since the currentNode should then
  * become the first node & the head should point to the first node:
  * ----  ---             ----------------------------
  * HEAD -> currentNode -->         2nd NODE
  * -----  --             ----------------------------
  * This is done with head = currentNode
  * Note: In this explanation, I did not explain *head = currentNode like in the actual code
  * This was to make the explanation simpler, although we do need to
  * dereference once to access head through the pointer to pointer
  */
  *head = currentNode;
}

I tried to explain how I inserted nodes in a linked list to the best of my ability in the comments, but clarification on these issues can always be found online.

The output of the program looks like this:

    Array list from your file:
    Value: 222222222
    Value: 11111111111
    Value: 0000000000
    Value: 9999999
    Value: 88888888
    Value: 7777777
    Value: 666666
    Value: 55555
    Value: 4444
    Value: 333
    Value: 22
    Value: 1

Technically, each line cannot be more than 1000 characters in this program, but it is rare to need more per line.

Also note that this program prints out the linked list (and order of numbers.txt) backwards since we are continuously adding nodes to the beginning of the list. However, it should be fairly simple to reverse the order of the linked list.

iRove
  • 562
  • 1
  • 9
  • 19