My problem is this: I have to read a file that contains strings. The task is to read the data and store it in appropriate data structures in a C program.

Currently my program prints all the values, but accessing these variables afterwards is a problem ...

#include <cstdio>
#include <cstring>

using namespace std;

int split(char* str, char splitstr[15][10]);
int main ()
{ 
  FILE *fp;

  char str[20] = {0}; // temp variable for accessing a line from file

 // for opening of file

  fp = fopen("C:\\Cross Crystal Sheet.csv", "r") ;

  char input[256];
  char result[15][10];
  char *protein[700];
  char p[1000];
  int j=0;

  if (NULL != fp) 
  {

    while(fgets(str,sizeof(str),fp)!=NULL)
    {
      strcpy(input, str);
      int count = split(input, result);
      int tmp=count;
      //j=result[0]-'0';
      for (int i=0; i<count; i++) 
      { 
        printf("%s\n", result[i]);
        //printf("%s\n",*(result+i));

        protein[j]=*(result+i);
          //*((protein)+j);
        printf("%s \n",*(protein+j));
        j++;
      }


     }

    }

}

int split(char* str, char splitstr[15][10])
{
   char* p;
   int i=0;
   char *string = strdup(str);
   p = strtok (string, ",");
  // i=i+count;

   while(p!=NULL)
   {
       strcpy(splitstr[i++], p);
       p = strtok (NULL, ",");

       if( p ==NULL)
        {
         break;
        }
     unsigned charlength = strlen(p);
     if(charlength==1 ||charlength==2 )
        {
          break;
        }
   }
   return i;
}

I am expecting output like this: protein[] = {1, ABL1, ABL2, AURKA, AURKB, ...}

Data file is like this:

1,ABL1,ABL2,,,,
,,AURKA,,,,
,,AURKB,,,,
,,BMX,,,,
,,BTK,,,,
,,KIT,,,,
,,LCK,,,,
,,MAPK14,,,,
,,PRKACA,,,,
,,SYK,,,,
,,EGFR,,,,
,,INSR,,,,
,,MAPK11,,,,
,,,,,,
2,ABL2,ABL1,,,,
,,AURKA,,,,
,,AURKB,,,,
,,CAMK4,,,,
,,CDKL2,,,,
,,CLK3,,,,
,,CSNK1G3,,,,
,,KIT,,,,
,,LCK,,,,
,,MAPK14,,,,
,,PRKACA,,,,
,,SLK,,,,
,,SYK,,,,
,,,,,,
3,ACVR1,ACVR2A,,,,
,,ACVRL1,,,,
,,PIM1,,,,
,,PRKAA2,,,,
,,,,,,
4,ACVR2A,ACVR1,,,,
,,CAMK2D,,,,
,,MST4,,,,
,,PRKAA2,,,,
,,SLK,,,,
,,,,,,
5,AKT1,PRKACA,,,,
,,,,,,
,,,,,,
6,ALK,FES,,,,
,,MET,,,,
,,,,,,
7,AURKA,ABL1,,,,
,,ABL2,,,,
,,AURKB,,,,
,,CDK2,,,,
,,CHEK1,,,,
,,PLK1,,,,
,,PRKACA,,,,
,,,,,,
8,AURKB,ABL1,,,,
,,ABL2,,,,
,,AURKA,,,,
,,PRKACA,,,,
,,,,,,
9,BMX,ABL1,,,,
,,BTK,,,,
,,LCK,,,,
,,MAPK14,,,,
,,,,,,
10,BRAF,CDK8,,,,
,,KDR/VEGFR2,,,,
,,MAPK14,,,,
,,RAF,,,,
,,,,,,
Daniel Fischer

3 Answers

Judging by your code, I think the problem is that you did not allocate memory for the strings that `protein[]` points to; you should allocate memory for every element of `protein` to store its string. Secondly, copying a string from one place to another is not a simple assignment like this:

protein[j]=*(result+i);

Use `strncpy` to do that. That is my analysis of your problem.
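A minimal sketch of what "allocate memory for every element" could look like in plain C. The helper name `store_copy` is hypothetical and not part of the original program; it uses `malloc` plus `memcpy` rather than `strncpy`, which is a substitution on my part, and it assumes the caller frees each entry later.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: give slot `index` of `table` its own heap copy of `src`.
 * Returns 0 on success, -1 if the allocation fails. */
static int store_copy(char *table[], size_t index, const char *src)
{
    size_t len = strlen(src) + 1;   /* +1 for the terminating NUL */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, src, len);         /* copies the NUL as well */
    table[index] = copy;            /* the slot now owns independent storage */
    return 0;
}
```

In the question's loop this would stand in for the assignment `protein[j]=*(result+i);`, for example `store_copy(protein, j, result[i])`, so that later writes to `result` no longer change what `protein[j]` holds.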

MYMNeo
  • The original purpose of `strncpy` was to copy NUL-terminated filename strings into the fixed size (14 char) NUL-padded fields of the early UNIX filesystem. It's almost always the wrong tool for any other purpose. – Jim Balter Jun 19 '12 at 12:10
  • @JimBalter, why is it almost always the wrong tool? It is used to avoid overflowing the stack, and many common errors are caused by `strcpy`. – MYMNeo Jun 19 '12 at 12:18
  • I just told you: it NUL-pads, rather than NUL-terminates. The NUL padding is a waste of time and the lack of NUL-termination when n == strlen(src) is a bug begging to happen. P.S. You could have answered the question yourself by checking ... this site: http://stackoverflow.com/questions/869883/why-is-strncpy-insecure – Jim Balter Jun 19 '12 at 12:25
  • P.P.S. Here's what that C Standard Committee's Rationale says: "strncpy was initially introduced into the C library to deal with fixed-length name fields in structures such as directory entries. Such fields are not used in the same way as strings: the trailing null is unnecessary for a maximum-length field, and setting trailing bytes for shorter names to null assures efficient field-wise comparisons. strncpy is not by origin a ``bounded strcpy,'' and the Committee has preferred to recognize existing practice rather than alter the function to better suit it to such use." – Jim Balter Jun 19 '12 at 12:31
  • @JimBalter, thanks for all your comments. I often use `strncpy` to avoid stack overflows, and had not considered much beyond that. I always use this pair: `strncpy(dst, src, len); dst[len - 1] = '\0';` – MYMNeo Jun 19 '12 at 12:39
  • The expected output should be a[699] ={1,ABL1,ABL2,...} – user1131484 Jun 19 '12 at 13:15
  • @user1131484, yes, I know what you want to do. Have you read all of the comments? The problem is, first, that you did not allocate memory for every string and, second, that your method of copying strings is not correct. – MYMNeo Jun 19 '12 at 13:23
  • MYMNeo: He isn't copying the strings. He is using `result` for storage and assigning pointers into it. – Klas Lindbäck Jun 19 '12 at 14:54
  • @KlasLindbäck, I know what you mean. But his title and expected output are about storing the strings in an array, not just using pointers to point at them. – MYMNeo Jun 19 '12 at 15:25
  • @MYMNeo If you are copying a short string into a large buffer, you are doing a lot of extra work and hitting a lot of memory unnecessarily ... and on top of that you're always copying or padding one more byte than necessary by using `len` rather than `len - 1`. I have a different approach: always use dynamically allocated arrays that can hold the source, or fail if the source is larger than the destination. In those very rare cases where neither of those is appropriate, I use snprintf. – Jim Balter Jun 19 '12 at 15:51
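To illustrate the point made in the comments above, here is a minimal sketch of a bounded copy built on `snprintf` that reports failure when the source does not fit. The helper name `copy_bounded` is hypothetical; it is one possible reading of the "fail if the source is larger than the destination" approach, not code from any commenter.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy `src` into `dst` (capacity `cap` bytes).
 * Returns 0 if the whole string fit, -1 if it was truncated.
 * Unlike strncpy, snprintf always NUL-terminates when cap > 0
 * and does not pad the rest of the buffer with NULs. */
static int copy_bounded(char *dst, size_t cap, const char *src)
{
    int needed = snprintf(dst, cap, "%s", src);
    if (needed < 0 || (size_t)needed >= cap)
        return -1;
    return 0;
}
```

With a 5-byte destination, `"ABL1"` fits exactly, while `"AURKA"` is reported as an error and the destination is left holding the terminated prefix `"AURK"`.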

Your code takes a line and stores it in the variable result. You then assign protein to point into result.

The next iteration of your while loop then overwrites result with the contents of the next line, so the pointers stored earlier in protein now refer to the new data.

It is possible to declare a big chunk of memory statically, but it would probably be better to allocate memory dynamically, especially if you don't know the maximum size of the input file.
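A minimal sketch of the dynamic approach, assuming plain C and an input of unknown size. The names `StringList`, `list_append` and `list_free` are hypothetical, introduced only for illustration. Each appended string gets its own heap copy, so reusing the `result` buffer on the next line does not disturb earlier entries.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of heap-allocated strings. */
typedef struct {
    char **items;
    size_t count;
    size_t capacity;
} StringList;

/* Append an independent copy of `s`.
 * Returns 0 on success, -1 on allocation failure. */
static int list_append(StringList *list, const char *s)
{
    if (list->count == list->capacity) {
        /* Double the capacity (start at 16) when the array is full. */
        size_t new_cap = list->capacity ? list->capacity * 2 : 16;
        char **grown = realloc(list->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        list->items = grown;
        list->capacity = new_cap;
    }
    size_t len = strlen(s) + 1;     /* +1 for the terminating NUL */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, s, len);
    list->items[list->count++] = copy;
    return 0;
}

/* Release every stored string and the array itself. */
static void list_free(StringList *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = list->capacity = 0;
}
```

A list would start out zero-initialised (`StringList list = {0};`), and the inner loop of the question would call `list_append(&list, result[i])` instead of storing a pointer into `result`.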

Klas Lindbäck

I have resolved the issues. Below is my code. Please suggest how I can make this code more generic.

#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>

using namespace std;

int split(char* str, char splitstr[16][11]);

int main ()
{ 
  FILE *fp;
  char str[20] = {0}; 
  fp = fopen("C:\\Cross Crystal Sheet.csv", "r") ;

  char input[256];
  char s[619][15];
  string str2, str3;
  char result[16][11];
  int j=0;
  if (NULL != fp) 
       {
            while(fgets(str,sizeof(str),fp)!=NULL)
             {
                strcpy(input, str);
                int count = split(input, result);
                int tmp=count;

                for (int i=0; i<count; i++) 
                   { 
                      // copy the field straight into its own row of s;
                      // snprintf is bounded and always NUL-terminates
                      if (j < 619)
                        {
                          snprintf(s[j], sizeof s[j], "%s", result[i]);
                          j++;
                        }
                   }

             }
       }

   char ss[11] ={0};   // 10 characters for "%10s" plus the terminating NUL
   printf("Enter any main string to find \n");
   scanf("%10s",ss);
 //  printf("%d \n",atoi(s[16]));

  int temp=0;
  for (int k=0;k<j;k++)
  { 
       if (strncmp(s[k],ss,8)!=0)
         {
           temp=k; 
         }
       else
         {  
           int x=0;
           if (k > 0)              // s[k-1] does not exist for the first entry
             {
               x=atoi(s[k-1]);
             }
           if(x >=1 && x <=95)
            {
               printf("found at %d \n",k);
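                    // print at most 15 entries starting at the match, never reading
                    // past the j stored entries; a "\n" entry marks the end of a group
                    for (int m = k; m < j && m < k + 15; m++)
                        {
                          if (strncmp(s[m],"\n",2)!=0)
                             {
                              printf("%s \n",s[m]);
                             }
                          else
                             {
                              return 0;
                             }
                       }
                    return 0;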
             }
             else

         {
                 continue;
             }
            }

         }
   }


int split(char* str, char splitstr[16][11])
{
   char* p;
   int i=0;
   char *string = strdup(str);
   p = strtok (string, ",");


   while(p!=NULL)
   {
       strcpy(splitstr[i++], p);
       p = strtok (NULL, ",");

       if( p ==NULL)
        {
         break;
        }
     unsigned charlength = strlen(p);
     if(charlength==1 ||charlength==2 )
        {
          break;
        }
   }
   return i;
}
Hauleth