
I am trying to read a line of whitespace-separated strings from a file that looks like this:

CGGCGGGAGATT CGGGAGATTCAA CGTGCGGCGGGA CGTGGAGGCGTG CGTGGCGTGCGG GCGTGCGGCGGG GCGTGGAGGCGT GCGTGGCGTGCG GGAGAAGCGAGA GGAGATTCAAGC GGCGGGAGATTC GGGAGATTCAAG GTGCGGCGGGAG TGCGGCGGGAGA

My code to achieve this is:

char id_dna_seqs[14][12];
fscanf(dataset, "%s %s %s %s %s %s %s %s %s %s %s %s %s %s", id_dna_seqs[0], id_dna_seqs[1], id_dna_seqs[2], id_dna_seqs[3], id_dna_seqs[4], id_dna_seqs[5], id_dna_seqs[6], id_dna_seqs[7], id_dna_seqs[8], id_dna_seqs[9], id_dna_seqs[10], id_dna_seqs[11], id_dna_seqs[12], id_dna_seqs[13]);

but when I do a test printout of the array, I don't get what I expect to get. For example, doing

printf("%s\n", id_dna_seqs[4]);

gives

CGTGGCGTGCGGGCGTGCGGCGGGGCGTGGAGGCGTGCGTGGCGTGCGGGAGAAGCGAGAGGAGATTCAAGCGGCGGGAGATTCGGGAGATTCAAGGTGCGGCGGGAGTGCGGCGGGAGA

which, upon closer examination, I realized is actually printing all of the strings starting from the 5th element of the array. What I want is to be able to index each string individually. For example, the 5th string in the file is the sequence CGTGGCGTGCGG, so I want printing id_dna_seqs[4] to give me just that, instead of everything from the 5th string onward.

Please let me know what is wrong here, and I look forward to your suggestions for improvement. Thank you!

AKKA
  • C "strings" need one ***more*** char to store the `'\0'` terminator. 14 chunks of 12 characters each need `char id_dna_seqs[14][13];`. This assumes a text file without multi-byte characters. – alk Jun 04 '16 at 17:05
  • Why [12]? Surely you didn't count the characters? If you count, you will make mistakes, e.g. forgetting to allow space for null terminators... Just look at the input and say 'OK, 32 would be fine for that dimension'. – Martin James Jun 04 '16 at 17:05
  • @alk Over, and over, and over again with the bean-counters:(( – Martin James Jun 04 '16 at 17:07
  • There is nothing wrong with bean-counting in the 1st place, even more with refusing to RTFM. @MartinJames – alk Jun 04 '16 at 17:09
  • @alk : I've definitely learnt about null-terminated strings in C before, and that printf() is designed to stop printing when it reaches the null terminator. But I didn't realize that I would need to make my array one character larger to account for that! That puzzles me now, because why did a null character end up after id_dna_seqs[13], even though that last string was also allocated only 12 characters? (It probably did, since printf() knew to stop at the last string.) – AKKA Jun 04 '16 at 17:31
  • An alternative to "`n+1` allocating and `\0` terminating" would be another kind of "bean-counting": `for(int i = 0; i < 12; i++)printf("%c", id_dna_seqs[4][i]);` or whatever you may feel like abstracting and wrapping it into. In that case you don't deal with strings anymore (-: – user3078414 Jun 04 '16 at 17:45
  • @alk There is a lot wrong with it. Every day, there are SO posts with obi-wan errors because the posters spend time laboriously counting chars to precisely size their buffers... and get it wrong. Those who shove in [128] without thinking too hard about it don't post SO questions :) – Martin James Jun 04 '16 at 18:58
  • @martinjames: And those guys placing those angst-bytes then end up with voodoo code. :-( – alk Jun 04 '16 at 19:01
  • @user3078414: Correct, but `scanf` would then invoke undefined behaviour by placing the NUL out of bounds. – alk Jun 04 '16 at 19:04
  • @akka: That the printf found a 0 at the expected place was pure "bad" luck, hiding the malfunction. – alk Jun 04 '16 at 19:10
  • @alk : That clarifies it then! So technically it shouldn't have been null terminated, as I did not size my array to include that memory location, but somehow I got "lucky" that it was a 0 lol. Also, if it's possible for you to explain here, do you have any suggestions for a better way to read in the DNA strings? I kind of feel that what I've implemented here is not really the best way possible... – AKKA Jun 05 '16 at 22:38
