
I am writing a C program that opens a txt file, and I want to read the last line of that file. I am not that proficient in C, so bear in mind that I may not know all of its concepts. I am stuck at the part where I use fscanf to read all the lines of my txt file; what I want is to take the last line and get the values described below.

Here is my incomplete code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

FILE *sync;


void check()
{
    int success; //to hold the results if the timestamps match
    sync = fopen("database.txt","r");
    char file[] = "database.txt";

    while (fscanf(sync, "%d.%06d", &file) != EOF)
    {


    }

    fclose(sync);
}

sample txt file:

/////// / //// ///// ///// //////////////// Time: 1385144574.787665 //////// /
/////// / //// ///// ///// //////////////// Time: 1385144574.787727 //////// /
/////// / //// ///// ///// //////////////// Time: 1385144574.787738 //////// /
/////// / //// ///// ///// //////////////// Time: 1385144574.787746 //////// /
/////// / //// ///// ///// //////////////// Time: 1385144574.787753 //////// /

The / are some words, symbols and numbers I do not want, just the numbers in sample txt as shown above

I appreciate any examples and pointing out errors I made so I can understand this much better. Since I made some people confused about the text file, here is what it really is. This is the format it will be so I should know the length of each line. However, I will not be able to know how many lines there will be as it may be updated.

 Socket: 0 PGN: 65308 Data: 381f008300000000 Time: 1385144574.787925 Address: 28
 Socket: 0 PGN: 65398 Data: 0000000100000000 Time: 1385144574.787932 Address: 118
 Socket: 0 PGN: 61444 Data: f07d83351f00ffff Time: 1385144574.787940 Address: 4
 Socket: 0 PGN: 65266 Data: 260000000000ffff Time: 1385144574.787947 Address: 242
 Socket: 0 PGN: 65309 Data: 2600494678fff33c Time: 1385144574.787956 Address: 29
 Socket: 0 PGN: 65398 Data: 0000000100000000 Time: 1385144574.787963 Address: 118
 Socket: 0 PGN: 61444 Data: f07d833d1f00ffff Time: 1385144574.787971 Address: 4
 Socket: 0 PGN: 65398 Data: 0000000100000000 Time: 1385144574.787978 Address: 118
 Socket: 0 PGN: 61443 Data: d1000600ffffffff Time: 1385144574.787985 Address: 3
 Socket: 0 PGN: 65308 Data: 451f008300000000 Time: 1385144574.787993 Address: 28
 Socket: 0 PGN: 65317 Data: e703000000000000 Time: 1385144574.788001 Address: 37

Again I am after the Time values (eg. 1385144574.787925) at the last line of the txt file. Hope this helps.

Mitaksh Gupta
GhostMember
  • In order to help you to "understand it much better" one needs to know why you did it that way in the first place. What were you trying to do by passing that `file` string with `"database.txt"` in it to `fscanf`? What was the purpose of that? – AnT stands with Russia Nov 28 '13 at 14:51
  • You have asked a question with two big parts: 1, how to look at only the last line (Elias answer is good). 2, how to extract Time element from that line. Without knowing what exactly you mean by ***The / are some words, symbols and numbers I do not want***, there is no way to know exactly how to help you. Words, symbols and ***numbers*** can include any ascii character. If you want to exclude all other input except _Time: 1385144574.787665_ et. al., then you will have to parse the line to _exclude_ all unwanted data using `strtok()` (or `strtok_r()`). – ryyker Nov 28 '13 at 16:03
  • If you provide a _complete_ example of some input lines, then you will likely get a complete solution. At the time of this comment, the answerers could not possibly address this completely. – ryyker Nov 28 '13 at 16:11

2 Answers


Since you're after the last line of the file, and you didn't mention how large the file might be, it could be worthwhile to start reading the file from the end and work your way backwards from there:

FILE *fp = fopen("database.txt", "r");
fseek(fp, 0, SEEK_END);//sets fp to the very end of your file

From there, you can use fseek(fp, -x, SEEK_CUR); where x is the number of bytes you want to go back, until you get to where you want... other than that, Jekyll's answer should work just fine.
However, to get the last line, I tend to do something like this:

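FILE *fp = fopen("database.txt", "r");
char line[1024] = "";
int c;//int, not char: fgetc returns EOF as an int
int len = 0;
if (fp == NULL) exit (EXIT_FAILURE);
if (fseek(fp, -1, SEEK_END) != 0) exit (EXIT_FAILURE);//last char; fails on an empty file
c = fgetc(fp);
while(c == '\n')//skip trailing newlines
{
    if (fseek(fp, -2, SEEK_CUR) != 0) break;//file holds nothing but newlines
    c = fgetc(fp);
}
while(c != '\n' && c != EOF)
{
    ++len;
    if (fseek(fp, -2, SEEK_CUR) != 0)
    {//the last line is also the first one: read from the start
        rewind(fp);
        break;
    }
    c = fgetc(fp);
}
//fp is now positioned at the first char of the last line
if (fgets(line, sizeof line, fp) != NULL) puts(line);
else printf("Error\n");
fclose(fp);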

The reasoning behind my len var is so that I can allocate enough memory to accommodate the entire line. Using an array of 1024 chars should suffice, but if you want to play it safe:

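char *line = NULL;
//after the backwards scan has determined len:
line = calloc(len + 2, sizeof(char));//len chars + '\n' + terminating '\0'
if (line == NULL)
{
    fclose(fp);
    exit( EXIT_FAILURE);
}
//read the line with fgets(line, len + 2, fp), use it, then add:
free(line);//this line!
fclose(fp);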

Once you've gotten that line, you can use Jekyll's sscanf examples to determine the best way to extract whatever you want from that line.
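
Putting the pieces of this answer together, here is a minimal sketch of a helper that finds the last line by scanning backwards and allocates exactly enough memory for it. The function name `read_last_line` is hypothetical. The sketch assumes a platform (such as Linux) where seeking to `SEEK_END` works on a regular file opened in binary mode, and it uses absolute offsets derived from `ftell` rather than relative seeks.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd copy of the last non-empty line of
 * `path` (without the trailing newline), or NULL on error or if there is no
 * such line. The caller frees the result. */
char *read_last_line(const char *path)
{
    FILE *fp = fopen(path, "rb");   /* binary mode: offsets are plain byte counts */
    if (fp == NULL) return NULL;

    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return NULL; }
    long end = ftell(fp);           /* one past the last byte */
    if (end < 0) { fclose(fp); return NULL; }

    /* Step back over trailing newline / carriage-return bytes. */
    while (end > 0) {
        fseek(fp, end - 1, SEEK_SET);
        int c = fgetc(fp);
        if (c != '\n' && c != '\r') break;
        --end;
    }

    /* Walk back until the previous newline or the start of the file. */
    long start = end;
    while (start > 0) {
        fseek(fp, start - 1, SEEK_SET);
        if (fgetc(fp) == '\n') break;
        --start;
    }

    size_t len = (size_t)(end - start);
    char *line = NULL;
    if (len > 0 && (line = calloc(len + 1, 1)) != NULL) {
        fseek(fp, start, SEEK_SET);
        if (fread(line, 1, len, fp) != len) { free(line); line = NULL; }
    }
    fclose(fp);
    return line;
}
```

The returned string can then be handed to `sscanf` to pick out the Time field; remember to `free` it afterwards.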

Elias Van Ootegem
  • File may contain over a thousand lines and it may increase when I update it – GhostMember Nov 29 '13 at 04:51
  • Well, in that case you should definitely read the file backwards. Considering you're only interested in the last line of the file... – Elias Van Ootegem Nov 29 '13 at 07:06
  • @EliasVanOotegem I gave you +1 because I have always used that and I like it. By the way, I didn't propose it because the last time I tried, someone complained that "according to the standard" seeking by a number of positions is not safe (even if it always works on Linux/Unix). Do you know something about that? I couldn't find this warning anywhere. – Jekyll Nov 29 '13 at 08:50
  • @Jekyll: `fseek((void *), int, SEEK_END)` results in undefined behaviour when used on a binary stream. Which is not the case here, [there's a pretty good summary here](https://www.securecoding.cert.org/confluence/display/seccode/FIO19-C.+Do+not+use+fseek()+and+ftell()+to+compute+the+size+of+a+regular+file). If you add an example that _doesn't_ include reading the entire file (like your `while (fgets())` does now), I'd be more than happy to upvote your answer, BTW... It's by far the more complete, and comprehensive answer of the two – Elias Van Ootegem Nov 29 '13 at 08:56
  • @EliasVanOotegem thanks for the summary. So that's true ***only for binary*** stream. That makes sense. – Jekyll Nov 29 '13 at 08:59
  • @Jekyll: Something just came to mind: windows is, as of late, all UTF-16, I seem to recall. This would mean that all chars read from the file are `wchar_t` or unsigned chars or something... I don't have a windows machine available (nor do I want one :P), and I don't know how EOF works on a win system (is it a real char, or just -1?) – Elias Van Ootegem Nov 29 '13 at 09:02
  • @EliasVanOotegem the guy was complaining about the fact that it sometimes couldn't work on windows. In fact I wrote that it always works on ***linux/Unix***. I may try, as I have development tools for windows even though I never work on win machines. – Jekyll Nov 29 '13 at 09:03
  • @Jekyll: Yes, I got that, but I think, on a win system (being utf-16), `fseek(fp, -2, SEEK_CUR)` might not rewind 2 chars (it rewinds 2 _bytes_ => 1 utf-16 char). So you might have to check for null bytes, or considering you're on a windows system: checking the BOM might be faster, easier and more reliable, and then set a multiplier for all your `fseek` calls – Elias Van Ootegem Nov 29 '13 at 09:07
  • @EliasVanOotegem yes but it should still be **safe** if you know what you are doing :). That is my assumption. By the way I will need to investigate more before posting another help using fseek :). – Jekyll Nov 29 '13 at 09:08
  • @Jekyll: Oh, of course... it'll behave as expected and documented. If you _know_ (as everybody should) about character encoding, and write your code in such a way that it can deal with that, then you shouldn't have anything to worry about – Elias Van Ootegem Nov 29 '13 at 09:10
  • @EliasVanOotegem so you agree with me that claiming ***undefined*** behavior is not correct, as the behavior is defined, but the way you use the function may be wrong. – Jekyll Nov 29 '13 at 09:12
  • @Jekyll: Yes, to the best of my knowledge, _unless_ you've opened the file in binary mode, are dealing with a stream (like `stdin`) that needn't _end_ as such, or a stream that doesn't end in its initial shift state. To quote the official standard: _"Setting the file position indicator to end-of-file, as with `fseek(file, 0, SEEK_END)`, has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state"_ and _"A binary stream need not meaningfully support `fseek` calls with a whence value of `SEEK_END`"_ – Elias Van Ootegem Nov 29 '13 at 09:19

The way you are using fscanf is wrong: the arguments after the format string need to match, one for one, the conversions in the format (as you can see in the manpage). Your format has two %d conversions, but you pass a single char array. Instead of using fscanf, consider reading each line with fgets and then extracting what you are looking for from the last line with a sscanf pattern.

Note:: I collected the value in double format; you may choose the format that best suits your problem (string? int.int? float?). To do this, check which conversions and patterns scanf supports. Please come back if you cannot accomplish this task.
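
As one illustration of choosing a format, the sketch below collects the timestamp as text, which keeps the digits exactly as written and sidesteps any floating-point rounding concerns. The function name `extract_time_text` and the 32-byte buffer size are assumptions made for this example; the `Time: ` label is taken from the sample lines in the question.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the digits and dot that follow "Time: " into
 * out. Returns 1 on success, 0 if the line has no such field.
 * out must have room for at least 32 bytes. */
int extract_time_text(const char *line, char out[32])
{
    const char *p = strstr(line, "Time: ");
    if (p == NULL) return 0;
    /* %31[0-9.] stores at most 31 characters from the set, plus the NUL.
     * sscanf returns the number of items stored, so exactly 1 means success. */
    return sscanf(p, "Time: %31[0-9.]", out) == 1;
}
```

Two timestamps collected this way can be compared with `strcmp`, so an exact match does not depend on how a `double` would round the value.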

update:: following some requests, I wrote a few examples of different pattern matching. These should be a good starting point to fix your problem.

update::
1. I have seen that you added the pattern of your db file, so we can now state that both #3 and #4 match; I put #3 here (faster).
2. I removed the feof check as you requested, but note that the check is fine if you know what you are doing. Keep in mind that the stream's internal position indicator may point to the end of the file for the next operation while the end-of-file indicator is still unset; it is only set once an operation attempts to read at that point.
3. You asked to remove the char line[1024]={0,}; This instruction initializes the line[1024] array, which will hold the lines that you read from the file. This is needed! To know what that instruction is please see here
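
The point about the end-of-file indicator can be seen in a small experiment. The sketch below counts lines in two ways; both function names are made up for this illustration. On a file that ends with a newline, the feof-controlled loop runs one extra time, because the indicator is only set by the read that fails.

```c
#include <stdio.h>

/* Tests feof() before reading. After the last line has been read the
 * indicator is still clear, so the body runs once more with a failed read. */
int count_with_feof(FILE *fp)
{
    char line[1024];
    int n = 0;
    while (!feof(fp)) {
        char *got = fgets(line, sizeof line, fp);
        (void)got;   /* deliberately unchecked: this is the mistake being shown */
        ++n;
    }
    return n;
}

/* Tests the read itself, so the loop stops at the first failed fgets. */
int count_with_fgets(FILE *fp)
{
    char line[1024];
    int n = 0;
    while (fgets(line, sizeof line, fp) != NULL)
        ++n;
    return n;
}
```

For a file containing two newline-terminated lines, the first function reports 3 and the second reports 2, which is why the loop condition in the code for this answer is the `fgets` call itself.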

Code:

void check()
{
   char line[1024]={0,}; // Zero-initialized, so an empty file leaves an empty string
   int n2=0;
   int n3=0;
   sync = fopen("database.txt", "r");
   if( sync ) {
      while( fgets(line, sizeof line, sync) !=NULL ) {
      // Just search for the last line, do nothing in the loop
      }
      printf("Last line %s\n", line); //<this is just a log... you can remove it
      fclose(sync);
      // Skip everything up to "Time: ", then collect the two numbers in n2 and n3.
      // sscanf returns the number of items stored, so require exactly 2.
      if (sscanf(line, "%*[^T]Time: %d.%d", &n2, &n3) == 2) {
          printf( "%d.%06d\n", n2, n3); // %06d keeps leading zeros of the 6-digit fraction
      }
   }
}

Example 2
If, for instance, you need to collect the value using two integers, replace the sscanf of the example above with the following code:

  int n2, n3;
  // %*[^0-9] skips everything up to the first digit in the line
  if (sscanf(line, "%*[^0-9]%d.%d", &n2, &n3) == 2) {
    printf( "%d.%06d\n", n2, n3);
  }

That said, you should be able to figure out how to collect other formats.

Example 3 A better pattern. If other numbers may appear in the line before the given pattern, you may want to match on Time; this assumes there is no other T before it. A pattern for this can be:

 if (sscanf(line, "%*[^T]Time: %d.%d", &n2, &n3) == 2) {
    printf( "%d.%06d\n", n2, n3);
}

sscanf format strings are not full regular expressions and may not suit your pattern. In that case, consider using the GNU regex library, or mix strstr and sscanf as I did in the following example.

Example 4 This can be useful if you don't find a common pattern. In that case you may want to locate the string "Time:" with strstr before calling sscanf:

  char *ptr = strstr( line, "Time:" );
  if( ptr != NULL ) {
     // ptr points at "Time:", so the first digits found belong to the timestamp
     if (sscanf(ptr, "%*[^0-9]%d.%d", &n2, &n3) == 2) {
        printf( "%d.%06d\n", n2, n3);
     }
  }

* Note * You may need to find your own way to parse the file. The examples above are only suggestions, because your file may contain more specific or different patterns, but they should give you the tools to do the job in that case.
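
For the regex-library route mentioned earlier, here is a minimal sketch using the POSIX `<regex.h>` interface (`regcomp`, `regexec`, `regfree`). It assumes a POSIX system; the function name `parse_time_regex` and the pattern are choices made for this example, based on the `Time: <seconds>.<microseconds>` field in the sample lines.

```c
#include <regex.h>
#include <stdlib.h>

/* Hypothetical helper: extracts the two numbers from "Time: <sec>.<usec>".
 * Returns 1 on a match, 0 otherwise. */
int parse_time_regex(const char *line, long *sec, long *usec)
{
    regex_t re;
    regmatch_t m[3];            /* whole match plus two capture groups */
    int ok = 0;

    if (regcomp(&re, "Time: ([0-9]+)\\.([0-9]+)", REG_EXTENDED) != 0)
        return 0;
    if (regexec(&re, line, 3, m, 0) == 0) {
        /* strtol stops at the first non-digit, so no copying is needed */
        *sec  = strtol(line + m[1].rm_so, NULL, 10);
        *usec = strtol(line + m[2].rm_so, NULL, 10);
        ok = 1;
    }
    regfree(&re);
    return ok;
}
```

Compiling the pattern on every call keeps the sketch short; code that parses many lines would more likely compile it once and reuse the `regex_t`.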

Community
Jekyll
  • According to the comments, the OP wants to perform matching of *timestamps*. It should be exact. Your attempt to use `double` will distort the values and prevent exact matching. – AnT stands with Russia Nov 28 '13 at 14:48
  • @AndreyT I think that if this is what he needs he should play around with sscanf and collect the form he is looking for (maybe a string, maybe an int.int?). I suspect that as a guide this snippet should be enough. Do you agree? – Jekyll Nov 28 '13 at 14:52
  • Well, the `%*[0-9]` is a good thing for the OP to learn. However, it looks like someone has already helped the OP to use the `%d.%d` approach. Discarding it without any explanation will probably confuse the OP. – AnT stands with Russia Nov 28 '13 at 14:54
  • @AndreyT ok I take your point, I will also add the snippet for int.int so that I can abandon this post – Jekyll Nov 28 '13 at 15:08
  • This looks like undefined behavior. If the file is empty, line[] is uninitialized in the printf. **The dreaded Pascal disease of using `while(!feof(fp))` is almost always a bug**. – Jens Nov 28 '13 at 15:17
  • Ok @Jens I don't think that the point was to provide him the code with all the checks and initializations required, but that a proof of concept and a guide were needed. I can add some checks, but I think that he already knows that he has to check for file pointer!=NULL, he has to initialise arrays and so on. – Jekyll Nov 28 '13 at 15:50
  • @Jekyll - How does this extract _only_ the data OP asks for considering he states: ***The / are some words, symbols and numbers I do not want, just the numbers in sample txt as shown above*** ?? You did not address this. – ryyker Nov 28 '13 at 16:06
  • @ryyker I gave him a guide because the input was confusing. I provided him the way to build his own regex using sscanf and I simply made an assumption for a given pattern. I also asked him to come back if he think that is the wrong pattern and he cannot find a solution. It is difficult to make assumptions on the pattern not having the possibility to check the file content. A simple pattern could be looking for "Time: " but again I am not sure that is only there, but it is a good idea to add it – Jekyll Nov 28 '13 at 16:19
  • @ryyker I added a version which maybe more generic and can be used if you don't find a common pattern in your file. This assumes that the string "Time:" is present only once. – Jekyll Nov 28 '13 at 16:58
  • Thanks for the examples. Although, I probably should have mentioned that the "Time: " is also something I do not want. It is the "1385144574.787665" I am after. Still, I appreciate the help and will look at it. – GhostMember Nov 29 '13 at 00:11
  • @user3046173 Time is discarded by the output... it is only matched in some of the examples I reported (it is used as a starting token for collecting the two numbers). I suppose that the one you are looking for is either ex3 or ex4. – Jekyll Nov 29 '13 at 00:12
  • @Jekyll- I am testing your example to see how this gets my desired value from the last line of my txt file. I am using ex4 since I want to see if the value is correct or not. I have a couple issues. First, the char line[1024] = {0,} is something I have to change, but is this about determining the length of the line in the txt? My next question is the !feof, I hear I should avoid using it. – GhostMember Nov 29 '13 at 06:59
  • @user3046173 line[1024] = {0,} is an initialization: you need to initialise the memory you are going to use. You can remove the feof (even if in this case it is safe) and just use while( fgets( ) != NULL ). I have seen your pattern and #3 is fine as well. I am going to comment in the answer – Jekyll Nov 29 '13 at 08:00
  • @user3046173 i modified the algo for you. Just copy and paste that – Jekyll Nov 29 '13 at 08:11
  • @Jekyll- After looking at your edits, I now get the need for the line[1024] = {0,}. I also have a couple of questions. When you put if(sync), I assume that if the fopen returns NULL, the condition will not pass. This means I can put if(sync != NULL). Also, for the while loop, this loop will have fgets to get each line in the file. My question is about the printf and the close: are they inside the loop? Should they be outside? Sorry if I look dumb, I am trying to get this. – GhostMember Nov 29 '13 at 09:10
  • @user3046173 if(sync) and if(sync!=NULL) are actually the same. if(sync!=NULL) is fine! The printf is outside the loop. Do you see that semicolon after the while? Well, that's the only instruction in the loop. Whenever you don't put curly brackets, your condition will be applied only to the next instruction, which is ';' in this case. I am going to add curly brackets to help you. – Jekyll Nov 29 '13 at 09:40