
I'm just starting out with C programming and I've run into an issue in a school project. I have a text file of contracts that looks like this:

609140307   Carla Aguiar Cunha Paredes Pires    PT 309 181 020 533 713 02F  13.8
814991297   Ricardo Andrade Nogueira Matos  PT 099 597 635 807 514 05D  10.35
843818099   Eduardo Carneiro Paredes Clementino Castro  PT 829 961 009 571 587 02D  5.75
647507641   Cristiana Eanes Almada Martins Baptista PT 257 687 479 093 378 02E  10.35
684741046   Marisa Calado Cardoso Quadros Barbosa   PT 722 479 016 817 208 0RC  10.35
...

The fields are separated by tabs, and the file has around 10,000 lines of contracts.

I need to store every line in a struct. This is what I've done:

#include <stdio.h>

typedef struct {
    char id_contract[10];
    char name[60];
    char id_local[26];
    char power[5];
}CONTRACTS;

void main() {
    CONTRACTS c[10000] = { 0 };
    int i = 0;
    FILE *file = fopen("contracts.txt", "r");
    if (file)
    {
        char line[120];
        while (fgets(line, sizeof line, file) && i < 5)
        {

            if (sscanf(line, "%9s%60s%26s%5s",
                c[i].id_contract,
                c[i].name,
                c[i].id_local,
                c[i].power) == 4)
            {
                printf("Contract ID = %s\n", c[i].id_contract);
                printf("Name = %s\n", c[i].name);
                printf("Local ID = %s\n", c[i].id_local);
                printf("Power = %s\n", c[i].power);
                ++i;
            }
        }
    }
    else {
        printf("Error!\n");
    }
}

And this is the output I get:

Contract ID = 609140307
Name = Carla
Local ID = Aguiar
Power = Cunha
Contract ID = 814991297
Name = Ricardo
Local ID = Andrade
Power = Nogue
Contract ID = 843818099
Name = Eduardo
Local ID = Carneiro
Power = Pared

So basically the fields are being split on spaces, and I don't know how to make it split on tabs instead. I'm a beginner, so this is difficult for me. Thank you in advance!

  • Possible duplicate of [How do you allow spaces to be entered using scanf?](https://stackoverflow.com/questions/1247989/how-do-you-allow-spaces-to-be-entered-using-scanf) – fvu Feb 03 '18 at 18:42
  • You have another error too. Member `char id_contract[10];` is correctly limited to `9` input length but other members are not. Worse, `char power[5];` cannot hold the string `10.35` which requires `char power[6];` to also hold the `nul` terminator. – Weather Vane Feb 03 '18 at 18:52
  • Please note too, that `%s` format specifier causes `scanf` family to stop scanning at the first whitespace. – Weather Vane Feb 03 '18 at 18:54
  • @WeatherVane Changed it, now it doesn't even run, just crashes. – rafaoliveira35 Feb 03 '18 at 18:55
  • @WeatherVane What alternative should I do then? – rafaoliveira35 Feb 03 '18 at 18:56
  • Consider using `%[^\t]` if each field *content* is separated by spaces, but the fields themselves are tab-separated. But you will need to remove the `\t` from the input somehow. – Weather Vane Feb 03 '18 at 18:56
  • @WeatherVane So it should be something like this?: sscanf(line, %[^\t]s %[^\t]s %[^\t]s ..... ??? The program doesn't print anything – rafaoliveira35 Feb 03 '18 at 19:00
  • I didn't mention any `s` after the format specifier `%[^\t]`. You should also ***always*** check the return value from `scanf` function family: the number of items scanned. – Weather Vane Feb 03 '18 at 19:03
  • @WeatherVane OK, I changed the sscanf format to `"%[^\t] %[^\t] %[^\t] %[^\t]"`. This is my output: id contrato = 128786512 nome = Ricardo Tinoco Belchior Caneco Pinto id local = PT 663 373 855 524 457 0RC10.35 potencia = 10.35 id contrato = 995099612 nome = Clarice Proenca Brito Moreira Carmona id local = PT 817 717 708 573 823 0RC6.9 potencia = 6.9 id contrato = 264112040 nome = Daniel Carvalheira Amorim Moreira Rego id local = PT 051 229 298 816 284 0RC4.6 potencia = 4.6 There's a problem with the Local ID: the Power isn't separated from it. Sorry that the variable names are in another language. – rafaoliveira35 Feb 03 '18 at 19:05
  • The `scanf` family is quite tricky to use. I suggest you read each line with `fgets` and then break it into its parts with `strtok` or cousins, using a delimiter character set of `"\t\n"`. – Weather Vane Feb 03 '18 at 19:06
  • @WeatherVane I have no idea how to use or apply strtok in my program :/ – rafaoliveira35 Feb 03 '18 at 19:08
  • I should think you can find some examples though... perhaps on the man pages, perhaps in Stackoverflow. – Weather Vane Feb 03 '18 at 19:08
  • The most robust way is to read character by character (checking for `\t`, `\n` and EOF) and maintain sufficient state (field, field width, line number) – wildplasser Feb 03 '18 at 19:10
  • Have you corrected the lengths of the arrays as per my first comment? – Weather Vane Feb 03 '18 at 19:10
  • Note that the input `PT 829 961 009 571 587 02D` has length 26, and 1 for the string terminator, so `char id_local[26];` ==> `char id_local[27];`. Don't be mean with array lengths during testing. Make them twice the length, until you get it working. Memory is not as precious as in yore. – Weather Vane Feb 03 '18 at 19:15
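
The `fgets` plus `strtok` suggestion from the comments above could be sketched roughly as follows. This is only an illustration under some assumptions: the names `Contract`, `copy_field` and `parse_contract` are made up here, the array sizes follow the lengths discussed in the comments, and the delimiter set is `"\t\n"` as suggested. Note that `strtok` treats a run of tabs as a single separator, so an empty field would be skipped rather than reported.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical struct; each size is the longest expected value plus '\0'. */
typedef struct {
    char id_contract[10];
    char name[60];
    char id_local[27];
    char power[6];
} Contract;

/* Copy src into dst (capacity cap); truncates if needed, always terminates. */
static void copy_field(char *dst, size_t cap, const char *src)
{
    snprintf(dst, cap, "%s", src);
}

/* Split one line on tabs and the trailing newline.
   Modifies `line`. Returns 1 if four fields were found, 0 otherwise. */
int parse_contract(char *line, Contract *out)
{
    assert(line != NULL && out != NULL);

    char *id    = strtok(line, "\t\n");
    char *name  = strtok(NULL, "\t\n");
    char *local = strtok(NULL, "\t\n");
    char *power = strtok(NULL, "\t\n");

    if (!id || !name || !local || !power)
        return 0;

    copy_field(out->id_contract, sizeof out->id_contract, id);
    copy_field(out->name,        sizeof out->name,        name);
    copy_field(out->id_local,    sizeof out->id_local,    local);
    copy_field(out->power,       sizeof out->power,       power);
    return 1;
}
```

In the question's loop, this would replace the `sscanf` call: read each line with `fgets(line, sizeof line, file)` and then call `parse_contract(line, &c[i])`, incrementing `i` only when it returns 1.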

3 Answers


In your typedef, change the last two members to:

char id_local[27];
char power[6];

Then, your sscanf should be:

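if (sscanf(line, "%9c %59[^\t] %26c %5s",
            c[i].id_contract,
            c[i].name,          /* at most 59 characters, so name[60] cannot overflow */
            c[i].id_local,
            c[i].power) == 4)   /* all four fields must be converted */
        {

because the name has variable length.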

Try it now ;)

steformicola
  • Be aware that `%9c` does not null terminate the input, and will read 9 characters regardless, even if the ID number is only 8 or 7 bytes long. You may get away without null termination if the structure is all null bytes before you read the data into it, but be cautious. – Jonathan Leffler Feb 03 '18 at 23:10
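
To make the `%9c` caveat concrete, here is a small sketch contrasting it with a bounded scanset. The function names `read_id_fixed_width` and `read_id` are hypothetical and only serve this illustration. With a 7-digit ID, `%9c` still consumes 9 characters, so the tab and the first letter of the name end up in the ID buffer and no terminator is written. A scanset such as `%9[^\t]` stops at the tab and appends the `'\0'` itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Pitfall: %9c copies exactly 9 characters into id[0..8] and never
   writes a terminator; id[9] keeps whatever it held before the call. */
int read_id_fixed_width(const char *line, char id[10])
{
    assert(line != NULL && id != NULL);
    return sscanf(line, "%9c", id) == 1;
}

/* Alternative: %9[^\t] stores at most 9 non-tab characters,
   stops at the first tab, and null-terminates the result. */
int read_id(const char *line, char id[10])
{
    assert(line != NULL && id != NULL);
    return sscanf(line, "%9[^\t]", id) == 1;
}
```

If the fixed-width form is kept, the buffer has to be zeroed beforehand (as `CONTRACTS c[10000] = { 0 };` does in the question) or terminated by hand, and every ID has to be exactly 9 characters long.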

What you could do is use the read() function to read the file one byte at a time, and check whether each byte equals '\t' (ASCII 9), which is the tab character.

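#include <fcntl.h>   /* open, O_RDONLY */
#include <stdio.h>   /* perror */
#include <stdlib.h>  /* exit */
#include <unistd.h>  /* read, close */

char byte;
int file_descriptor = open("filename.txt", O_RDONLY);
if (file_descriptor == -1) {
    perror("open");
    exit(EXIT_FAILURE);
}
while (1) {
    ssize_t test = read(file_descriptor, &byte, 1);
    if (test == -1) {
        perror("read");
        exit(EXIT_FAILURE);
    }
    if (test == 0) break;   /* end of file */
    /* add the byte to the specific field of your struct */
}
close(file_descriptor);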

This seems like a good solution to me because you read the file byte by byte, and each byte corresponds to a char. Don't forget to check for '\n', which means you have finished reading the current line.

Nark
  • While this technique works, reading a single character at a time is more costly than using the standard I/O library which uses buffers to read multiple characters at a time when possible, and doles out those characters as needed. Even `getc()` or `getchar()` uses that buffering. On small files (up to a few kilobytes), it doesn't matter very much. On big files (megabytes and bigger), it is a big difference. – Jonathan Leffler Feb 03 '18 at 23:12
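
For comparison, the same character-by-character idea can be written on top of buffered standard I/O with `getc()`. The following is a sketch only: `read_record`, `NFIELDS` and `FIELD_CAP` are names invented for this example, and the per-field capacity of 64 bytes is an assumed generous bound, not something taken from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NFIELDS   4
#define FIELD_CAP 64   /* assumed upper bound per field, including '\0' */

/* Read one tab-separated line from fp using getc().
   Returns the number of fields on the line (capped at NFIELDS; an empty
   line counts as one empty field), or EOF when there is no more input.
   Extra fields and characters beyond FIELD_CAP - 1 are dropped. */
int read_record(FILE *fp, char fields[NFIELDS][FIELD_CAP])
{
    assert(fp != NULL && fields != NULL);

    int field = 0;     /* index of the field being filled */
    size_t len = 0;    /* characters stored in the current field */
    int seen = 0;      /* set once any character has been read */
    int ch;

    /* Zero everything so each field is always null-terminated. */
    memset(fields, 0, sizeof(char[NFIELDS][FIELD_CAP]));

    while ((ch = getc(fp)) != EOF) {
        seen = 1;
        if (ch == '\n')
            break;                 /* end of this record */
        if (ch == '\t') {
            ++field;               /* move on to the next field */
            len = 0;
            continue;
        }
        if (field < NFIELDS && len + 1 < FIELD_CAP)
            fields[field][len++] = (char)ch;
    }

    if (!seen)
        return EOF;
    return field < NFIELDS ? field + 1 : NFIELDS;
}
```

A caller would loop while `read_record(file, fields) == NFIELDS` and copy the four strings into the struct, checking that each one fits its destination array.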

OK guys, it's working now. Credit to @WeatherVane and @StefanoFormicola. This is what I changed:

char line[120];
        while (fgets(line, sizeof line, file) && i < 20) /*This 20 is just to read 20 lines*/
        {
            /* %9c and %26c do not add '\0'; this relies on c[] being zero-initialized */
            if (sscanf(line, "%9c %59[^\t] %26c %5s",
                c[i].id_contract,
                c[i].name,
                c[i].id_local,
                c[i].power) == 4) /* all four fields must be converted */
            {
                printf("Contract ID = %s\n", c[i].id_contract);
                printf("Name = %s\n", c[i].name);
                printf("Local ID = %s\n", c[i].id_local);
                printf("Power = %s\n\n", c[i].power); /* blank line between records */
                ++i;
            }
        }

This is now my output:

Contract ID = 609140307
Name = Carla Aguiar Cunha Paredes Pires
Local ID = PT 309 181 020 533 713 02F
Power = 13.8

Contract ID = 814991297
Name = Ricardo Andrade Nogueira Matos
Local ID = PT 099 597 635 807 514 05D
Power = 10.35

Contract ID = 843818099
Name = Eduardo Carneiro Paredes Clementino Castro
Local ID = PT 829 961 009 571 587 02D
Power = 5.75

Thank you guys so much!

  • Be aware that `%9c` does not null terminate the input, and will read 9 characters regardless, even if the ID number is only 8 or 7 bytes long. You may get away without null termination if the structure is all null bytes before you read the data into it, but be cautious. – Jonathan Leffler Feb 03 '18 at 21:19
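
One way to sidestep the `%c` termination issue entirely is to use a bounded scanset for every field. The sketch below assumes the corrected array sizes from the first answer. The names `Contract` and `scan_contract` are hypothetical. Each `%N[^\t]` stores at most N characters and always appends `'\0'`, and the spaces in the format skip the tab between fields.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical struct; each size is the longest expected value plus '\0'. */
typedef struct {
    char id_contract[10];
    char name[60];
    char id_local[27];
    char power[6];
} Contract;

/* Parse one tab-separated line.
   Returns 1 if all four fields were converted, 0 otherwise. */
int scan_contract(const char *line, Contract *out)
{
    assert(line != NULL && out != NULL);
    /* Widths are one less than the array sizes to leave room for '\0'.
       The last scanset also excludes '\n' so the newline is not stored. */
    return sscanf(line, "%9[^\t] %59[^\t] %26[^\t] %5[^\t\n]",
                  out->id_contract,
                  out->name,
                  out->id_local,
                  out->power) == 4;
}
```

One limitation to keep in mind: whitespace in a `scanf` format matches zero or more whitespace characters, so a field longer than its width is not rejected. Its leftover characters spill into the next field, which is why checking the data against the expected lengths is still worthwhile.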