
After checking an existing answer here and running its code in my environment, I found that it runs into the same issue as my own code. The issue is that whenever I have an input file similar to this...

 FILE A                 
|---------------------| 
| ABCDE               | 
| abcde               | 
|---------------------| 

I get an extra newline in my destination file:

 FILE B                 
|---------------------| 
| edcba               | 
|                     | 
| EDCBA               | 
|---------------------| 

After debugging my code, I could see that '\n' was being copied into the destination file twice. I'd like to understand why lseek is doing this.

My code is below. The critical section in question is the do/while loop. I'm pretty sure my thought process is sound, because the code from the answer I looked up produces exactly the same results.

#define CHAR_SIZE 2048 
#define BUFF_SIZE 1    
#define PERMS 0666   


int main(int argc, char* argv[]){                                          
    if (argc < 3) {                                                        
        return 1;                                                          
        printf ("ERROR: not enough arguments\n");                          
    }                                                                      
    int src_file = 0; int dest_file = 0;                                   
    int n = -1;                                                            
    if((src_file=open(argv[1],O_RDONLY)) < -1) return 1;                   
    if((dest_file = creat(argv[2], PERMS)) < -1) return 1;                 
    printf ("The filesize is: %d\n", lseek(src_file, (off_t) 0, SEEK_END));
    char buffer[BUFF_SIZE];                                                
    lseek (src_file,n--,SEEK_END);                                         
    do {                                                                   
        read(src_file,buffer,BUFF_SIZE);                                   
        printf ("%s", buffer);                                             
        write(dest_file,buffer,BUFF_SIZE);                                 
    }while (lseek (src_file,n--,SEEK_END) > -1);                           
    printf("\n");                                                          
    return 0;                                                              

}                       
tisaconundrum
  • My suggestion would be to print the characters using `printf("%02x\n", buffer);`. You can't use `"%s"` because the buffer is only one character, so it doesn't have a NUL terminator, and isn't a valid string. And add the output from that `printf` to the question, so we can see what's in the input file. – user3386109 Mar 23 '16 at 04:35
  • Also a little oddity about printf is it only flushes writes when it meets a "\n" in the format string. If there is anything left to print it does the same again. This coupled with the \n that is after the second string isn't removed by the looks of things – Careful Now Mar 23 '16 at 04:39
  • FILE A is the input file, FILE B is the output file – tisaconundrum Mar 23 '16 at 05:49
  • `printf("%02x\n", buffer);` only gives me an output of todays date. 22feb3, over and over again. – tisaconundrum Mar 23 '16 at 05:50
  • Side note, you definitely shouldn't be doing a syscall for every byte. Either do the buffering yourself, or just use `fopen/fseek` rather than `open/lseek`. – o11c Mar 23 '16 at 05:51
  • @o11c This is a homework assignment and i'm constrained to using syscalls. Can you elaborate on what you mean by "do the buffering yourself"? – tisaconundrum Mar 23 '16 at 05:52
  • instead of `printf("%02x\n", buffer)` do `printf("%c", *buffer)` or `putchar(*buffer)` this will output single characters correctly. – tisaconundrum Mar 23 '16 at 05:58
  • Should be `printf("%02x\n", *buffer);` or `printf("%02x\n", buffer[0]);` The point of the `%02x` is to show the non-printing characters, since that seems to be where the problem is. – user3386109 Mar 23 '16 at 05:59
  • Thank you for clarifying @user3386109 Here is my output `The filesize is: 12` `65 64 63 62 61 0a 0a 45 44 43 42 41 ` – tisaconundrum Mar 23 '16 at 06:15

2 Answers


You have a number of problems. Your extra newline comes from the fact that POSIX defines a line as ending with a newline (though not all files comply). Therefore, to get rid of your extra newline, you need:

int n = -2;

instead of

int n = -1;

(with n = -1, the first read picks up the POSIX line-end and writes it to your output file)
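
Because not every file ends in a newline, hard-coding -2 can drop a real character. A minimal sketch of choosing the starting offset at run time follows; `last_byte_is_newline` is a hypothetical helper name, not part of the original code, and it assumes a seekable regular file.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not in the original code).
 * Returns 1 if the last byte of fd is '\n', 0 if it is not (or the file
 * is empty), and -1 if lseek or read fails. */
int last_byte_is_newline(int fd)
{
    char c;
    off_t size = lseek(fd, 0, SEEK_END);

    if (size == (off_t)-1)
        return -1;
    if (size == 0)
        return 0;
    if (lseek(fd, -1, SEEK_END) == (off_t)-1)
        return -1;
    if (read(fd, &c, 1) != 1)
        return -1;
    return c == '\n';
}
```

It could then be used as `int n = (last_byte_is_newline(src_file) == 1) ? -2 : -1;` before the first lseek.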

Your additional problems are that you will not know if your file open fails, because you are checking the return value incorrectly:

// if ((src_file = open (argv[1],O_RDONLY)) < -1) return 1;
// if ((dest_file = creat (argv[2], PERMS)) < -1) return 1;

Both open and creat return -1 on failure, and -1 < -1 is never true. Therefore you need:

if ((src_file = open (argv[1],O_RDONLY)) == -1) return 1;
if ((dest_file = creat (argv[2], PERMS)) == -1) return 1;
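
If you also want to know why the call failed, a small sketch like the following reports the reason through perror. The names `open_source` and `create_dest` are hypothetical wrappers introduced only for illustration.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical wrapper: open path read-only. On failure, print the
 * reason to stderr and return -1 (the value open() itself returns). */
int open_source(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        perror(path);
    return fd;
}

/* Hypothetical wrapper: create (or truncate) path with mode 0666,
 * reporting failures the same way. */
int create_dest(const char *path)
{
    int fd = creat(path, 0666);

    if (fd == -1)
        perror(path);
    return fd;
}
```

The caller still compares the result against -1, exactly as in the two corrected lines above.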

Next, your buffer will NOT hold a string. At a minimum you need a 2-char array to hold 1 char plus the nul-terminating character. The following will not work:

#define BUFF_SIZE 1
...
char buffer[BUFF_SIZE];
...
    printf ("%s", buffer);

You have passed a non-terminated character array to printf. If you compile with warnings enabled (e.g. -Wall -Wextra), your compiler may warn you about this. At a minimum, to make the above work you need:

#define BUFF_SIZE 2

and you must ensure you nul-terminate buffer before calling printf. You can accomplish that with:

char buffer[BUFF_SIZE] = "";
...
    read (src_file, buffer, BUFF_SIZE - 1);
    buffer[1] = 0;
    printf ("%s", buffer);
    write (dest_file, buffer, BUFF_SIZE - 1);

(You should also check the return of read and write to validate that you are reading/writing the number of bytes you think you are.)
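
One way to do that validation is sketched below. `copy_byte_from_end` is a hypothetical helper, and it assumes a seekable regular file, where an lseek failure means the offset points before byte 0.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the byte at SEEK_END-relative offset `off`
 * (a negative value) from src to the current position of dst.
 * Returns 1 on success, 0 if lseek fails (for a regular file this means
 * the offset is before byte 0), and -1 if read or write does not
 * transfer exactly one byte. */
int copy_byte_from_end(int src, int dst, off_t off)
{
    char c;

    if (lseek(src, off, SEEK_END) == (off_t)-1)
        return 0;
    if (read(src, &c, 1) != 1)
        return -1;
    if (write(dst, &c, 1) != 1)
        return -1;
    return 1;
}
```

A loop such as `while ((r = copy_byte_from_end(src_file, dest_file, n--)) == 1) ;` then stops with r == 0 at the start of the file and r == -1 on an I/O error.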

Putting the corrections together (and showing you can eliminate the buffer altogether and simply use an int as a buf char) you could do the following:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

#define CHAR_SIZE 2048
#define BUFF_SIZE 2
#define PERMS 0666

int main(int argc, char **argv) {

    if (argc < 3) {
        printf ("ERROR: not enough arguments\n");
        return 1;
    }

    int src_file = 0;
    int dest_file = 0;
    int n = -2;
    char nl = '\n';  /* explicit newline when writing in reverse */

    // if ((src_file = open (argv[1],O_RDONLY)) < -1) return 1;
    // if ((dest_file = creat (argv[2], PERMS)) < -1) return 1;
    if ((src_file = open (argv[1],O_RDONLY)) == -1) return 1;
    if ((dest_file = creat (argv[2], PERMS)) == -1) return 1;

    printf ("The filesize is: %ld\n", (long) lseek(src_file, (off_t) 0, SEEK_END));

    lseek (src_file, n--, SEEK_END);
#ifdef WBUFCHAR
    int bufchar = 0;    /* zeroed: read() fills only one byte (the low byte on little-endian) */
    do {    /* validate both read and write */
        if (read (src_file, &bufchar, BUFF_SIZE - 1) != BUFF_SIZE - 1) {
            fprintf (stderr, "error: read failure.\n");
            return 1;
        }
        putchar (bufchar);
        if (write (dest_file, &bufchar, BUFF_SIZE - 1) != BUFF_SIZE - 1) {
            fprintf (stderr, "error: write failure.\n");
            return 1;
        }
    } while (lseek (src_file, n--, SEEK_END) > -1);
#else
    char buffer[BUFF_SIZE];
    do {    /* validate both read and write */
        if (read (src_file, buffer, BUFF_SIZE - 1) != BUFF_SIZE - 1) {
            fprintf (stderr, "error: read failure.\n");
            return 1;
        }
        buffer[1] = 0;          /* nul-terminate */
        printf ("%s", buffer);
        if (write (dest_file, buffer, BUFF_SIZE - 1) != BUFF_SIZE - 1) {
            fprintf (stderr, "error: write failure.\n");
            return 1;
        }
    } while (lseek (src_file, n--, SEEK_END) > -1);
#endif
    /* explicitly write the newline you removed earlier */
    if (write (dest_file, &nl, BUFF_SIZE - 1) != BUFF_SIZE - 1) {
        fprintf (stderr, "error: write failure.\n");
        return 1;
    }
    putchar ('\n');
    return 0;
}

Note: compile as normal to use your buffer; compile with -DWBUFCHAR to use the bufchar code.

Example Compilation and Output

$ cat testabcde.txt
ABCDE
abcde

$ gcc -Wall -Wextra -Ofast -o bin/extranl extranl.c

$ ./bin/extranl testabcde.txt testrev.txt
The filesize is: 12
edcba
EDCBA

$ cat testrev.txt
edcba
EDCBA

$ gcc -Wall -Wextra -Ofast -DWBUFCHAR -o bin/extranlwbc extranl.c

$ ./bin/extranlwbc testabcde.txt testrevwbc.txt
The filesize is: 12
edcba
EDCBA

$ cat testrevwbc.txt
edcba
EDCBA

Give that a try and let me know if you have further questions.

David C. Rankin
  • Thank you for your answer, this helped me understand what is happening intrinsically for print. however, in my case i can't use a buffer size of 2 because it creates spaces between each of my letters in my `dest_file.txt`. – tisaconundrum Mar 23 '16 at 06:12
  • Take a look at the full code I posted. Note the second update with the `nl` variable – David C. Rankin Mar 23 '16 at 06:18
  • I'm getting an output of this `dcba\n\nEDCBA` are those extra parameters `gcc -Wall -Wextra -finline-functions...` supposed to fix part of the problems i'm getting with writing to files? – tisaconundrum Mar 23 '16 at 06:25
  • No. I removed the `-finline-functions` that and the `Ofast` are just compiler optimizations. The `-Wall -Wextra` enables compiler *warnings* (which you should use *every* time). What operating system are you on? (please not Windows) Have you tried with *exactly* the code I posted? I have confirmed both in the output above. – David C. Rankin Mar 23 '16 at 06:31
  • MinGW Windows, (Unfortunately) I get the same problem in linux as well. I'm using QTcreator, and it gives me all warnings – tisaconundrum Mar 23 '16 at 06:36
  • What version of `gcc`? if earlier than `4.6` get rid of `-Ofast` and use `-O3` if you want full optimization. This code compiles `without a single warning`. So if you are getting warnings, it isn't with this code (you are probably missing a header or something). What warnings do you get? – David C. Rankin Mar 23 '16 at 06:38
  • You know, I think I figured out your `dcba\n\nEDCBA`. You are using a data file you created on windows with `ABCDE\r\nabcde\r\n`. When you create a file in say `Notepad` windows inserts `DOS` line-ends, which are *carriage-return* then *newline* (`\r\n`), so I suspect that is where your character funniness is coming from. Create the datafile in your `mingw` setup with `printf "ABCDE\nabcde\n" > testabcde.txt` and then run the code again. (there is nothing wrong with `mingw`, it is a fine environment with `gcc` as the compiler -- old, but fine) – David C. Rankin Mar 23 '16 at 06:44
  • Thanks David, this makes sense coming back and rereading your comments. – tisaconundrum Mar 23 '16 at 15:30
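
The DOS line-end suspicion in the comments above can be checked directly. The sketch below counts `\r\n` pairs in a buffer of raw file bytes; `count_crlf` is a hypothetical helper, and it assumes the bytes were read without any text-mode translation.

```c
#include <stddef.h>

/* Hypothetical helper: count "\r\n" pairs in a buffer of raw file bytes.
 * A non-zero result means the file uses DOS line-ends. */
size_t count_crlf(const char *buf, size_t len)
{
    size_t count = 0;

    for (size_t i = 0; i + 1 < len; i++)
        if (buf[i] == '\r' && buf[i + 1] == '\n')
            count++;
    return count;
}
```

For the two-line test file discussed here, a result of 2 would point to a file saved with DOS line-ends, and 0 to one created with `printf`.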
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>

#define CHAR_SIZE 2048                                                      
#define BUFF_SIZE 1                                                         
#define PERMS 0666                                                          

int main(int argc, char* argv[]){                                           
    if (argc < 3) {                                                         
        printf ("ERROR: not enough arguments\n");                           
        return 1;                                                           
    }                                                                       
    int src_file = 0; int dest_file = 0;                                    
    int n = -1;                                                             
    if((src_file=open(argv[1],O_RDONLY)) == -1) return 1;                   
    if((dest_file = creat(argv[2], PERMS)) == -1) return 1;                 
    printf ("The filesize is: %ld\n", (long) lseek(src_file, (off_t) 0, SEEK_END));
    char buffer[BUFF_SIZE];                                                 
    lseek (src_file,n--,SEEK_END);                                          
    do {                                                                    
        read(src_file,buffer,BUFF_SIZE);                                    
        printf("%02x\n", *buffer);                                          
        if (*buffer == '\n')                                                
            n--;                                                            
        write(dest_file,buffer,BUFF_SIZE);                                  
    }while (lseek (src_file,n--,SEEK_END) > -1);                            
    printf("\n");                                                           
    return 0;                                                               

}                                                                           

This solution was the best way to fix the extra newline white space (\n\n) in my output file.

By adding this check to the do/while loop, I can compensate for the extra \n that is added:

        if (*buffer == '\n')
            n--;
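
The loop above still issues one lseek, one read, and one write per byte. As a rough illustration of the "do the buffering yourself" suggestion from the comments on the question, the sketch below reads the file backwards in chunks using only syscalls. `reverse_copy` and `CHUNK` are hypothetical names, and the function reverses every byte, line-ends included, so the newline handling discussed above still applies.

```c
#include <sys/types.h>
#include <unistd.h>

#define CHUNK 4096   /* assumed chunk size, not from the original code */

/* Hypothetical helper: write the bytes of src to dst in reverse order,
 * one chunk per read/write pair instead of one byte.
 * Returns 0 on success, -1 on any lseek/read/write failure. */
int reverse_copy(int src, int dst)
{
    char buf[CHUNK];
    off_t pos = lseek(src, 0, SEEK_END);    /* bytes not yet copied */

    if (pos == (off_t)-1)
        return -1;
    while (pos > 0) {
        size_t len = pos < CHUNK ? (size_t)pos : CHUNK;
        size_t got = 0;
        size_t put = 0;

        pos -= (off_t)len;
        if (lseek(src, pos, SEEK_SET) == (off_t)-1)
            return -1;
        while (got < len) {                 /* read may return short */
            ssize_t r = read(src, buf + got, len - got);
            if (r <= 0)
                return -1;
            got += (size_t)r;
        }
        /* reverse the chunk in place */
        for (size_t i = 0, j = len - 1; i < j; i++, j--) {
            char t = buf[i];
            buf[i] = buf[j];
            buf[j] = t;
        }
        while (put < len) {                 /* write may return short */
            ssize_t w = write(dst, buf + put, len - put);
            if (w <= 0)
                return -1;
            put += (size_t)w;
        }
    }
    return 0;
}
```

With this approach a 12-byte file needs one read and one write in total, rather than twelve of each.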
tisaconundrum