
I would like to write a program in C that gets the file content via stdin and reads it line by line and, for each line, converts it to an array of 8-bit integer values.

I also would like to be able to do the reverse process. After working with my array of 8-bit values, I would like to convert it again to "lines" that would be organized as a new buffer.

So basically, I would like to convert a char * line to an int array[] and back (an int array[] to a char * line) while keeping the data consistent, so that when I create the file again from the conversions, the file is valid. By valid I mean that the conversion from int array[] to char * line generates the same content as the original char * line, for each line read from stdin.

My code is currently as follows:

#include <stdio.h>
#include <stdlib.h>

int main() {
    FILE *stream;
    char *line = NULL;
    size_t len = 0;
    ssize_t read;

    stream = stdin;
    if (stream == NULL)
        exit(EXIT_FAILURE);

    while ((read = getline(&line, &len, stream)) != -1) {
        char * array = line_to_array(line);
        // here I include the rest of my code
        // where I am going to use the generated array
        // ...
    }

    free(line);
    fclose(stream);
    exit(EXIT_SUCCESS);
}

The line_to_array function would be the one to convert the "line" content to the array of integers. In a second file, I would just do the opposite.

The mechanics of the process would be like this:

The first program (first.c) would receive the file content via stdin. By reading it with getline, I would convert each line to an array of integers and send it to a second program (second.c), which would convert each array back to a char * buffer and then reconstruct the file.

In the terminal, I would run it like this:

./first | ./second

I appreciate any help on this matter.

Thank you.

DWS
  • What is the difference between a line of ASCII text in a buffer and an array of 8-bit values, (other than maybe a terminating char)? – Martin James Jul 17 '17 at 18:45
  • @martin-james I need to manipulate the data as numbers since I am going to turn the numbers into the input of some equations. I want to mathematically operate on them. That's why I can't work with ASCII text. Should I cast each char to an int? I just wanted to see if there is a preferred method to do so. – DWS Jul 17 '17 at 19:51
  • It would be helpful if an example of a line of input were provided. Do all lines contain the same number of entries? Are line entries decimal digits? Non-negative? Characters of the Latin Alphabet? etc.... – ad absurdum Jul 17 '17 at 20:14
  • @DavidBowling The binary data can be represented in many ways (binary string, ASCII text, array of integers, etc). The number of values on each line depends on the original data, so it will probably be different for every line. As a dummy example, imagine that I have one line 'buffer' that looks like this: `"\r}\"\xEE"` . In Ruby, I can "unpack" this line and get the following array: `array = line.unpack("C*")` which gives me the output: `[13, 125, 34, 238]` . As I said before, I could cast the char * line to integer but I don't know if there is a better way. – DWS Jul 17 '17 at 20:18
  • You should know that C places few restrictions on the encoding used for the execution character set; it need not be ASCII or UTF-8 (likely, though). A `char` is an arithmetic type, and you can do mathematical operations on these values; no need to cast. You may want to store results in a wider type, such as `int`, to avoid overflow issues (depending on the nature of the mathematical manipulations). – ad absurdum Jul 17 '17 at 20:37
  • Yes @DavidBowling , I will need to store it in an int type due to the mathematical manipulations. But I got what you said and I will try to operate on the char without casting for testing purposes. Thank you. – DWS Jul 17 '17 at 20:47
  • As far as I can tell, what you propose is fine. In an expression, `char` values will be converted to `int` implicitly. It may ease your mind to read about [the integer promotions](http://port70.net/~nsz/c/c11/n1570.html#6.3.1.1) and [the usual arithmetic conversions](http://port70.net/~nsz/c/c11/n1570.html#6.3.1.8). – ad absurdum Jul 17 '17 at 21:20
  • `line` is already an array of `char` that is a signed 8-bit integer. – Havenard Jul 18 '17 at 19:09

1 Answer


I believe you may already know that the name of an array behaves like a kind of constant pointer: in most expressions it is converted to a pointer to its first element. You can verify this with the following code:

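#include <stdio.h>

int main(void)
{
    char hello[] = "hello world!";

    /* hello is used as a pointer to its first element here */
    for( int idx=0; *(hello + idx) != 0; idx++ )
    {
        printf("%c", *(hello + idx));
    }
    printf("\n");
    return 0;
}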

So there is no reason to convert a character pointer to an array. For your information, a char variable is typically 8 bits of data in C, and it can contain an integer value that represents a character: 65 represents 'A' in ASCII.

Secondly, this link may help you to understand how to convert between c string and std::string.

On second thought, your input file may be a Unicode (for example UTF-8) encoded file that uses multi-byte character codes. In that case, you may not be able to use getline() to read the strings from the file. If so, please refer to this question: Reading unicode characters.


I hope the following code helps you understand the char type, arrays, and pointers in C/C++:

#include <cstdio>
#include <string>

int main()
{
    std::string hello("Hello world");
    const char *ptr = hello.c_str();

    // Pointer arithmetic: print the numeric value of each character.
    for( std::size_t idx=0; idx < hello.size(); idx++ )
    {
        std::printf("%3d ", *(ptr + idx));
    }
    std::printf("\n");

    // Array indexing: ptr[idx] is equivalent to *(ptr + idx).
    for( std::size_t idx=0; idx < hello.size(); idx++ )
    {
        std::printf("%3d ", ptr[idx]);
    }
    std::printf("\n");
    return 0;
}
Joohae Kim
  • @DavidBowling Thank you for your comment. I wasn't able to explain the relationship between a pointer and the name of an array, so I told 'a kind of lie'. :D – Joohae Kim Jul 17 '17 at 19:44
  • Hi @joohae-kim , thank you for your answer. The only reason I am looking for numbers is that I need to mathematically operate on them. I am creating a way to transform data from files mathematically, so I need to work with numbers. In Ruby, I am able to do that with the method `unpack` for String and `pack` for Array. My Ruby code does the following: `array = line.unpack("C*")` and to revert the process I do: `line = array.pack("C*")` . Basically, I am working with the same data but in different formats. I just wanted to check if there is any recommended way to do this in C. – DWS Jul 17 '17 at 20:02
  • @DWS, in C/C++, a null-terminated string is a sequence of char, and char is an 8-bit integer, so there is no need to convert between them. – Joohae Kim Jul 18 '17 at 19:12
  • @DWS, I think you could assume that `const char* array = line.c_str();` may be similar to `array = line.unpack("C*")` and that `std::string line(array)` is also similar to `line = array.pack("C*")`. It has been a very long time since I used Ruby, so I'm not 100% sure. I hope the additional answer and comments help you. – Joohae Kim Jul 18 '17 at 19:16
  • @DWS, can I see a line of input data? I think we are not on the same page. – Joohae Kim Jul 18 '17 at 19:20
  • The input data comes from a video stream, which I receive via stdin and I read it line by line. Each line is an array of char and I mathematically operate on those generating numbers below 32 bits. Placing the binary data here wouldn't be as good to read as I would like. But with the information I gave you now, I believe you can have a better idea. – DWS Jul 18 '17 at 19:52