
I have two questions:

  • I have data in a binary file. I want to read the first 8 bytes into a signed long int using the read function, but I could not. Do you know how I can do that?

  • How can I directly read a block of data into a string? Can I read it as shown in this example:

     ifstream is;
     is.open("test.txt", ios::binary);

     string str;
     is.read(str.c_str, 40); // 40 bytes should be read
    

3 Answers


I want to read the first 8 bytes into a signed long int using the read function, but I could not. Do you know how I can do that?

Don't assume long is wide enough; it often isn't. long long is guaranteed to be at least 64 bits wide, though:

long long x;
is.read(reinterpret_cast<char *>(&x), 8);

Mind you, this is still incredibly non-portable due to varying integer sizes and endiannesses.
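
If the byte order used in the file is known, you can sidestep the endianness issue by assembling the value from individual bytes. This is only a sketch, and it assumes the file stores the number little-endian (least significant byte first):

#include <cstdint>
#include <istream>

// Read 8 little-endian bytes and assemble them into an int64_t,
// independently of the host machine's native byte order.
std::int64_t read_le64(std::istream &is)
{
    unsigned char bytes[8];
    is.read(reinterpret_cast<char *>(bytes), 8);

    std::uint64_t value = 0;
    for (int i = 7; i >= 0; --i)
        value = (value << 8) | bytes[i]; // byte 0 is the least significant

    return static_cast<std::int64_t>(value);
}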

As for your second question, try

char buf[41];
is.read(buf, 40);
// check for errors
buf[40] = '\0';

std::string str(buf);

or, safer

char buf[41];
is.get(buf, sizeof(buf), '\0');
std::string str(buf);
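
If you want to skip the intermediate char buffer, here is a sketch of reading straight into the string's own storage (assuming C++11, where std::string is guaranteed to be contiguous):

std::string str(40, '\0');   // allocate 40 bytes up front
is.read(&str[0], 40);        // read directly into the string's buffer
str.resize(is.gcount());     // shrink to the number of bytes actually read
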
Fred Foo
  • If I try to read binary data until I see a delimiter, how should I read it? In that case I have no fixed length such as 40. –  Nov 05 '11 at 11:33
  • @fatai: use `get` to read up to a delimiter. Or read a `char` at a time with `>>` and append everything you read to an `std::string` until you hit the delimiter. – Fred Foo Nov 05 '11 at 11:35
  • Well, let's not assume anything and use sizeof(x). Or, safer yet: `std::getline(is, str, '\0');` – Martin York Nov 05 '11 at 11:37
  • Also, if you're on a 32-bit machine, it's quite plausible that long will be as wide as int, i.e., 32 bits. Long long might be safer, as it is (on my x64 machine) no wider than long (64 bits). – Martin Törnwall Nov 05 '11 at 11:37
  • @larsmans, for the second way: did you consider the big/little-endian concept? Does the second way work for both types of machine? –  Nov 05 '11 at 11:38
  • @LokiAstari: when using `sizeof`, the program might read less than 8 bytes, so the behavior is incorrect. The OP should really rethink their requirements. Changed the example to use `long long`, though. – Fred Foo Nov 05 '11 at 11:41
  • @fatai: endianness doesn't apply to strings. – Fred Foo Nov 05 '11 at 11:42
  • "Mind you, this is still incredibly non-portable due to varying integer sizes and endiannesses." Are there any safe way reading byte to "long long" without considering endianness concept. –  Nov 05 '11 at 11:48
  • Well, the code should then contain `assert(sizeof(x) == 8)`. If this is not true, the code will not work anyway. But reading less and getting bad input is better than reading too much and trashing memory (sort of). – Martin York Nov 05 '11 at 11:51
  • @fatai: no. You should consider the endianness of binary data if you want to write portable software. – Fred Foo Nov 05 '11 at 11:58
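
A minimal sketch of the delimiter-based reading suggested in the comments above, assuming the delimiter is a single known byte (here '\0'):

std::string str;
std::getline(is, str, '\0'); // appends bytes to str until '\0' or end of file;
                             // the delimiter is consumed but not stored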

I'm sure you mean reading 8 bytes into a 64-bit integer instead, and there are a variety of ways to accomplish this. One way is to use a union:

#include <cstdint> // for uint64_t

union char_long {
  char chars[8];
  uint64_t n;
};

// Extract 8 bytes and combine them into a 64-bit number by using the
// internals of the union structure (host byte order).
char_long rand_num;
for (int i = 0; i < 8; i++) {
  rand_num.chars[i] = in.get(); // `in` is the istream.
}

Now rand_num.n will have the integer stored so you can access it.
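
A closely related sketch, not from the original answer, does the same thing with std::memcpy; it likewise produces a host-byte-order value but avoids reading a union member other than the one last written:

#include <cstdint>
#include <cstring> // std::memcpy

char chars[8];
in.read(chars, 8);                // `in` is the istream, as above

std::uint64_t n;
std::memcpy(&n, chars, sizeof n); // reinterpret the 8 raw bytes as a 64-bit value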

As for the second question: read in the bytes and assign them to the string:

const int len = 5; // Some amount.
char *buf = new char[len];
ifstream in("/path/to/file", ios::binary);
in.read(buf, len);
string str;
str.assign(buf, in.gcount()); // use the number of bytes actually read; buf is not NUL-terminated
delete[] buf;
Morten Kristensen
  • Provided that the integer in the file has the same endianness as the machine. Otherwise you'll get garbage. If you don't care about byte ordering (i.e., you know it to be the same as that of the machine), why not simply pass a pointer to a 64-bit integer directly to istream::read()? – Martin Törnwall Nov 05 '11 at 11:27

You should be concerned about the portability of the code and of the data: if you exchange binary files between various machines, the binary data may be read back as garbage (e.g. because of endianness and word-size differences). If you only read binary data on the same machine that wrote it, it is OK.

Another concern, especially when the data is huge and/or costly, is robustness with respect to the evolution of your code base. For instance, if you read a binary structure and later have to change the type of one of its fields from int (or int32_t) to long (or int64_t), your binary data file becomes useless (unless you write specific conversion routines). If the binary file was costly to produce (e.g. it needed an experimental device, or a costly computation, to create it), you can be in trouble.

This is why structured textual formats (which are not a silver bullet, but are helpful) or database management systems are used. Structured textual formats include XML (which is quite complex), JSON (which is very simple), and YAML (complexity and power between those of XML and JSON). Textual formats are also easier to debug (you can look at them in an editor). There exist several free libraries to deal with these data formats. Databases are often more or less relational and SQL-based. There are several free DBMS packages (e.g. PostgreSQL or MySQL).
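
As a minimal illustration of the textual approach (not tied to any particular library; the file name is made up), the 8-byte value from the question could be stored and reloaded as a decimal line of text, which does not depend on endianness or on the exact integer type used by the writer:

#include <cstdint>
#include <fstream>

int main()
{
    // Write the value as human-readable text.
    std::int64_t value = 1234567890123LL;
    std::ofstream out("value.txt");
    out << value << '\n';
    out.close();

    // Read it back; the textual form survives changes of byte order
    // and of the integer width used on the writing side.
    std::int64_t read_back = 0;
    std::ifstream in("value.txt");
    in >> read_back;
}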

Regarding the portability of binary files (between various machines), you could be interested in serialization techniques, formats (XDR, ASN.1), and libraries (e.g. s11n and others).

If space or bandwidth is a concern, you might also consider compressing your textual data.

Basile Starynkevitch