-4

I was wondering whether there is a C++ equivalent to Java's .getBytes() method. I'm reading a .txt file and need to convert each line into bytes.

Thanks in advance!

2 Answers

0

In C++ a char is a byte. And so a std::string is already a sequence of bytes.

However, you may want a sequence of unsigned char.

One way is to just copy the byte values from the string, e.g. into a std::vector:

#include <string>
#include <vector>
using Byte = unsigned char;
std::vector<Byte> const bytes( s.begin(), s.end() );  // s is the std::string holding the line
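
For the file-reading case in the question, the same copy can be applied per line. The following is only a minimal sketch: the helper names `to_bytes` and `read_lines_as_bytes` are made up for illustration, and it assumes you want the raw bytes exactly as they appear in the file, with no re-encoding.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

using Byte = unsigned char;

// Hypothetical helper: copy the chars of one line into a vector of unsigned bytes.
std::vector<Byte> to_bytes(const std::string& line) {
    return std::vector<Byte>(line.begin(), line.end());
}

// Hypothetical helper: read every line from a stream and convert each one to bytes.
// std::getline drops the '\n' terminator, so it is not part of the result.
std::vector<std::vector<Byte>> read_lines_as_bytes(std::istream& in) {
    std::vector<std::vector<Byte>> result;
    std::string line;
    while (std::getline(in, line)) {
        result.push_back(to_bytes(line));
    }
    return result;
}
```

To use it on a file you would pass a `std::ifstream` opened on the .txt file. A `std::istringstream` works the same way, which is handy for trying it out.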

If you're reading the text file into a std::wstring per line, e.g. using a wide stream, then the bytes depend on your preferred encoding of that string.

In practice, except possibly on an IBM mainframe, a C++ wide string is either UTF-16 or UTF-32 encoded. For these two encodings the standard library provides specializations of std::codecvt that can convert to and from UTF-8.
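
To make "the bytes depend on the encoding" concrete, here is a hand-written sketch of the UTF-32 to UTF-8 direction. Note that this is not the `std::codecvt` facet mentioned above but a plain loop, the function name `utf32_to_utf8` is hypothetical, and it assumes the input contains only valid Unicode scalar values (it does no validation).

```cpp
#include <string>
#include <vector>

using Byte = unsigned char;

// Hypothetical helper: encode UTF-32 code points as UTF-8 bytes.
// Assumption: every code point is valid (no surrogates, at most 0x10FFFF).
std::vector<Byte> utf32_to_utf8(const std::u32string& text) {
    std::vector<Byte> out;
    for (char32_t cp : text) {
        if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
            out.push_back(static_cast<Byte>(cp));
        } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
            out.push_back(static_cast<Byte>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<Byte>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.push_back(static_cast<Byte>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<Byte>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<Byte>(0x80 | (cp & 0x3F)));
        } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out.push_back(static_cast<Byte>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<Byte>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<Byte>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<Byte>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

The same text gives a different byte sequence under a different target encoding, which is why a wide string has no single "the bytes" the way a `std::string` does.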


If you want an arbitrary encoding from a wide string, then you're out of luck as far as the C++ standard library is concerned, sorry.

Cheers and hth. - Alf
  • A Java java.lang.String also wraps an array of characters just like std::string, there is no difference in that regard. – Sean F Oct 07 '16 at 20:49
  • @SeanF: A Java `String` is Unicode encoded as UTF-16 internally. A C++ `std::string` is a sequence of bytes. That's a very big difference. – Cheers and hth. - Alf Oct 07 '16 at 20:55
  • @Cheers and hth.: In C++ you can encode strings whichever way you like. If you want a UTF-16 std::string, there is nothing stopping you. – Sean F Oct 07 '16 at 20:57
  • No it isn't, and I don't consider that much of an argument. http://stackoverflow.com/questions/11086183/encode-decode-stdstring-to-utf-16 – Sean F Oct 07 '16 at 20:59
  • You're not making sense. Nobody's said anything about restriction to ascii. And you can put a picture in a std::string if you want. Talking about that in this context is however meaningless, dumb, nonsense, just not relevant at all except on maybe a purely associative level. – Cheers and hth. - Alf Oct 07 '16 at 21:07
0

std::string::data is the equivalent.
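
A minimal sketch of reading a string's bytes through `data()`, assuming you only need to look at them and not keep a separate copy. The function name `sum_bytes` is invented for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: add up the byte values of a string via its data() pointer.
unsigned long sum_bytes(const std::string& s) {
    const char* p = s.data();  // points at the string's own storage, nothing is copied
    unsigned long total = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // char may be signed, so convert to unsigned char before using the value
        total += static_cast<unsigned char>(p[i]);
    }
    return total;
}
```

Keep in mind that the pointer is only valid while the string is alive and unmodified, whereas Java's `getBytes()` returns a newly allocated array.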

Sean F
  • Are you saying that Java `.getBytes()` produces a raw pointer? Is that what it does? – Cheers and hth. - Alf Oct 07 '16 at 20:49
  • There are no pointers in Java. getBytes just gives you direct access to the array of characters, just like std::string::data does, and in both cases you're looking at an array of bytes. In neither language is it advisable that you use these arrays rather than the original string objects. – Sean F Oct 07 '16 at 20:52
  • Java is full of pointers. That's why the Java language specification, which you should be familiar with in order to answer questions concerning Java, calls them pointers. However, Java does not have raw pointers, and that was my point here: saying that `std::string::data` is an equivalent to anything in Java, is meaningless. – Cheers and hth. - Alf Oct 07 '16 at 20:57
  • I wrote a JVM, so I am familiar with the JVM spec. I also wrote java compilers and decompilers, escape analysis tools for Java, java bytecode compression, code for Java JIT compilation, papers for real-time Java (JSR-1), Java garbage collectors, and much much more. I have forgotten more about Java than you have ever known about the language. There are no pointers in Java. Get over it. There are pointers in whatever languages you use to write the JVM. – Sean F Oct 07 '16 at 21:08
  • The Java language specification calls Java pointers, pointers. You are saying that you are not familiar with the spec, but that you view yourself as very competent. You're also maintaining that you've done a lot of Java programming but that you're unfamiliar with e.g. `NullPointerException`. – Cheers and hth. - Alf Oct 07 '16 at 21:20
  • I know the spec from back to front. https://docs.oracle.com/javase/specs/jvms/se8/html/index.html Point me to where it uses the word pointers. Yeah, I thought so, liar. – Sean F Oct 07 '16 at 21:22
  • Feel free to point to the use of the term Java pointers in the language spec, liar. https://docs.oracle.com/javase/specs/ – Sean F Oct 07 '16 at 21:25
  • Java language spec 1.0 page 1 "NullPointerException"; page 38 "The reference values (often just references) are pointers "; page 110 "Abbreviations, as in buf holding a pointer to a buffer of some kind"; etc. You don't know your stuff, SeanF. – Cheers and hth. - Alf Oct 07 '16 at 21:26
  • I had no idea there were such nitwits on this site. You point me to a different word, the name of an Exception that has existed since Java 1. And you point to the word pointer appearing twice (out of 800 pages, yes, twice in 800), in a different context, in one of those places it explains that Java uses references, not pointers. This is to prove your point that "the Java language specification uses the term pointers". You have just posted the evidence that you are a liar. Anyway, I shouldn't waste any more time on you, so I will not. – Sean F Oct 07 '16 at 21:29
  • You're saying that re terminology, the spec can be ignored. After all it doesn't repeat itself on and on, as you point out. – Cheers and hth. - Alf Oct 07 '16 at 21:35