32

What advantages does QString offer over std::string? Can a normal std::string store Unicode characters? I'm trying to write a program that will open various song files. The names of these files can be in different languages. I have heard that using a normal string in such cases will not work properly. I want to keep the application independent of Qt and reuse the code, for example, on Android. What do you suggest?

Vihaan Verma
  • 4
    http://www.joelonsoftware.com/articles/Unicode.html – Arafangion Jun 14 '12 at 08:01
  • This may help to answer part of your question: http://stackoverflow.com/questions/6028093/unicode-stdstring-class-replacement – kenny Jun 14 '12 at 08:02
  • @Arafangion: that's half of the fun... you should also add all the mess about filesystem encoding to the picture – 6502 Jun 14 '12 at 08:27
  • 5
    Nobody has mentioned that `QString` uses implicit-sharing/copy-on-write mechanisms for a good performance boost as well. – cmannett85 Jun 14 '12 at 08:28

4 Answers

41

QString allows you to work with Unicode, has more useful methods, and integrates well with Qt. It can also perform better thanks to implicit sharing (copy-on-write), as cbamber85 noted.

std::string just stores the bytes you give it; it doesn't know anything about encodings. Each element is a single byte (a char), which is not enough to represent every character of all the world's languages. The best way to store your text would probably be UTF-8, in which a character can take up a varying number of bytes, and std::string is not aware of that. This means, for example, that the length method returns the number of bytes, not characters. And this is just the tip of the iceberg...
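To make the bytes-versus-characters point concrete, here is a minimal sketch that counts code points in a std::string, assuming the string holds valid UTF-8. The helper name `utf8_code_points` is hypothetical and not part of the standard library or Qt.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a string that is
// ASSUMED to contain valid UTF-8 (no validation is performed).
// UTF-8 continuation bytes have the bit pattern 10xxxxxx, so every byte
// that does not match that pattern starts a new code point.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the five-letter word "Пісня" stored as UTF-8, `size()` reports 10 bytes while this helper reports 5 code points. Note that a code point is still not always what a user perceives as one character (combining marks, for instance), so this only narrows the gap.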

Oleh Prypin
  • I'm looking to avoid Qt stuff; is there any other way out? – Vihaan Verma Jun 14 '12 at 08:00
  • 1
    To be fair, std::string also lets you use unicode, although granted, it doesn't mean that the data is in a specific encoding. Do you mean that QString allows unicode conversions? – Arafangion Jun 14 '12 at 08:00
  • 11
    Vihaan, if you are trying to avoid Qt stuff why are you using it? If you have to interface with Qt components then using QString makes your life much easier. – Joey Jun 14 '12 at 08:01
  • 2
    @Arafangion `std::string` just stores the bytes as you give them to it, it doesn't know anything about encodings. (Which, for example, means that `length` is the number of bytes, not characters) – Oleh Prypin Jun 14 '12 at 08:02
  • @Joey I'm not using it, just looking for an alternative solution. – Vihaan Verma Jun 14 '12 at 08:03
  • @BlaXpirit: That was what I meant, essentially - You're saying that while an std::string might contain data in an encoding such as UTF-8, it has no intrinsic way of telling you that. – Arafangion Jun 14 '12 at 08:03
  • 12
    Note that `QString` has exactly the same problem as std::string: its size too is the number of UTF-16 code units, not the number of characters. – MSalters Jun 14 '12 at 08:21
  • 1
    @Joey: the reason I'm trying to avoid Qt code is to reuse it on Android. – Vihaan Verma Jun 14 '12 at 13:15
  • @VihaanVerma FWIW, there is a Qt Android port...haven't tried it myself, but possibly easier to use than C++ in the NDK: http://labs.qt.nokia.com/2011/02/28/necessitas/ – HostileFork says dont trust SE Jun 14 '12 at 15:10
  • 2
    @vy32: `std::wstring` is even worse. You don't know whether it's 2 or 4 bytes per character, and it isn't guaranteed to be Unicode anyways. At least with `std::string`, you know it's 1 byte per char, and you wouldn't always have per-codepoint access with wstring *anyways* (like in Windows where `wchar_t` is 2-byte). – Tim Čas Mar 28 '14 at 01:04
  • I like to think in the UTF-8 world: See http://www.utf8everywhere.org/. Because ICU is UTF-16, it makes sense that `QString` would be UTF-16 based. But, I prefer envisioning `std::string`s to be UTF-8 and allowing `QString` to convert, as necessary. – Dan Nissenbaum Jul 05 '15 at 13:11
  • @OlehPrypin isn't `QString` stored in UTF-16, so your `length()` criticism still holds for cases where a character requires four bytes? – Ben Sep 17 '21 at 18:38
11

If you are using the Qt framework to write your software, then you are better off using QString. The simple reason is that almost all Qt functions that work on strings accept QString. If you use std::string, you might end up in situations where you have to convert from one to the other. You should consider using std::string if your software is not using Qt at all. That will make your code more portable and not dependent on an extra framework that users will have to install to use your software.

Pankaj
  • 5
    Using a framework shouldn't be a reason to let the framework library permeate throughout the code. Converting at the UI boundary is a small price to pay to prevent vendor lock-in. – daramarak Jun 14 '12 at 12:11
  • 5
    @daramarak You are right, but then Qt is not just a UI framework. It also provides all kinds of non-UI stuff like networking, SQL, XML, scripting, etc. So depending on how extensively one is using the framework, one can take a call. – Pankaj Jun 14 '12 at 15:28
  • Just be sure you will _never_ want to unpick the business logic from the UI and reuse it in some other non-Qt application. For then there will be much wailing and gnashing of teeth. – crobar Sep 24 '19 at 10:32
9

I'm not familiar with QString, but the big advantage of std::string is that it is standard. With regard to Unicode, there's no problem storing UTF-8 in a std::string; depending on what you're doing, however, it might be better to use a std::wstring (which typically will store either UTF-16 or UTF-32).

For complex manipulations of Unicode, I would suggest ICU. But for a lot of applications, just storing UTF-8 is sufficient.
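As a sketch of why plain UTF-8 storage is often enough, the two helpers below work byte-wise on a std::string that is assumed to hold a UTF-8 file name. The names `ends_with` and `strip_extension` are hypothetical, chosen for illustration. They rely on a property of UTF-8: an ASCII byte such as '.' never occurs inside a multi-byte sequence, so searching for ASCII text cannot split a character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if `name` ends with `suffix`, compared byte by byte.
// This is safe for UTF-8 when both strings are valid UTF-8.
bool ends_with(const std::string& name, const std::string& suffix) {
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: returns `name` without its last ".ext" part.
// '.' is an ASCII byte, which never appears inside a multi-byte UTF-8
// sequence, so rfind cannot land in the middle of a character.
std::string strip_extension(const std::string& name) {
    const std::size_t dot = name.rfind('.');
    return dot == std::string::npos ? name : name.substr(0, dot);
}
```

With a name such as "Пісня.mp3", `ends_with(name, ".mp3")` is true and `strip_extension(name)` returns the intact ten-byte title. Operations that need real text semantics (case folding, normalization, collation) are where a library such as ICU comes in.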

James Kanze
  • 1
    `QString` is a reasonable intermediate. It doesn't offer full ICU functionality, but it can do simple stuff like conversions from/to UTF-8/16/32 – MSalters Jun 14 '12 at 09:18
  • 1
    +1 Sticking to the standard is always an advantage. Save the framework-specific types for uses of the framework. I was just wondering, what advantages would a UTF-16 or UTF-32 string give in the case of Unicode vs. the std::string? – daramarak Jun 14 '12 at 09:24
  • @daramarak It depends on what you're doing with strings in the code. `std::string` really isn't much more than a simple container, with a few extra functions for replacing some of the contained _bytes_. It has no knowledge of characters or encodings, much less of anything related to text. If you're manipulating text, it could be that a class which represents text would be more appropriate. Or, alternatively, a library which treats `std::string` as UTF-8 text. – James Kanze Jun 14 '12 at 12:31
  • @MSalters As of Qt5, it now fully incorporates ICU with all of its goodies. And @daramarak, one could argue that QString behavior is much more consistent across platforms than std::string, which leaves a lot for the implementation to decide. – Wiz Jan 26 '13 at 16:13
4

Technically, there is a standard string class to store any type of character: std::basic_string. std::string and std::wstring are nothing but specializations of std::basic_string for char and wchar_t. There are also the specializations std::u16string and std::u32string, which are meant for UTF-16 and UTF-32 storage.
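To illustrate how these string types differ in practice, here is a minimal sketch that encodes a single code point as UTF-16. The function name `to_utf16` is hypothetical, and the input is assumed to be a valid Unicode scalar value (no range or surrogate checks are done).

```cpp
#include <string>

// Hypothetical helper: encodes one code point as UTF-16.
// ASSUMES `cp` is a valid Unicode scalar value (<= 0x10FFFF, not a surrogate).
std::u16string to_utf16(char32_t cp) {
    std::u16string out;
    if (cp < 0x10000) {
        // Code points in the Basic Multilingual Plane fit in one code unit.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Everything else is split into a high and a low surrogate.
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
    return out;
}
```

A code point outside the Basic Multilingual Plane, such as U+1D11E (musical symbol G clef), occupies two elements in a std::u16string but only one in a std::u32string, so `size()` on a UTF-16 string counts code units rather than characters.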

Anyway, if you have to work with Qt, QString will probably always be a better alternative than any standard library string since the whole Qt framework is designed to work with it.

Morwenn
  • 1
    Now that we have `std::u16string`, `std::u16string_view`, and `QStringView`, should we be able to write our APIs to generally support any of the three and pass in `{s.data(), s.size()}`? – Ben Nov 03 '20 at 02:25