9

I have just done what appears to be a common newbie mistake:

First we read one of many tutorials that goes like this:

 #include <fstream>
 int main() {
      using namespace std;
      ifstream inf("file.txt");
      // (...)
 }  

Secondly, we try to use something similar in our code, which goes something like this:

#include <fstream>
#include <string>
int main() {
    using namespace std;
    std::string file = "file.txt"; // Or get the name of the file 
                                   // from a function that returns std::string.
    ifstream inf(file);
    // (...)
}

Thirdly, the newbie developer is perplexed by some cryptic compiler error message.

The problem is that the ifstream constructor takes a const char* as its argument.

The solution is to convert the std::string to a const char*.

Now, the real problem is that, for a newbie, "file.txt" or the similar examples given in almost all the tutorials look very much like a std::string.

So, is "my text" a std::string, a c-string or a char*, or does it depend on the context?

Can you provide examples on how "my text" would be interpreted differently according to context?

[Edit: I thought the example above would have made it obvious, but I should have been more explicit nonetheless: what I mean is the type of any string enclosed within double quotes, i.e. "myfilename.txt", not the meaning of the word 'string'.]

Thanks.

augustin
  • For more newbie C++ confusion, the `ifstream` constructor takes a `const char*` (a pointer to const `char`), not a `char* const` (a const pointer to `char`). It's terrible, isn't it? – James McNellis Aug 21 '10 at 01:13
  • The STL library which comes with VS2010 seems to have an interface 'ifstream::ifstream(const string&,....). The OP code worked perfectly fine – Chubsdad Aug 21 '10 at 01:37
  • @chubsdad: That constructor has been added in C++0x, which is still only a draft, but which is partially supported by many implementations. – James McNellis Aug 21 '10 at 01:40

8 Answers

10

So, is "string" a std::string, a c-string or a char*, or does it depend on the context?

  • Neither C nor C++ has a built-in string data type, so any double-quoted string in your code is essentially a const char * (or, to be exact, a const char [] array that converts to that pointer). "C string" usually refers to this: a character array with a null terminator.
  • In C++, std::string is a convenience class that wraps a raw string into an object. By using it, you avoid doing the (messy) pointer arithmetic and memory reallocation yourself.
  • Many standard library functions, especially those inherited from C, still take only char * (or const char *) parameters.
  • You can implicitly convert a const char * into a std::string because the latter has a constructor for that.
  • You must explicitly convert a std::string into a const char * by calling its c_str() method.

Thanks to Clark Gaebel for pointing out constness, and jalf and GMan for mentioning that it is actually an array.
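
The two conversion directions from the list above can be sketched roughly as follows. The function names c_length, cpp_length and demo_conversions are made up for illustration; they only stand in for "an API that takes const char *" and "an API that takes std::string".

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C-style API that only accepts const char*.
std::size_t c_length(const char* s) {
    return std::strlen(s);
}

// Hypothetical stand-in for a C++ API that accepts std::string.
std::size_t cpp_length(const std::string& s) {
    return s.size();
}

std::size_t demo_conversions() {
    const char* literal = "file.txt";        // the array decays to const char*
    std::string file = literal;              // implicit: const char* -> std::string
    std::size_t a = cpp_length("file.txt");  // implicit conversion at the call site
    std::size_t b = c_length(file.c_str());  // explicit: std::string -> const char*
    // c_length(file);                       // would not compile: no implicit conversion
    return a + b;
}
```

The commented-out call is the same situation as the ifstream constructor in the question: a const char * parameter will not accept a std::string unless you call c_str() yourself.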

casablanca
  • Double-quoted strings aren't `char*`, they're `const char*`. Also, you can't convert a `std::string` into a `char*` with `c_str()`, that returns a `const char*` yet again. Sure const-correctness is hard, but that doesn't mean you can ignore it entirely. – Clark Gaebel Aug 21 '10 at 02:55
  • Actually, a string literal is a const char *array*, not a pointer. :) – jalf Aug 21 '10 at 04:08
  • @jalf: It can be both depending on how you look at it, but it's definitely at least a pointer. Every array evaluates to a pointer, and an array has no individual existence except for the fact that an array declaration statically allocates storage. – casablanca Aug 21 '10 at 04:30
  • @casablanca: It's not "at least" a pointer, what does that even mean? Types aren't ranked. Arrays do not "evaluate" to pointers, arrays are arrays and pointers are pointers. Arrays can be implicitly converted to pointers. An array does not statically allocate storage, unless you mean static storage to mean statically sized. (But note static storage has a specific meaning in C++.) And... @Clark: A string literal is an array of const char, with static storage. – GManNickG Aug 21 '10 at 06:32
  • @GMan: I guess I was wrong about the "evaluates to" part, I was only trying to say that since you don't have direct access to the array declaration, it behaves as a pointer for most practical purposes, except for say `sizeof` a string. I've amended the answer anyway. – casablanca Aug 21 '10 at 15:11
7

"myString" is a string literal, and has the type const char[9], an array of 9 constant char. Note that it has enough space for the null terminator. So "Hi" is a const char[3], and so forth.

This is pretty much always true, with no ambiguity. However, whenever necessary, a const char[9] will decay into a const char* that points to its first element. And std::string has an implicit constructor that accepts a const char*. So while it always starts as an array of char, it can become the other types if you need it to.
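
As a rough check of the claims above, the following sketch asks the compiler for the literal's type and then lets it decay. It assumes a C++11 or later compiler (for static_assert and decltype); first_char and as_string are hypothetical helper names.

```cpp
#include <string>
#include <type_traits>

// A string literal is an lvalue of array type, so decltype yields a
// reference to that array: 2 characters plus the null terminator.
static_assert(std::is_same<decltype("Hi"), const char (&)[3]>::value,
              "\"Hi\" should be an array of 3 const char");
static_assert(sizeof("myString") == 9,
              "8 characters plus the null terminator");

// Hypothetical helper: takes a pointer, so an array argument must decay.
const char* first_char(const char* s) {
    return s;
}

std::string as_string() {
    const char* p = first_char("myString");  // const char[9] decays to const char*
    return p;                                // const char* converts to std::string
}
```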

Note that string literals have the unique property that a const char[N] literal can also convert to char*, but this behavior is deprecated (and C++11 removed it altogether). If you try to modify the underlying string this way, you end up with undefined behavior. It's just not a good idea.

Dennis Zickefoose
3

std::string file = "file.txt";

The right-hand side of the = contains a plain string literal (i.e. a null-terminated byte string). Its type is array of const char.

The = is a tricky pony here: no assignment happens. The std::string class has a constructor that takes a pointer to const char as an argument. That constructor is called to create a temporary std::string, which is then used to copy-construct (using the copy ctor of std::string) the object file of type std::string.

The compiler is free to elide the copy ctor and directly instantiate file though.

However, note that std::string is not the same thing as a C-style null-terminated string. In C++03 its internal buffer is not even required to be null-terminated.

ifstream inf("file.txt");

The std::ifstream class has a ctor that takes a const char *, and the string literal passed to it decays to a pointer to the first element of the array.

The thing to remember is this: std::string provides (almost seamless) conversion from C-style strings. You have to look up the signature of the function to see if you are passing in a const char * or a std::string (the latter because of implicit conversions).
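
Applied to the code in the question, a minimal sketch of the fix might look like this. The helper name can_open is invented for the example; the only point is the call to c_str() when the stream is constructed.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: reports whether the named file can be opened for reading.
bool can_open(const std::string& name) {
    // c_str() hands the constructor the const char* it expects; this form
    // also works with compilers that lack the std::string overload.
    std::ifstream inf(name.c_str());
    return inf.is_open();
}
```

Because the parameter is a const std::string&, the function can be called with either a std::string or a literal such as "file.txt"; in the second case a temporary std::string is built from the literal first.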

dirkgently
  • I believe the `std::string` is in fact required to be null-terminated; the implementation bundled with GCC 4.4.1 reads "due to 21.3.4 must be kept null-terminated", and as far as I know that comment has been there for quite some time. – Jon Purdy Aug 21 '10 at 01:25
  • Not entirely correct; it's an initializer, not an assignment operator. There's no way to parse `T a = b` such that it's assignment; `(T a) = b` doesn't work because (T a) isn't an expression, `T (a=b)` doesn't work because (a=b) isn't an identifier. – tc. Aug 21 '10 at 01:29
  • @tc: See second paragraph. I am not sure how else I could have worded `=`. – dirkgently Aug 21 '10 at 01:31
  • According to a slightly dubious copy (http://www.kuzbass.ru:8086/docs/isocpp/lib-strings.html), 21.3.4 requires that (const std::string&)(mystring)[mystring.size()] returns 0 (or rather, charT()); this can be special-cased (it's not required for the non-const operator[]). The data() function is not required to return a null-terminated buffer, and c_str() is not required to be O(1). For convenience, though, GNU libstdc++ just uses a null-terminated buffer. IIRC GCC's memory layout of std::string is also equivalent to char*, which is a neat hack. – tc. Aug 21 '10 at 01:41
  • @dirkgently: I didn't downvote. I was just mentioning something I remembered, then checked against the GCC implementation of the STL. The referenced section apparently pertains to "`basic_string` element access", so it might be implying that `basic_string[s.size()]` should be `T()`, though I'm not at all certain. – Jon Purdy Aug 21 '10 at 01:43
  • @dirkgently: The relevant section reads: "`const_reference operator[](size_type pos) const; reference operator[](size_type pos);` Returns: If `pos < size()`, returns `data()[pos]`. Otherwise, if `pos == size()`, the `const` version returns `charT()`. Otherwise, the behavior is undefined." So it's not required to be null-terminated, but a *reasonable* implementation is likely to be. – Jon Purdy Aug 21 '10 at 01:47
  • [Fixing comment:] Note: It's only the result of `c_str()` that is guaranteed to have a terminating null added at offset `size()` – dirkgently Aug 21 '10 at 02:02
  • @tc: FWIW, C++0X also mandates that the elements of `std::string` be contiguous (since most implementations already do that). – dirkgently Aug 21 '10 at 02:03
  • And that it be null terminated; `data()` and `c_str()` are now just different names for the same function. – Dennis Zickefoose Aug 21 '10 at 03:07
2

So, is "string" a std::string, a c-string or a char*, or does it depend on the context?

It depends entirely on the context. :-) Welcome to C++.

A C string is a null-terminated array of char, which is almost always handled through a char* (or const char*) pointing to its first character.

Depending on the platforms and frameworks you are using, there might be even more meanings of the word "string" (for example, it is also used to refer to QString in Qt or CString in MFC).

James McNellis
1

Neither C nor C++ has a built-in string data type.

When the compiler finds a double-quoted string that is used through a pointer (see the code below), it stores the string itself in the program image and generates a character array for it:

  • The array is created in static storage because it must persist so that it can be referred to later.
  • The array is made constant because it must always contain the original data (Hello).

So in the end, what you have is a const char * pointing to this constant static character array.

const char* v()
{
    const char* text = "Hello";  // points at the literal's static array
    return text;
    // The code above can be reduced to:
    // return "Hello";
}

When the program runs and control enters the function, the pointer text is created on the stack. The constant array of 6 elements (including the null terminator '\0' at the end) already exists in the static memory area for the whole run of the program. The line (const char* text = "Hello";) stores the starting address of that 6-element array in text, and the next line (return text;) returns it. At the closing bracket text disappears from the stack, but the array is still in the static memory area, so the returned pointer stays valid.

Older compilers also accept a non-const char* here. But if you try to change the array through a non-const char*, the behavior is undefined (in practice often a crash at run time) because the array is constant. So it is always better to make the pointer and the return type const, so that the array cannot be reached through a non-const pointer.

But if the double-quoted string is used to initialize an array explicitly, the compiler assumes that the programmer is going to handle the array (smartly). See the following wrong example:

const char* v()
{
    char text[] = "Hello";  // local array, initialized with a copy of the literal
    return text;            // BUG: returns the address of a local array
}

During compilation, the compiler stores the quoted text in the program so that it can fill the array at run time. It also calculates the array size, in this case again 6.

When the program runs and control enters the function, the 6-element array text is created on the stack, and the line (char text[] = "Hello";) initializes it with the text stored in the compiled code. So this array lives on the stack. The line (return text;) returns the starting address of the array. At the closing bracket the array disappears from the stack, so the returned pointer dangles and must not be used.

Many standard library functions still take only char * (or const char *) parameters.

The Standard C++ library has a powerful class called string for manipulating text. Internally, a string stores its text in a character array. The Standard C++ string class is designed to take care of (and hide) all the low-level manipulations of character arrays that were previously required of the C programmer. Note that std::string is a class:

  • You can implicitly convert a const char * into a std::string because the latter has a constructor for that.
  • You can explicitly convert a std::string into a const char * by calling its c_str() method.
Worked Whe
1

The C++ standard library provides a std::string class to manage and represent character sequences. It encapsulates the memory management and is most of the time implemented as a C-string; but that is an implementation detail. It also provides manipulation routines for common tasks.

A std::string always stays a std::string (it doesn't have a conversion operator to char*, for example; that's why you have the c_str() method), but it can be initialized from or assigned a C-string (const char*).

On the other hand, if you have a function that takes a std::string or a const std::string& as a parameter, you can pass a C-string (const char*) to that function and the compiler will construct a temporary std::string for you. That would be a differing interpretation according to context, as you put it.
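
One way to see the "context" effect is a small overload experiment. This is only a sketch with invented names (describe, length_of): when both overloads exist, a literal picks the const char* one, and when only a std::string parameter is available, the literal is converted.

```cpp
#include <cstddef>
#include <string>

// Hypothetical overload set: one function per parameter type.
std::string describe(const char*) { return "const char*"; }
std::string describe(const std::string&) { return "std::string"; }

// Hypothetical function that offers only a std::string parameter.
std::size_t length_of(const std::string& s) { return s.size(); }
```

Calling describe("my text") selects the const char* overload, because array-to-pointer decay is a better match than the user-defined conversion to std::string. Calling length_of("my text") still compiles, since the conversion to std::string is the only viable option there.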

David
0

As often as possible it should mean std::string (or an alternative such as wxString, QString, etc., if you're using a framework that supplies one). Sometimes you have no real choice but to use a NUL-terminated byte sequence, but you generally want to avoid it when possible.

Ultimately, there simply is no clear, unambiguous terminology. Such is life.

Jerry Coffin
  • Well, there actually is clear terminology if you use the full name. I.e. "Standard string" or "double you ex string" or "queue string" or "null terminated character array".... – Billy ONeal Aug 21 '10 at 01:59
0

To use the proper wording (as found in the C++ language standard), a string is one of the varieties of std::basic_string (including std::string) from chapter 21.3 "String classes" (as in C++0x N3092), while the argument of ifstream's constructor is an NTBS (null-terminated byte string).

To quote C++0x N3092, 27.9.1.4/2:

basic_filebuf* open(const char* s, ios_base::openmode mode);

...

opens a file, if possible, whose name is the NTBS s

Cubbi