7

I'm trying to port my code from MFC's CString to std::string on the Microsoft Windows platform, and I'm curious about something. Take the following example:

CString MakeLowerString(LPCTSTR pStr)
{
    CString strLower = pStr ? pStr : _T("");  // _T("") matches LPCTSTR in both ANSI and Unicode builds
    CharLower(strLower.GetBuffer());        //Use WinAPI
    strLower.ReleaseBuffer();

    return strLower;
}

I use strLower.GetBuffer() to obtain a writable buffer to be passed to the CharLower API. But I don't see a similar method in std::string.

Am I missing something? And if so, how would you rewrite the function above using std::string?

c00000fd
  • 1
    You *don't*. If you need to modify the string you modify the string. You only need the "C buffer" if you're passing the string to an old function taking a constant char pointer. – Some programmer dude Jun 21 '15 at 04:31
  • 1
    @JoachimPileborg: OK, let's pretend that `CharLower` API is not `CharLower` but some arbitrary API that will modify its input buffer that I need to take from `std::string`. How would I do that? That's what I'm asking. – c00000fd Jun 21 '15 at 04:33
  • I'm telling you that you don't *need* the raw buffer, everything you need is already in the string class or the standard library. Take a look at e.g. http://en.cppreference.com/w/cpp and browse around there for a while. – Some programmer dude Jun 21 '15 at 04:47
  • @JoachimPileborg So what you're implying is, you should always start from scratch instead of updating old code...? – user253751 Jun 21 '15 at 07:05
  • @C xx let's pretend Anything. Use stringbuf instead – Ed Heal Jun 21 '15 at 08:18
  • It wasn't mentioned yet explicitly, but `LPCTSTR` is based on `TCHAR`, which maps to either `char` or `wchar_t`. The closest equivalent would therefore be `std::basic_string<TCHAR>` and not `std::basic_string<char>` (a.k.a. `std::string`). Consider using `std::wstring` instead, which often makes more sense in our globalized world and integrates better with MS Windows, which uses `wchar_t` internally with UTF-16 encoding, too. – Ulrich Eckhardt Jun 21 '15 at 09:14
  • 1
    What I'm saying is that just about all the code you need to modify a `std::string` object already exists, either in the [`std::string`](http://en.cppreference.com/w/cpp/string/basic_string) class itself, or in other functions in the standard library. You don't have to rewrite or start anything from scratch at all, since the code is already there for you to use. Case conversions, modifying substrings, appending, prepending, insertion, it's all there for you. – Some programmer dude Jun 21 '15 at 09:24
  • @JoachimPileborg: ... except what I need in my current question. (And that's the first thing I dug into.) Another issue that just came up here: http://stackoverflow.com/a/30961251/843732 is Unicode symbols and converting non-English characters to lower/upper case. And the example used there was the mostly recommended for STL on the web. – c00000fd Jun 21 '15 at 10:13

4 Answers

5

At my new job we do not use MFC, but luckily we do have the standard library and C++11, so I ran into the same question as c00000fd. Thanks to BitTickler's answer I came up with the idea of using the string's internal buffer for Win32 APIs via &s[0] or &s.front().

Using a Win32-API function that shrinks the internal string buffer

Assuming that you have a string that a Win32-API function will shorten - e.g. ::PathRemoveFileSpec(path) - you may follow this approach:

std::string path( R"(C:\TESTING\toBeCutOff)" );
::PathRemoveFileSpecA( &path.front() ); // Using the Win32-API (ANSI variant)
                                        // and the string's internal buffer
path.resize( strlen( path.data() ) );   // adjust the string's length
                                        // to the first \0 character
path.shrink_to_fit();                   // optional to adjust the string's
                                        // capacity - useful if you
                                        // do not plan to modify the string again

Unicode Version:

std::wstring path( LR"(C:\TESTING\toBeCutOff)" );
::PathRemoveFileSpecW( &path.front() ); // Using the Win32-API (wide variant)
                                        // and the string's internal buffer
path.resize( wcslen( path.data() ) );   // adjust the string's length
                                        // to the first \0 character
path.shrink_to_fit();                   // optional to adjust the string's
                                        // capacity - useful if you
                                        // do not plan to modify the string again
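
The shrinking pattern can also be tried without the Windows headers. The following is a minimal sketch, not the answer's own code: remove_file_spec is a hypothetical stand-in for an in-place C API such as PathRemoveFileSpec, and strip_file_spec is an illustrative wrapper name. It relies on the C++11 guarantee that a std::string's storage is contiguous and null-terminated.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for an in-place C API such as PathRemoveFileSpec:
// cuts the buffer at the last backslash by writing a '\0' there.
void remove_file_spec(char* path)
{
    if (char* slash = std::strrchr(path, '\\'))
        *slash = '\0';
}

// Runs the in-place API on the string's own buffer, then trims the
// string to the terminator the API wrote.
std::string strip_file_spec(std::string path)
{
    if (path.empty())
        return path;                         // nothing to shorten
    remove_file_spec(&path[0]);              // writable pointer into the string
    path.resize(std::strlen(path.c_str())); // length up to the first '\0'
    return path;
}
```

Without the resize() call the string would keep its old size and carry the characters after the embedded '\0' along with it.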

Using a Win32-API function that extends the internal string buffer

Assuming that you have a string that a Win32-API function will extend or fill - e.g. ::GetModuleFileName(NULL, path, cPath) to retrieve your executable's path - you may follow this approach:

std::string path;
path.resize(MAX_PATH);                 // adjust the internal buffer's size
                                       // to the expected (max) size of the
                                       // output-buffer of the Win32-API function
::GetModuleFileNameA( NULL, &path.front(), static_cast<DWORD>( path.size() ) );
                                       // Using the Win32-API (ANSI variant)
                                       // and the string's internal buffer
path.resize( strlen( path.data() ) );  // adjust the string's length
                                       // to the first \0 character
path.shrink_to_fit();                  // optional to adjust the string's
                                       // capacity - useful if you
                                       // do not plan to modify the string again

Unicode Version:

std::wstring path;
path.resize(MAX_PATH);                 // adjust the internal buffer's size
                                       // to the expected (max) size of the
                                       // output-buffer of the Win32-API function
::GetModuleFileNameW( NULL, &path.front(), static_cast<DWORD>( path.size() ) );
                                       // Using the Win32-API (wide variant)
                                       // and the string's internal buffer
path.resize( wcslen( path.data() ) );  // adjust the string's length
                                       // to the first \0 character
path.shrink_to_fit();                  // optional to adjust the string's
                                       // capacity - useful if you
                                       // do not plan to modify the string again
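
When the required size is not known in advance, one possible variation is to use the length the API reports and to retry with a larger buffer if the output may have been truncated. This is only a sketch under assumptions: get_name is a hypothetical stand-in loosely modeled on GetModuleFileName (it copies at most cap characters and returns the number copied), and read_name is an illustrative name.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for an API like GetModuleFileName: copies at most
// `cap` characters of a fixed name into `buf` (no terminator is written)
// and returns the number of characters copied.
std::size_t get_name(char* buf, std::size_t cap)
{
    static const char name[] = "C:\\Program Files\\Example\\app.exe";
    const std::size_t n = std::min(cap, sizeof(name) - 1);
    std::memcpy(buf, name, n);
    return n;
}

// Grows the string until the API reports fewer characters than the
// buffer holds, then trims the string to the reported length.
std::string read_name(std::size_t initial = 8)
{
    std::string out(std::max<std::size_t>(initial, 1), '\0');
    for (;;) {
        const std::size_t n = get_name(&out[0], out.size());
        if (n < out.size()) {          // everything fit
            out.resize(n);             // explicit length, no strlen needed
            return out;
        }
        out.resize(out.size() * 2);    // possibly truncated: retry with more room
    }
}
```

Because the length comes from the return value, this also covers APIs that do not null-terminate their output.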

If you include the final shrink_to_fit(), extending the string's internal buffer takes just one more line of code than the MFC alternative; shrinking the string has nearly the same overhead.

The advantage of the std::string approach over the CString approach is that you do not have to declare an additional C-string pointer variable; you work only with official std::string methods and one strlen/wcslen call. The approach shown above only works when the buffer written by the Win32-API is null-terminated. In the special case where the Win32-API returns an unterminated string, you must - similar to the CString::ReleaseBuffer method - know the new string/buffer length and specify it explicitly with path.resize( newLength ), just like path.ReleaseBuffer( newLength ) for the CString alternative.

Peter Rawytsch
2
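#include <cstring>
#include <iostream>
#include <string>

// Legacy-style function that writes into a caller-supplied buffer.
void GetString(char * s, size_t capacity)
{
    if (nullptr != s && capacity > 5)
    {
        strcpy_s(s, capacity, "Hello");   // MSVC secure CRT
    }
}

void FooBar()
{
    std::string ss;
    ss.resize(6);                         // room for "Hello" plus the terminator
    GetString(&ss[0], ss.size());         // &ss[0] is a writable char *
    std::cout << "The message is:" << ss.c_str() << std::endl;
}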

As you can see, you can use the "old school C pointer" both for feeding strings into a legacy function and as an OUT parameter. Of course, you need to make sure there is enough capacity in the string for it to work.
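
Applied to the function from the question, the same technique might look like the sketch below. It is written so that it compiles without the Windows headers: char_lower is a hypothetical stand-in for CharLowerW (built here on std::towlower), and the wchar_t signature assumes a Unicode build.

```cpp
#include <cwctype>
#include <string>

// Hypothetical stand-in for CharLowerW: lowercases a null-terminated
// wide string in place and returns the same pointer.
wchar_t* char_lower(wchar_t* s)
{
    for (wchar_t* p = s; *p != L'\0'; ++p)
        *p = static_cast<wchar_t>(std::towlower(static_cast<std::wint_t>(*p)));
    return s;
}

// std::wstring port of MakeLowerString. No GetBuffer/ReleaseBuffer pair
// is needed because the call does not change the string's length.
std::wstring MakeLowerString(const wchar_t* pStr)
{
    std::wstring strLower = pStr ? pStr : L"";
    if (!strLower.empty())
        char_lower(&strLower[0]);   // writable pointer to the string's own storage
    return strLower;
}
```

On Windows the call to char_lower would be replaced by the real API; the surrounding code stays the same.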

BitTickler
  • Is it guaranteed to be possible, or is it a quirk of your standard library implementation? – user253751 Jun 21 '15 at 07:06
  • You seem to use "capacity" and "size" interchangeably; be careful with that, since in the C++ library "capacity" and "size" mean two different things. – Some programmer dude Jun 21 '15 at 09:30
  • Yes, I could have written the capacity / size thing better. From the point of view of function GetString(), the size of the std::string on caller side is the capacity. ;) – BitTickler Jun 22 '15 at 00:21
  • @immibis Yes, the &s[0] existed even in the old days before std::string defined members such as ``data()`` and always worked this way. – BitTickler Jun 22 '15 at 00:23
  • @BitTickler in C and C++, "it always worked that way" isn't sufficient. – user253751 Jun 22 '15 at 02:47
  • @immibis Who said in C? I did not. It always worked like this in even the oldest of STL std::string implementations and always will. And to confirm all you have to do is look up the implementation of operator&. – BitTickler Jun 22 '15 at 16:07
  • @BitTickler My programs have always either crashed or returned garbage if I dereference a NULL pointer. Doesn't mean you're guaranteed to get either a crash or garbage. – user253751 Jun 23 '15 at 01:57
  • @immibis If you don't trust your code and the implementation of ``std::string::operator&``, then don't use it ;) – BitTickler Jun 23 '15 at 15:29
  • @BitTickler Have you heard of undefined behaviour, by the way? Some things that work aren't required to work, and might stop working the day after your code goes into production, and then you'll be in trouble. – user253751 Jun 24 '15 at 01:31
  • @immibis tip: don't ask questions here if you know it all better. The &s[0] is standard std::string behavior since eternity not something cryptic. It is nothing out of the ordinary but **the normal way to do it** – BitTickler Jun 26 '15 at 14:53
1

Depending on your requirements, you can use one or more of the following:

  1. std::string::operator[](). This function returns a reference to the character at a given index, without bounds checking.

  2. std::string::at(). This function returns a reference to the character at a given index, with bounds checking.

  3. std::string::data(). This function returns a const pointer to the raw data.

  4. std::string::c_str(). This function returns the same value as std::string::data().
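
A short sketch of how these accessors behave; the helper names are made up for illustration. It assumes C++11 or later, where data() and c_str() are equivalent.

```cpp
#include <stdexcept>
#include <string>

// Returns true if at() rejects an out-of-range index by throwing.
bool at_is_checked(const std::string& s)
{
    try {
        (void)s.at(s.size());      // one past the last character: throws
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// operator[] and at() return references, so on a non-const string
// they can be used to modify characters in place.
std::string capitalize_first(std::string s)
{
    if (!s.empty() && s[0] >= 'a' && s[0] <= 'z')
        s[0] = static_cast<char>(s[0] - 'a' + 'A');
    return s;
}

// Since C++11, data() and c_str() return the same null-terminated pointer.
bool same_pointer(const std::string& s)
{
    return s.data() == s.c_str();
}
```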

R Sahu
  • 1
    Thanks. Yes, I see that I can get a `const` pointer to the string, but I can't modify it, can I? Otherwise it wouldn't be a `const`, would it? – c00000fd Jun 21 '15 at 04:40
  • That's right, you can't modify the string through the pointers returned by `data()` and `c_str()`. – R Sahu Jun 21 '15 at 04:42
  • 2
    Hmm. Interesting. People were telling me that `std::string` is much better than MFC's `CString`. I'm not sure if it's true after not being able to do the simplest operation. – c00000fd Jun 21 '15 at 04:44
  • If getting a writeable pointer to the raw data is the only criterion to judge the two, then `std::string` is not as good. However, if you elevate your requirements a bit higher and still can't do the things with `std::string` that you can do with `CString`, then we are talking about real problems. Remember that you can access the contents of the raw data using the first two functions. – R Sahu Jun 21 '15 at 04:48
  • 2
    You see I'm modifying the existing code. So I'm somewhat limited to the constraints of what was already written up. – c00000fd Jun 21 '15 at 04:50
-1

To lowercase a std::string containing only ASCII characters, you can use this code:

#include <algorithm>
#include <string> 

std::string data = "Abc"; 
std::transform(data.begin(), data.end(), data.begin(), ::tolower);

You really can't get around iterating through each character. The original Windows API call would be doing the same character iteration internally.
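
A variant of the ASCII-only conversion, as a sketch: std::tolower requires its argument to be representable as unsigned char (or to be EOF), so the lambda below takes each character as unsigned char before calling it. The function name to_lower_ascii is made up for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Lowercases the ASCII letters of s; other bytes are handled
// according to the current C locale.
std::string to_lower_ascii(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}
```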

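If you need tolower() for a locale other than the standard "C" locale, you can instead use the code below. Note that it still converts one char at a time, so it does not handle characters that are encoded as multiple bytes (e.g. in UTF-8):

#include <locale>

std::string str = "Locale-specific string";
std::locale loc("en_US.UTF8");  // desired locale goes here; throws std::runtime_error if unavailable
const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
// std::bind1st and std::mem_fun were removed in C++17, so a lambda is used instead
std::transform(str.begin(), str.end(), str.begin(), [&ct](char c) { return ct.tolower(c); });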

To answer your question directly and minus any context, you can call str.c_str() to get a const char * (LPCSTR) from a std::string. You cannot directly convert a std::string to a char * (LPSTR); this is by design and would undermine some of the very motivations for using std::string.

Special Sauce
  • 2
    Please don't take it off track. I'm not asking about converting to lower case. It's just an example that I quickly put together. I'm asking about getting a writable buffer from std::string. – c00000fd Jun 21 '15 at 04:26
  • @c00000fd This is, I believe, just as much of an example as the code in your question. The string class and the standard library have everything you need really. – Some programmer dude Jun 21 '15 at 04:34
  • Yeah, I can get `const char *` but how do I get `char *` out of it? Do i need to allocate it myself and copy my string there. Wouldn't it be a waste of CPU cycles? – c00000fd Jun 21 '15 at 04:41
  • @c00000fd Only constant buffer is permitted for the data contained in `std::string`. Modifying this buffer would lead to undefined behavior in the `std::string`. All this is by design. If the features of `std::string` are not important to you, and the limitations are overly limiting for you, then you should not use `std::string`. – Special Sauce Jun 21 '15 at 04:52
  • @SpecialSauce: BTW, since you brought up that "better" way of lowering the string case. Just to let you know. I just tried your suggested method with `std::wstring` and a string of non-English characters and it failed to change the case for those. So please be aware of that. – c00000fd Jun 21 '15 at 04:59
  • @c00000fd I never claimed `std::string` was better; it all depends on your use cases and development priorities -_- But good point on the lack of Unicode support. Note you can get locale-specific `toLower()` binding using the technique in this answer: http://stackoverflow.com/a/1724567/1911540 – Special Sauce Jun 21 '15 at 05:06
  • 1
    Using `std::transform` as in your first example is wrong, because `::tolower` doesn't take a `char` but an `int`, and a `char` **must** first be converted to `unsigned char`. The version using C++ locales should be correct though. – Ulrich Eckhardt Jun 21 '15 at 09:09
  • @UlrichEckhardt The code as stated above is time-tested and works great. Remember that `std::string` is essentially just array of bytes internally. You can prove it for yourself here in C++14 code: https://ideone.com/fWKH3l – Special Sauce Jun 21 '15 at 09:15
  • I don't understand what you want to express with "`std::string` is essentially just array of bytes". That said, "works great" is one of the ugly faces of undefined behaviour. – Ulrich Eckhardt Jun 21 '15 at 10:04
  • @Special Sauce. Ulrich Eckhardt is right. Try replacing the A in your string with À (U+00C0) to see it fail. Your code works only for char codes < 128. – zzz Apr 13 '23 at 23:42