135

What is the difference between LPCSTR, LPCTSTR and LPTSTR?

Why do we need to do this to assign a string to the pszText member of an LV_ITEM structure:

LV_DISPINFO dispinfo;  
dispinfo.item.pszText = LPTSTR((LPCTSTR)string);
John Sibly
nothingMaster

6 Answers

147

To answer the first part of your question:

LPCSTR is a pointer to a const char string (LP means Long Pointer)

LPCTSTR is a pointer to a const TCHAR string, (TCHAR being either a wide char or char depending on whether UNICODE is defined in your project)

LPTSTR is a pointer to a (non-const) TCHAR string

In practice when talking about these in the past, we've left out the "pointer to a" phrase for simplicity, but as mentioned by lightness-races-in-orbit they are all pointers.
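As a rough illustration of the three definitions above, here is a minimal sketch using stand-in typedefs. In a real Windows build these names come from `<windows.h>`; the `#define UNICODE` line and the helper `tlength` are assumptions made only so the sketch compiles on its own.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in definitions; a real Windows build gets these from <windows.h>.
// Remove the next line to model an ANSI (non-UNICODE) build.
#define UNICODE

#ifdef UNICODE
typedef wchar_t TCHAR;
#else
typedef char TCHAR;
#endif

typedef const char*  LPCSTR;   // pointer to a const char string
typedef const TCHAR* LPCTSTR;  // pointer to a const TCHAR string
typedef TCHAR*       LPTSTR;   // pointer to a mutable TCHAR string

// Hypothetical helper: counts characters up to the terminator.
// It only reads, so it takes the read-only pointer type.
inline int tlength(LPCTSTR s) {
    int n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// With UNICODE defined, the T types resolve to wchar_t pointers.
static_assert(std::is_same<LPCTSTR, const wchar_t*>::value, "LPCTSTR is const wchar_t* here");
static_assert(std::is_same<LPTSTR, wchar_t*>::value, "LPTSTR is wchar_t* here");
// LPCSTR does not depend on UNICODE.
static_assert(std::is_same<LPCSTR, const char*>::value, "LPCSTR is always const char*");
```

An LPTSTR converts implicitly to an LPCTSTR (adding const is always safe), which is why `tlength` can be called with either. The reverse direction needs a cast.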

This is a great CodeProject article describing C++ strings (see the chart comparing the different types, about two-thirds of the way down)

John Sibly
  • I quickly scanned that article - seems great, adding it to my bookmarks and will read it as soon as I have time. – nothingMaster Nov 26 '08 at 17:26
  • 12
    @LightnessRacesinOrbit You are technically correct - although in my experience it is common practice to leave out the "pointer to a...." description for brevity when referring to string types in C++ – John Sibly Jun 04 '15 at 09:15
  • 3
    @JohnSibly: In C, yes. In C++, it absolutely shouldn't be!! – Lightness Races in Orbit Jun 04 '15 at 11:47
  • 4
    Notice that that codeproject article was written 15 years ago and, unless it gets updated, contains misleading assumptions about Unicode characters always being 2 bytes. That's entirely wrong. Even UTF16 is variable length... it is much better to say that wide characters are UCS-2 encoded, and that "Unicode" in this context refers to UCS-2. – u8it Oct 13 '17 at 19:45
  • It's a mess, Unicode characters were originally meant to be two bytes, but that turned out not to be enough. So UTF-16 was designed to shoehorn modern unicode into systems that were originally designed for 16 bit unicode. On modern windows a "wide string" is actually a sequence of UTF-16 code units. – plugwash Feb 01 '19 at 20:05
  • Of course the characters outside the basic multilingual plane are pretty rare, so most of the time you can get away with ignoring this detail. – plugwash Feb 01 '19 at 20:06
  • 1
    Hmm... in this case, @LightnessRacesinOrbit, I would add an addendum that it's okay to leave out the "pointer to a..." when referring to C-strings in C++, if-and-only-if referring specifically to (decayed) string literals, or when interfacing/working with code that's either written in C, relies on C types instead of C++ types, and/or has C linkage via `extern "C"`. Apart from that, yeah, it definitely should need either the "pointer" bit, or specific description as a C string. – Justin Time - Reinstate Monica Sep 11 '19 at 18:39
98

Quick and dirty:

LP = Long Pointer. Just think pointer or char*

C = Const. Here it means the characters pointed to are const, not the pointer itself.

STR is string

T is for TCHAR: a wide character or a char, depending on compiler options.
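The distinction between "the characters are const" and "the pointer is const" can be checked with a small sketch. The typedef names `PtrToConstChar` and `ConstPtrToChar` and the function `pick` are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <type_traits>

typedef const char* PtrToConstChar;  // same shape as LPCSTR: read-only characters
typedef char* const ConstPtrToChar;  // fixed pointer, writable characters (not what LPCSTR is)

// For the LPCSTR shape, the pointee is const and the pointer is not.
static_assert(std::is_const<std::remove_pointer<PtrToConstChar>::type>::value,
              "characters are read-only");
static_assert(!std::is_const<PtrToConstChar>::value,
              "the pointer itself can be re-aimed");

// For the other shape it is the opposite.
static_assert(std::is_const<ConstPtrToChar>::value,
              "the pointer itself is fixed");
static_assert(!std::is_const<std::remove_pointer<ConstPtrToChar>::type>::value,
              "characters are writable");

// Re-aiming a pointer-to-const is allowed; writing through it is not.
inline const char* pick(bool first) {
    PtrToConstChar p = "one";
    if (!first) p = "two";  // OK: only the characters are const
    // p[0] = 'x';          // would not compile: the characters are const
    return p;
}
```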

Bonus Reading

From What does the letter "T" in LPTSTR stand for?: archive

What does the letter "T" in LPTSTR stand for?

October 17th, 2006

The “T” in LPTSTR comes from the “T” in TCHAR. I don’t know for certain, but it seems pretty likely that it stands for “text”. By comparison, the “W” in WCHAR probably comes from the C language standard, where it stands for “wide”.

Ian Boyd
Tim
63

8-bit AnsiStrings

  • char: 8-bit character (underlying C/C++ data type)
  • CHAR: alias of char (Windows data type)
  • LPSTR: null-terminated string of CHAR (Long Pointer)
  • LPCSTR: constant null-terminated string of CHAR (Long Pointer Constant)

16-bit UnicodeStrings

  • wchar_t: 16-bit character on Windows (underlying C/C++ data type)
  • WCHAR: alias of wchar_t (Windows data type)
  • LPWSTR: null-terminated string of WCHAR (Long Pointer)
  • LPCWSTR: constant null-terminated string of WCHAR (Long Pointer Constant)

depending on UNICODE define

  • TCHAR: alias of WCHAR if UNICODE is defined; otherwise CHAR
  • LPTSTR: null-terminated string of TCHAR (Long Pointer)
  • LPCTSTR: constant null-terminated string of TCHAR (Long Pointer Constant)

So:

| Item | 8-bit (Ansi) | 16-bit (Wide) | Varies |
|---|---|---|---|
| character | CHAR | WCHAR | TCHAR |
| string | LPSTR | LPWSTR | LPTSTR |
| string (const) | LPCSTR | LPCWSTR | LPCTSTR |
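One caveat on the "16-bit (Wide)" column, as the comments on this page point out: each 16-bit element is a UTF-16 code unit, not necessarily a whole character. A minimal sketch of the difference, using `char16_t` as a portable stand-in for WCHAR (an assumption; the helper names `code_units` and `code_points` are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of 16-bit code units, which is what length functions on wide strings count.
inline std::size_t code_units(const std::u16string& s) {
    return s.size();
}

// Number of Unicode code points, assuming well-formed UTF-16:
// every unit counts except the low (trailing) half of a surrogate pair.
inline std::size_t code_points(const std::u16string& s) {
    std::size_t n = 0;
    for (char16_t u : s) {
        if (u < 0xDC00 || u > 0xDFFF) ++n;  // not a low surrogate
    }
    return n;
}
```

For text inside the Basic Multilingual Plane the two counts agree, which is why the distinction can often be ignored. A character outside it, such as U+1F600, occupies two code units.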

Bonus Reading

TCHAR → Text Char (archive.is)


Why is the default 8-bit codepage called "ANSI"?

From Unicode and Windows XP
by Cathy Wissink
Program Manager, Windows Globalization
Microsoft Corporation
May 2002

Despite the underlying Unicode support on Windows NT 3.1, code page support continued to be necessary for many of the higher-level applications and components included in the system, explaining the pervasive use of the “A” [ANSI] versions of the Win32 APIs rather than the “W” [“wide” or Unicode] versions. (The term “ANSI” as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community. The source of this comes from the fact that the Windows code page 1252 was originally based on an ANSI draft, which became ISO Standard 8859-1. However, in adding code points to the range reserved for control codes in the ISO standard, the Windows code page 1252 and subsequent Windows code pages originally based on the ISO 8859-x series deviated from ISO. To this day, it is not uncommon to have the development community, both within and outside of Microsoft, confuse the 8859-1 code page with Windows 1252, as well as see “ANSI” or “A” used to signify Windows code page support.)

Ian Boyd
  • 7
    Shame this answer will never make it to the top because it's so new.. that's really something SO needs to fix. This is the best answer by far. – Dan Bechard Apr 04 '18 at 07:23
  • 1
    This really helps me a lot while I am doing Unicode project at the work. Thanks! – Yoon5oo Jun 27 '18 at 12:41
  • 3
    Nice answer. I think it's worth adding that the unicode version uses UTF16, so each 16-bit chunk is not a character but a code-unit. The names are historical (when Unicode === UCS2). – Margaret Bloom Jan 29 '19 at 13:00
7

Adding to John and Tim's answer.

Unless you are coding for Win98, there are only two of the 6+ string types you should be using in your application:

  • LPWSTR
  • LPCWSTR

The rest are meant to support ANSI platforms or dual compilations. Those are not as relevant today as they used to be.

Ajay
JaredPar
  • 2
    @BlueRaja, I was mainly referring to C based strings in my answer. But for C++ I would avoid `std::string` because it is still an ASCII based string and prefer `std::wstring` instead. – JaredPar May 12 '10 at 17:25
  • 1
    You should be using LPTSTR and LPCTSTR unless you are calling the ASCII (*A) or widechar (*W) versions of functions directly. They are aliases of whatever character width you specify when you compile. – osvein Jun 28 '17 at 14:59
  • 1
    ...And now that Microsoft is working on making the `*A` versions of WinAPI compatible with the UTF-8 code page, they're suddenly a lot more relevant. ;P – Justin Time - Reinstate Monica Sep 12 '19 at 21:04
  • In retrospect, it is now evident wchar_t was a mistake. MS should have gone with UTF-8. That's what most of the World is doing. Qt solves this beautifully with QString. – Pierre Jan 02 '22 at 18:36
5

To answer the second part of your question, you need to do things like

LV_DISPINFO dispinfo;  
dispinfo.item.pszText = LPTSTR((LPCTSTR)string);

because MS's LVITEM struct has an LPTSTR, i.e. a mutable T-string pointer, not an LPCTSTR. What you are doing is

1) convert string (a CString at a guess) into an LPCTSTR (which in practice means getting the address of its character buffer as a read-only pointer)

2) convert that read-only pointer into a writeable pointer by casting away its const-ness.

Whether your ListView call might end up trying to write through that pszText depends on what dispinfo is used for. If it does, this is a potentially very bad thing: after all you were given a read-only pointer and then decided to treat it as writeable. Maybe there is a reason it was read-only!

If it is a CString you are working with you have the option to use string.GetBuffer() -- that deliberately gives you a writeable LPTSTR. You then have to remember to call ReleaseBuffer() if the string does get changed. Or you can allocate a local temporary buffer and copy the string into there.

99% of the time this will be unnecessary and treating the LPCTSTR as an LPTSTR will work... but one day, when you least expect it...
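For the "local temporary buffer" option, a minimal sketch might look like the following. `ItemStandIn` is a hypothetical stand-in for the struct that holds pszText (modelled as in an ANSI build, where LPTSTR is `char*`), and `make_writable_copy` is an illustrative helper, not part of any Windows API.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the struct holding pszText.
struct ItemStandIn {
    char* pszText;  // mutable pointer, like LPTSTR in an ANSI build
};

// Copies the read-only text, including its terminator, into a buffer
// that is safe to write through.
inline std::vector<char> make_writable_copy(const char* text) {
    return std::vector<char>(text, text + std::strlen(text) + 1);
}
```

The struct then points at `buffer.data()` instead of at the original string, so any write through pszText touches only the copy. The buffer has to stay alive for as long as the struct is in use.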

AAT
  • 1
    You should avoid C style cast and use `xxx_cast<>()` instead. – harper Jun 27 '18 at 13:32
  • @harper You are quite right -- but I was quoting the OP, that is the code he was asking about. If I'd written the code myself it would certainly have used `xxx_cast<>` rather than mixing two different bracket-based casting styles! – AAT Oct 12 '18 at 12:59
0

The short answer to the 2nd part of the question is simply that the CString class, by design, doesn't provide a direct conversion to LPTSTR, and what you are doing is kind of a cheat.

A longer answer is the following:

The reason you can typecast a CString to LPCTSTR is that CString provides a conversion operator for it (operator LPCTSTR). By design it converts only to an LPCTSTR pointer, so the string value can't be modified through this pointer.

In other words, it simply doesn't provide a conversion operator to LPTSTR, for the same reason as above. They don't want to allow altering the string value this way.
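The idea of a class that converts implicitly to a read-only pointer, but not to a writable one, can be sketched with a small hypothetical class. `MiniString` is not CString; it imitates only this one conversion, using `char` rather than TCHAR to stay portable.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical minimal class imitating only the conversion behaviour.
class MiniString {
public:
    explicit MiniString(const char* s) : data_(s) {}

    // Implicit conversion to a read-only pointer, in the spirit of
    // CString's conversion to LPCTSTR.
    operator const char*() const { return data_.c_str(); }

    // Deliberately no conversion to char*.
private:
    std::string data_;
};

// The read-only conversion exists; the writable one does not.
static_assert(std::is_convertible<MiniString, const char*>::value,
              "converts to a pointer to const characters");
static_assert(!std::is_convertible<MiniString, char*>::value,
              "does not convert to a pointer to mutable characters");
```

Getting a `char*` out of such a class therefore takes two steps: the conversion operator first, then an explicit cast that removes the const.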

So essentially, the trick is to use the operator CString provides and get this:

LPCTSTR lpctstr = (LPCTSTR) string; // CString provides this conversion operator

Now the LPCTSTR can be further typecast to LPTSTR :)

dispinfo.item.pszText = LPTSTR(lpctstr); // accomplish the cheat :P

The correct way to get an LPTSTR from a CString is this though (complete example):

CString str = _T("Hello");
LPTSTR lpstr = str.GetBuffer(str.GetAllocLength()); // writable pointer to the internal buffer
// ... read or modify the characters through lpstr here ...
str.ReleaseBuffer(); // you must call this function if you change the string above with the pointer

This works because GetBuffer() returns an LPTSTR, so you are allowed to modify the string through it :)

zar