2

I am curious to know how string allocation in C++ differs from string allocation in Pascal.

How do the strings get allocated?

C++ also has char arrays, char*, and const char*. How do these differ in their allocation and use?

Jack McCall
  • 1
    You can figure out the C++ side from somewhere [here](https://stackoverflow.com/a/388282/3484570). – nwp Jan 22 '18 at 13:22
  • 3
    This question has very well defined narrow answers. It's not broad at all, not to mention "too broad". – Cheers and hth. - Alf Jan 22 '18 at 13:27
  • 3
    I've voted to reopen because I'm sure I'll learn something from this question. – YSC Jan 22 '18 at 13:28
  • @Cheersandhth.-Alf _"Avoid asking multiple distinct questions at once."_ – GrumpyCrouton Jan 22 '18 at 13:28
  • @GrumpyCrouton _avoid_ does not mean _do not_. If this question is imperfect, you can [edit] it to improve it. – YSC Jan 22 '18 at 13:29
  • @YSC No, it means _"keep away from or stop oneself from doing (something)."_, meaning you _shouldn't do it_, otherwise the question is likely to be closed. The question can still be edited regardless of it being on hold or not, once the question is edited to be more on-topic, it will be taken off hold. – GrumpyCrouton Jan 22 '18 at 13:30
  • 6
    @Cheersandhth.-Alf Answering this question would mean you write an article on how C-strings work and how Pascal strings work and the differences between them. There is no bound to such an answer and you can never be done. The question therefore fits the "too broad" close-reason very well. – nwp Jan 22 '18 at 13:32
  • @nwp: No, writing an article is not necessary. If you believe that then you don't have the necessary expertise. That is, arguing from ignorance. – Cheers and hth. - Alf Jan 22 '18 at 13:34
  • Delphi strings are copy on write (COW) with length and ref count as preamble. They could be zero based or 1 based to add to the confusion. – LU RD Jan 22 '18 at 13:34
  • @LURD: Delphi strings are not necessarily Pascal strings. Visit Wikipedia to learn what the latter means. – Cheers and hth. - Alf Jan 22 '18 at 13:35
  • 1
    @Cheersandhth.-Alf, exactly. That is why I stated Delphi strings. – LU RD Jan 22 '18 at 13:36
  • @nwp For what it's worth I agree with you. This is apples and oranges. – Ron Jan 22 '18 at 13:39
  • Wow so much dispute over what was a question intended to further my knowledge – Jack McCall Jan 22 '18 at 15:12
  • Also, while there are multiple distinct questions, they are linked quite closely: 1) the difference between C++ and Pascal string allocation, and 2) how strings are actually allocated. Both questions can be answered together without that much effort. I'm not asking for an essay or a 100,000 word report, merely a basic summary. If none can be given, then I really don't think any derogatory comments are required. – Jack McCall Jan 22 '18 at 15:16
  • 1
    @Cheers: I do understand and know both kinds of strings very well, and that is why I know this question is too broad and off-topic here. I vote to close it. Note that Pascal strings are handled differently in different Pascal implementations, and that e.g. Delphi and FreePascal have several different string implementations (ShortString, AnsiString, UnicodeString, WideString) which are allocated and used differently, so this is an extremely broad topic. C++ also has several string types: the char* type, the std::string and std::basic_string types, etc. Far too broad. – Rudy Velthuis Jan 22 '18 at 18:25
  • @JackMcCall: no, they can not be answered without much effort. The topic is much broader than you think. – Rudy Velthuis Jan 22 '18 at 19:37

2 Answers

3

A string that consists of a length followed by a sequence of character codes is called a Pascal string. It's more descriptively called a length-prefixed string. For example, a string created with the Windows API's SysAllocString function is a length-prefixed string, a.k.a. a Pascal string.
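
As a rough sketch of what "length-prefixed" means in memory, the following models a byte-prefixed string in the style of Turbo Pascal's ShortString. The `ShortString` struct and the helper functions are hypothetical names made up for this illustration; they are not part of any Pascal runtime or of the Windows API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical illustration type: one length byte followed by up to
// 255 character bytes, loosely modelled on a Turbo Pascal ShortString.
struct ShortString {
    unsigned char bytes[256];  // bytes[0] = length, bytes[1..length] = characters
};

// Build a length-prefixed string from a zero-terminated one.
ShortString make_short_string(const char* text) {
    ShortString s{};
    std::size_t n = std::strlen(text);
    if (n > 255) n = 255;                        // a one-byte prefix cannot count higher
    s.bytes[0] = static_cast<unsigned char>(n);  // the length is stored up front
    std::memcpy(s.bytes + 1, text, n);           // characters follow; no terminator is needed
    return s;
}

// Reading the length is a single byte load, with no scan over the characters.
std::size_t length(const ShortString& s) {
    return s.bytes[0];
}

// Convert back to a std::string using the stored length.
std::string to_std_string(const ShortString& s) {
    return std::string(reinterpret_cast<const char*>(s.bytes + 1), length(s));
}
```

With this layout, `make_short_string("Hello")` stores the byte 5 first, followed by the five character codes. Wider prefixes (as in the `SysAllocString` case) follow the same idea with a larger length field.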

A plain C++ string literal, like any C-style string, instead consists of character codes followed by a null value: a zero-terminated string.

As of C++11, std::string has a buffer that can be viewed as a zero-terminated string, but it also has a separate explicit length. It's not specified where either the length or the buffer is stored; this varies between implementations.
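
A small sketch of the difference between the explicit length and the terminator, assuming a C++11 or later compiler. The function names are hypothetical helpers for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Builds "ab\0cd": five characters, one of which is an embedded null.
std::string make_with_embedded_null() {
    return std::string("ab\0cd", 5);  // explicit count, so the literal is not cut off at '\0'
}

// The length as std::string tracks it: the stored, explicit length.
std::size_t stored_length(const std::string& s) {
    return s.size();
}

// The length as a C function sees it: the scan stops at the first null.
std::size_t scanned_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For the string built above, `stored_length` gives 5 while `scanned_length` gives 2, and `c_str()[size()]` is still a null character. The same distinction applies to any string type that stores its length and also keeps a trailing null.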


Storage for a zero-terminated string or Pascal string can be allocated in any way you wish, dynamically or as a local variable.
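
To make the storage options concrete for zero-terminated strings, here is a minimal sketch covering a literal, a local char array, and a dynamically allocated buffer. The function names are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// String literal: static storage that lives for the whole program.
// It must not be modified, hence const char*.
const char* literal_text() {
    return "hello";  // the literal has type const char[6]: five characters plus '\0'
}

// Local char array: automatic storage, a writable copy of the literal.
std::size_t local_array_demo() {
    char local[] = "hello";  // array of 6 chars, initialised from the literal
    local[0] = 'J';          // fine: this modifies our own copy, not the literal
    return sizeof(local);    // the size includes the terminating '\0'
}

// Dynamic storage: the caller decides the size at run time.
// unique_ptr<char[]> releases the buffer with delete[].
std::unique_ptr<char[]> heap_copy(const char* text) {
    std::size_t n = std::strlen(text) + 1;  // +1 for the terminator
    std::unique_ptr<char[]> buf(new char[n]);
    std::memcpy(buf.get(), text, n);
    return buf;
}
```

A `char*` or `const char*` is only a pointer to the first character; it says nothing about which of these kinds of storage it points into, so the code that allocated the buffer remains responsible for its lifetime.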

With a C++ std::string the buffer must in general be dynamically allocated, via the standard allocator that std::string is equipped with, because the string can be arbitrarily large and because there is no way for client code to supply a buffer.

However, unlike std::vector, there are no requirements on std::string that prohibit a fixed-size buffer for small enough strings, and so many (most?) implementations now provide the short string optimization. For a short enough string value, everything can then fit directly within the std::string object, e.g. in a local variable.
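
One way to probe for the short string optimization on a given implementation is to check whether the character buffer lies inside the std::string object itself. This is only a sketch: `buffer_is_inside_object` is a hypothetical helper, and the result for short strings depends on the standard library in use.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Returns true if the character buffer of s lies within the bytes of the
// std::string object itself, i.e. no separate allocation is being used.
bool buffer_is_inside_object(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* buf = s.data();
    std::less<const char*> before;  // std::less gives a total order over pointers
    return !before(buf, obj_begin) && before(buf, obj_end);
}
```

For a string of, say, 1000 characters the function returns false, since that buffer cannot fit in the object. For a value such as `"hi"` it is likely to return true on implementations that provide the optimization, but the standard does not guarantee it.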


C++11 and later place a constant-time requirement on operator[] for std::string, which effectively rules out the COW (copy-on-write) shared-ownership strategy used by some C++03 implementations.

Cheers and hth. - Alf
  • 1
    That is a horrible wiki paragraph, and entirely unsourced. Obviously the author never knew much about pascal strings. – Marco van de Voort Jan 22 '18 at 19:37
  • @MarcovandeVoort: I agree with what you literally write. The author of the Wikipedia para seems to limit the concept to an octet-sized length prefix, as in Turbo Pascal, so I included the Windows `SysAllocString` example as a little counter-weight. There is a not-bad SO answer about Pascal strings (the general concept, not the literal "strings used in Pascal implementations") (https://stackoverflow.com/questions/25068903/what-are-pascal-strings). – Cheers and hth. - Alf Jan 22 '18 at 20:10
  • @MarcovandeVoort: That said, I learned Pascal in 1982, on the HP3000, when with most implementations we used `packed array of char` as strings. However, the second year at college we got to use a DEC Rainbow little workstation with UCSD p-Pascal. As I recall with a `string` type... :-) – Cheers and hth. - Alf Jan 22 '18 at 20:14
  • If you mean the first reply there, I agree for obvious reasons. It is also why the link "UCSD Pascal" in my post links to the article that you named. Btw, how were strings ended in your HP3000 Pascal? Space padding? – Marco van de Voort Jan 22 '18 at 20:14
  • Oh, that's you. Sorry, I didn't notice. Need coffee! – Cheers and hth. - Alf Jan 22 '18 at 20:15
  • @MarcovandeVoort: It must have been, because I don't remember any use of zero-termination or like that. We used mainly two Pascal versions on the HP3000, one that I think was from HP, and the other from Robelle Consulting in Canada. I think. It's long ago. – Cheers and hth. - Alf Jan 23 '18 at 04:22
1

There are multiple implementations of Pascal strings. The Turbo Pascal string is mostly statically allocated, while the string types that are new in Delphi are dynamic. Delphi strings have a null at the end (but are not null-terminated in the C sense, since they can contain null characters); Turbo Pascal strings do not. Delphi has 4 or 5 such types, including the Turbo Pascal one.

However, both adhere to the same rough template that UCSD Pascal (of bytecode interpreter fame) coined.

In a lot of C-centric literature, "Pascal strings" usually refers to one key characteristic: the length of the string is stored, so that retrieving the length or a pointer to the last character is an O(1) operation.

In addition, Delphi and Free Pascal can fully emulate manual C strings, since those are basically a library construct apart from literal assignment.

Marco van de Voort
  • Thanks :) that's given me a good starting point – Jack McCall Jan 22 '18 at 15:15
  • You might want to read this article: ["PChars: no strings attached"](http://rvelthuis.de/articles/articles-pchars.html) too. It is about Delphi's string types and how they relate to C-style "strings", but should be useful for all kinds of Pascal implementation. – Rudy Velthuis Jan 22 '18 at 18:13