
Delphi XE4 introduced a new string type for the iOS platform: immutable, zero-based strings. Until now Delphi has had mutable, copy-on-write strings. So the question is, what does that mean for my future programming? Are there any advantages of one string type over the other? What pitfalls do I need to watch for when switching to the new string type (other than the obvious 0 vs 1 base)?

iamjoosy
  • Note that it is a compiler switch if the strings are zero-based. Certainly not a good idea to mix both ways but at least you can decide yourself when to do the transition. – jpfollenius Apr 25 '13 at 10:40
  • We are supposed to use `TStringBuilder` class for string manipulation and `TStringHelper` for general string handling. – LU RD Apr 25 '13 at 10:41
  • The `String` type itself is actually the same as before, it is only the `[]` operator that is affected. Immutability simply means the compiler does not allow a `Char` to be assigned to the operator, and the `{$ZEROBASEDSTRINGS}` directive only affects how the compiler interprets indexes that are passed to the operator. The `String` type itself was not redesigned. – Remy Lebeau Apr 25 '13 at 15:40
  • @Remy The iOS compiler appears to accept assignment using `[]` operator if Marco's whitepaper is to be believed. – David Heffernan Apr 25 '13 at 16:03

1 Answer


According to Marco Cantù's whitepaper, the string data type in the XE4 iOS target is not in fact immutable, although he seems to contradict himself.

He says:

In the new Delphi LLVM-based compiler, there is one string type, representing Unicode strings (UTF16), and mapped to the current string type in Delphi XE3 (an alias for the UnicodeString type on the Windows compiler). However, this new string type uses a different memory management model. The string type is still reference counted, but it is immutable, which means you cannot modify the string contents once it is constructed.

But he then goes on to say:

In other words strings are now Unicode-based, soon-to-become immutable, and reference-counted.

And also:

Where things start to change, however, is when you modify an existing string, not by replacing it with a new value (in which case you get a brand new string) but when you modify one of its elements, as shown in this line of code (and also in the previous section, where I introduced the topic):

Str1 [3] := 'x';

All Delphi compilers use a copy-on-write semantics: If the string you modify has more than one reference, it is first copied (adjusting the reference counts of the various strings involved as required) and later modified.

The new compiler does something very similar to the classic one. It implements a copy-on-write mechanism, unless there is a single reference to the string, in which case the string gets modified in place. As an example, consider the following code, which outputs the in-memory location of the actual string.

And he then shows a picture of an iOS device with mutating strings.
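
The copy-on-write behaviour described in the quotes above can be observed with a small console program. This is a minimal sketch for the traditional desktop compilers, not the code from the whitepaper; the program name and variable names are made up for illustration, and it assumes the default one-based indexing of those compilers.

```delphi
program CowSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  S1, S2: string;
begin
  // Build the string at run time so that it lives on the heap
  // and is reference counted (a literal would not be).
  S1 := StringOfChar('a', 5);

  // Assignment only copies the reference; both variables share one buffer.
  S2 := S1;
  Writeln('Shared before write: ', Pointer(S1) = Pointer(S2));

  // Writing through [] triggers copy-on-write because the buffer has
  // two references: S2 gets its own copy, which is then modified.
  // (Assumption: one-based indexing, the desktop default.)
  S2[3] := 'x';

  Writeln('Shared after write:  ', Pointer(S1) = Pointer(S2));
  Writeln('S1 = ', S1); // S1 is left unchanged by the write to S2
  Writeln('S2 = ', S2);
end.
```

If `S2 := S1` is removed, the string has a single reference and the write happens in place, which is the case the whitepaper says the new compiler also handles without a copy.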

And in the official documentation we have:

Strings are immutable (constant), so you cannot index into a string as an array and manipulate the characters in a string. If you attempt to modify a string, the Delphi mobile compilers might emit the message W1068 Modifying strings in place may not be supported in the future (Delphi). You can specify whether the message x1068 is emitted as a warning or an error. In the Hints and Warnings page, set the warning "Modifying strings in-place...." to "true" or "error".

So I interpret all that as meaning that the XE4 release of the iOS compiler still has mutable strings. The developers really don't want you to mutate your strings any more and are telling you that strings are immutable on the mobile compilers. But they do appear still to be mutable. Go figure!


However, you have been served notice that in a future release, the string may become immutable.

You can prepare for that future release now by setting

{$WARN IMMUTABLE_STRINGS WARN}

which will give you an idea of the impact of the change. If you want to buckle up and stop mutating strings, you can do this:

{$WARN IMMUTABLE_STRINGS ERROR}

Once you do that you'll need to convert code that writes to individual string elements. I suspect you'll be surprised by how little such code there is. I just compiled 600,000 lines of code and saw only 120 instances of the warning, and most of those were in third-party units. I've seen quite a stir about this change, but I honestly don't believe that very much code mutates strings. In the overwhelming majority of cases strings are built up by concatenation, or by calls to functions like `Format`. That code is not affected by this.

I don't think there are any great pitfalls. You can use `{$WARN IMMUTABLE_STRINGS ...}` to let the compiler guide you through the process. Any code that mutates strings should be converted to use `TStringBuilder`.
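
As a rough illustration of that conversion, here is a minimal sketch of a routine that would traditionally be written with `S[I] := ...`, rewritten around `TStringBuilder`. The function name `ReplaceChar` and the sample input are hypothetical; the sketch relies only on `TStringBuilder.Create`, `Length`, `Chars` and `ToString` from `System.SysUtils`, and the wrapper is a console program for the desktop compilers.

```delphi
program StringBuilderSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: returns a copy of S in which every OldCh is
// replaced by NewCh, without ever assigning to S[I].
function ReplaceChar(const S: string; OldCh, NewCh: Char): string;
var
  SB: TStringBuilder;
  I: Integer;
begin
  SB := TStringBuilder.Create(S);
  try
    // TStringBuilder.Chars is always zero-based, independent of
    // the {$ZEROBASEDSTRINGS} setting.
    for I := 0 to SB.Length - 1 do
      if SB.Chars[I] = OldCh then
        SB.Chars[I] := NewCh;
    Result := SB.ToString;
  finally
    SB.Free;
  end;
end;

begin
  Writeln(ReplaceChar('banana', 'a', 'o'));
end.
```

The builder owns a mutable buffer, so the writes never touch a `string` value, and code like this compiles cleanly with `{$WARN IMMUTABLE_STRINGS ERROR}` in effect. The cost is one object allocation per call, which is the overhead raised in the comments below.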

As for the benefits of immutability, I refer you to Why .NET String is immutable?

If you are using the traditional Windows or OS X compilers then I see no compelling reason to change. The iOS compiler is brand new. The change to immutable strings has been floated, but it may never happen. It may happen only on the mobile compilers and never on the traditional compilers. Right now, I would sit tight and wait to see how it all plays out.

David Heffernan
  • Tim Anderson is also confused as to how immutable the new immutable mobile string really is: http://www.itwriting.com/blog/7347-changes-in-the-delphi-language-for-arm-and-mobile-support.html – David Heffernan Apr 25 '13 at 11:35
  • Woah, I'm a bit afraid to jump to conclusions but this sounds like another bad idea by Delphi developers. Let's break compatibility to make strings less useful, great plan. – himself Apr 25 '13 at 12:30
  • @himself There are benefits with immutability. And in any case, it's a very low impact change. Neglecting third party libs which somebody else will deal with, I could fix my codebase in around an hour. I'd rather that some of the cruft of the language was removed, even if it means some pain on our side. – David Heffernan Apr 25 '13 at 12:49
  • @David Heffernan: Maybe for you... I can remember thousands and thousands of functions which relied on that. And anyway, what major benefits are there which outweigh the benefits of mutability? How are fast, mutable, copy-on-write built-in strings cruft? – himself Apr 25 '13 at 12:51
  • @himself Thousands and thousands? I don't believe that at all. What sort of code does `s[i] := ...`? – David Heffernan Apr 25 '13 at 13:12
  • `The main benefit of immutable data types is that they are inherently threadsafe.` With compiler-guaranteed copy-on-write semantics implemented through atomic operations, Delphi's (mutable) strings have always been threadsafe. So that's no benefit here. – Mason Wheeler Apr 25 '13 at 13:23
  • @MasonWheeler That's not quite accurate. If one thread is writing to the string whilst another thread reads, you have a race. It all depends on what you mean by threadsafe. Of course, I was vague in my answer!! – David Heffernan Apr 25 '13 at 13:29
  • @David Heffernan: Anything which uses strings as fast copy-on-write storage for one - seen tons of that. Any function which preallocates the string then populates with data. (In fact, Delphi strings have always been pre-allocating some space, but now that they're immutable I wonder if `s := s + NextChar` will be extremely slow) – himself Apr 25 '13 at 13:32
  • @himself Concatenation will be just the same as before. What code preallocates strings and then puts content in character by character? – David Heffernan Apr 25 '13 at 13:35
  • @David Heffernan: Any code which builds the string char by char and doesn't want to be slow? Rewriting the char is one operation, appending char to a string, hundreds. (And possibly, reallocation) – himself Apr 25 '13 at 13:42
  • @himself What code builds strings character by character? Whatever it is, `TStringBuilder` is probably cleaner anyway. But I can't think of a good example of such code. – David Heffernan Apr 25 '13 at 13:43
  • @David Heffernan: replChar(s,'a','b'), bintohex(s), htmlescape(s), urlencode(s), upcase(s). And TStringBuilder is a *class*. Its instantiation is going to cost more than the rest of the function combined in some cases... – himself Apr 25 '13 at 13:48
  • @himself Most of those are in the RTL and presumably they will be blessed in some way. Most other languages manage fine with immutable strings. – David Heffernan Apr 25 '13 at 13:51
  • @David Heffernan: I still don't understand why Delphi needs to drop support for something good because "other languages manage without that". – himself Apr 25 '13 at 13:56
  • @himself Because there is a point of view that mutable data types are not good – David Heffernan Apr 25 '13 at 14:13
  • @DavidHeffernan: We do (pre-allocate and then populate). Not necessarily character by character but certainly in somewhat bigger chunks. For example when exploding delimited utf-8 to fixed width utf-16 buffers... – Marjan Venema Apr 25 '13 at 18:04
  • @MarjanVenema What I cannot fathom is how you process UTF8 on the iOS platform. In a byte array? – David Heffernan Apr 25 '13 at 18:08
  • @DavidHeffernan: No clue whatsoever :-) We are not even contemplating moving to XE4 yet, let alone playing around with iOS. Was just responding to your "what code builds...?" Delimited to fixed width just happens to be utf-8 to utf-16, but it could also have been Ansi or OEM or utf-7 or whatever to utf-16, the main aspect being the delimited to fixed width. – Marjan Venema Apr 25 '13 at 18:14
  • @MarjanVenema You can move to XE4 and ignore this. My prediction is that the traditional compilers will be going strong in 10 years time! They couldn't kill `object` and `ShortString` yet. Let's walk before we try to run! – David Heffernan Apr 25 '13 at 18:21
  • @MarjanVenema UTF-8 means `TEncoding.GetBytes` and `TEncoding.GetString`. Before too long there will be no reason left to move to .net! ;-) – David Heffernan Apr 25 '13 at 18:50
  • @DavidHeffernan: yeah, I am not exactly worried. It isn't the reason we are not considering moving yet. That is simply because XE2 update 4 can't even compile our code (while update 3 can) and we haven't seen anything in XE3 or 4 that's worth our while. Perhaps with the exception of helpers for built-in types, but that in itself isn't worth shelling out for an upgrade. TEncoding is one of my friends. Spent long hours with him when we moved to D2009. Thought you meant that it would be different on iOS. – Marjan Venema Apr 25 '13 at 19:10
  • I cannot imagine a single reason in the world why they would want to take away such a common ability. Very shocked. – Jerry Dodge Apr 25 '13 at 21:01
  • @Jerry Try this: http://javarevisited.blogspot.com/2010/10/why-string-is-immutable-in-java.html?m=1 or this http://stackoverflow.com/questions/2365272/why-net-string-is-immutable – David Heffernan Apr 25 '13 at 21:07
  • @DavidHeffernan: Having worked with .NET for 3 years, I agree with Jerry. I have read the dotnet link, and still think COW strings are better. I really don't like dotNet string handling. – Fabricio Araujo May 10 '13 at 19:47