2

Coming from a Python and Javascript background, I really can't understand why updating the element of a list in Haskell isn't straightforward. I mean, why can't I write something as simple as myList[0] = "newValue"?

So, I have two questions:

  1. Why isn't it straightforward to update an element of a list in Haskell? I would appreciate some theoretical response here.
  2. How can I go about updating an element of a list? So far, my approach has been to re-create the entire list while updating the elements I am interested on. But this approach doesn't seem performant.
FTM
  • 1,887
  • 17
  • 34
  • 2
    Please limit [one question per post](https://meta.stackexchange.com/questions/222735/can-i-ask-only-one-question-per-post) – that other guy Nov 12 '21 at 21:51
  • 5
    `But this approach doesn't seem performant.`. A lot of people coming from other languages think that immutable data structures, and the consequent need to have functions that return new structures rather than update in place, will cause a big performance penalty. But it's frequently not true, because the old and new data structures can share memory. Eg updating the first element of a linked list doesn't mean creating a completely new copy, it just means creating a new first node that points to the same next node as the original. This would be a lot harder if the data structures were mutable. – Robin Zigmond Nov 12 '21 at 22:53
  • voting to close, question lacks focus. –  Nov 13 '21 at 00:57

2 Answers2

5

This doesn't have anything to do with lists. It is, by intention, not possible to update any values whatsoever at all in Haskell.

Why? Well, it's just the way the language is designed. You might as well ask why imperative languages permit updating values. I mean, you write

x = 3

and then you write

x = 4

wut? was the first one a lie then or what?? Surely, we should have been explicit about that we're referring to two different variables in time. This is just asking for bugs! Certainly it can't be worth it just to save some characters and enable some low-level optimisation that could also be achieved in other, safer ways?
...right, no?

Even in an imperative language, it actually doesn't make much sense to update values in (linked) lists – you need to traverse O (n) arguments anyway to even get to the one that should be changed. Creating a whole new list takes on average only twice as long as changing the old one, but unlike with an imperative update you never need to worry about whether something else still needed the old version because you never interfer with it anyway.

And linked lists are typically pretty slow anyway, so then worrying about the 2× factor doesn't make much sense either. However, GHC often optimises lists away completely so they're never actually built up in memory at all, and that's among the things that would be much harder for the compiler to guarantee if it had to worry about someone mutating the list somewhere else.

Updating an element in an array, that's a different story. And indeed updating values in arrays is quite common in Haskell too, for the applications where this brings an important performance gain. It is still not possible to update a value of array type, but it is possible to update a value through a reference to a monadically mutable array. This ends up looking quite similar to how you update arrays in an imperative language, though it is normally a bit verbose because all the array tooling needs to be imported from a library, rather than using built-in syntax.

import qualified Data.Vector.Mutable as VM

main :: IO ()
main = do
  myArray <- VM.generate 37 (const "this element isn't set yet")
  print =<< VM.read myArray 0
  VM.write myArray 0 "newValue"
  print =<< VM.read myArray 0      

Equivalent Python:

def main():
    myArray = ["this element isn't set yet"] * 37
    print(myArray[0])
    myArray[0] = "newValue"
    print(myArray[0])

However, very often you don't really need to update elements. In fact, we immediately see a problem here: you're indexing into an array. That means you need to ensure the index actually exists. In imperative language, this is so common that you hardly think about it, but in Haskell we really prefer total code, i.e. even if we accidentally swap two variables (like two different loop-indices) it should not give a runtime error, but preferrably a compiler one.

More often than not, if you're updating a single element, you'll also be updating other elements. In fact, very often you'll be sequentially updating all of them, and then there isn't much advantage anymore over simply building a new list from scratch that contains the updated values right away. And with that approach, there's very little that can go wrong, at least not at runtime.


There's a big caveat here: if somebody else still uses the old list, it means the garbage collector can't reclaim the memory. This is the reason why it's quite easy to get memory leaks in Haskell – IMO the single biggest problem of the language.

leftaroundabout
  • 117,950
  • 5
  • 174
  • 319
  • updating a linked list's node is _O(1)_ time and `0` aux space, when you already have the pointer to the node -- and in many algorithms, you do. copying is _O(n)_ and _O(n)_. printing (or searching through) all permutations of n-long linked list in optimal time (with "shrinking domains") is [`0` aux space with links mutation](https://stackoverflow.com/a/49907365/849891) and _O(n^2)_ aux space with copying (besides the _O(n)_ space for the _n_ nested loops / recursive calls and of course the n-long permutation itself). updating linked lists in place can be a _big_ advantage. – Will Ness Nov 13 '21 at 15:10
  • That's the type of explanation I was looking for. Thank you for the very detailed answer! – FTM Nov 14 '21 at 15:51
3

It should feel familiar: Python strings behave the same way. You can't do myString[0] = 'H', even though that's fine in other languages like C.

This makes writing code easier and helps to prevent bugs because you don't have to be constantly mindful of who modifies which string.

If you do want to "modify" a Python string? You wouldn't think twice about doing:

myString = 'H' + myString[1:]

Sure, it requires that you restructure the program a bit if you need to propagate such a change, but overall it's helpful and effective. Python does the same with numbers and tuples for the same reason.

Haskell aims to be even safer, and therefore does the same with lists. (And maps. And sets. And everything else.)

that other guy
  • 116,971
  • 11
  • 170
  • 194
  • It's not that similar though. Making some types mutable and others not barely makes anything more reliable, it just makes the semantics less consistent. Only if you _restrict_ the types for something to be immutable, do you gain something. In Python, the point of immutable types is mostly that you can use them as keys for a hash map. But I'd say this is a hack, it would be better if there was a general way of making values of arbitrary type immutable, like there is in C++. – leftaroundabout Nov 13 '21 at 15:49
  • Also, `'H' + myString[1:]` in Python is much worse than `'H' : tail myString` in Haskell. In Python it allocates a complete new string, copies most of the old contents over, and then GCs the old one. In Haskell it just prepends the original list sans head to the new letter – although this relies on the linked-list implementation, which is in most senses actually worse than Python's packed-array strings. For serious text processing, Haskell [has a dedicated library](https://hackage.haskell.org/package/text). – leftaroundabout Nov 13 '21 at 15:49