IO
is the way how Haskell differentiates between code that is referentially transparent and code that is not. IO a
is the type of an IO action that returns an a
.
You can think of an IO action as a piece of code with some effect on the real world that waits to get executed. Because of this side effect, an IO action is not referentially transparent; therefore, execution order matters. It is the task of the main
function of a Haskell program to properly sequence and execute all IO actions. Thus, when you write a function that returns IO a
, what you are actually doing is writing a function that returns an action that eventually - when executed by main
- performs the action and returns an a
.
Some more explanation:
Referential transparency means that you can replace a function by its value. A referentially transparent function cannot have any side effects; in particular, a referentially transparent function cannot access any hardware resources like files, network, or keyboard, because the function value would depend on something else than its parameters.
Referentially transparent functions in a functional language like Haskell are like math functions (mappings between domain and codomain), much more than a sequence of imperative instructions on how to compute the function's value. Therefore, Haskell code says the compiler that a function is applied to its arguments, but it does not say that a function is called and thus actually computed.
Therefore, referentially transparent functions do not imply the order of execution. The Haskell compiler is free to evaluate functions in any way it sees fit - or not evaluate them at all if it is not necessary (called lazy evaluation). The only ordering arises from data dependencies, when one function requires the output of another function as input.
Real-world side effects are not referentially transparent. You can think of the real world as some sort of implicit global state that effectual functions mutate. Because of this state, the order of execution matters: It makes a difference if you first read from a database and then update it, or vice versa.
Haskell is a pure functional language, all its functions are referentially transparent and compilation rests on this guarantee. How, then, can we deal with effectful functions that manipulate some global real-world state and that need to be executed in a certain order? By introducing data dependency between those functions.
This is exactly what IO does: Under the hood, the IO type wraps an effectful function together with a dummy state paramter. Each IO action takes this dummy state as input and provides it as output. Passing this dummy state parameter from one IO action to the next creates a data dependency and thus tells the Haskell compiler how to properly sequence all the IO actions.
You don't see the dummy state parameter because it is hidden behind some syntactic sugar: the do
notation in main
and other IO actions, and inside the IO
type.