Why does Haskell's main function require IO operations?

Question

Haskell is a pure functional language, so aspects like file IO are more complicated than in most other programming languages, because state can be stored in files. To address these issues, Haskell has a lot of additional syntax and operations for working with files. I understand that need and why it is necessary. What confuses me is why this is also required for input and output from main.

A Haskell program which reads command line arguments and standard input and writes to standard output looks something like this:

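import System.Environment (getArgs, getProgName)

-- operateOn :: String -> String is assumed to be defined elsewhere.
main :: IO ()
main = do
   progName <- getProgName
   args <- getArgs
   contents <- getContents
   putStr (operateOn contents)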

This is annoying because main is not a pure function and has to go through file IO operations in order to access inputs to the program. However, it is semantically "pure" in that it takes in a progName, args, and stdin while returning the stdout content. In my opinion, it would make far more sense for Haskell to recontextualize main to take the inputs as arguments and return the output. Consider the following hypothetical function:

pureMain :: String -> [String] -> String -> String
pureMain progName args stdin = operateOn stdin

This version of main is pure and accepts its inputs directly as parameters while returning the result. The values of the inputs do not change throughout the program's execution, and stdin can be a potentially infinite, lazily evaluated string. Laziness also works for stdout, as each line can be printed individually without requiring the entire potentially infinite output to be evaluated.
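For comparison, the shape of `pureMain` can already be approximated today with a thin IO wrapper. The sketch below is only an illustration, not something the language provides: `runPureMain` is a made-up helper name, and `operateOn` is given a placeholder body (upper-casing its input) because the question leaves it undefined.

```haskell
import Data.Char (toUpper)
import System.Environment (getArgs, getProgName)

-- Placeholder for the question's undefined operateOn (an assumption).
operateOn :: String -> String
operateOn = map toUpper

-- The pure entry point proposed in the question.
pureMain :: String -> [String] -> String -> String
pureMain _progName _args stdin = operateOn stdin

-- Hypothetical helper: gathers the inputs in IO and passes them
-- to a pure function of the proposed shape.
runPureMain :: (String -> [String] -> String -> String) -> IO ()
runPureMain f = do
  progName <- getProgName
  args     <- getArgs
  -- interact reads stdin lazily and writes the function's result to stdout.
  interact (f progName args)

main :: IO ()
main = runPureMain pureMain
```

All of the effects live in `runPureMain`; `pureMain` and `operateOn` remain ordinary functions that can be tested without running the program.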

I can understand some implementation details causing stdin and stdout to require file IO, as they are technically file descriptors anyway. Things like logging also require the use of putStr, which would make this "return stdout as a string" model a little impractical.

Even with those problems, however, I see no reason why progName and args can't be passed in as arguments. For that matter, even the start time or a random number seed could be passed in as well. As long as they don't change mid-execution and the Haskell runtime provides a hook to specify a particular time or seed, the program should be completely pure and repeatable.

This idea seems too obvious not to have been tried, so I'm curious why it was not done for Haskell. Is there something I'm missing that breaks this model and requires these to be IO-aware calls?

Yes, ideas like this were tried early in Haskell's history. You might enjoy [Tackling the Awkward Squad](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/mark.pdf), a paper motivating the design of monadic IO, and also [A History of Haskell: Being Lazy with Class](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/history.pdf?from=http%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fum%2Fpeople%2Fsimonpj%2Fpapers%2Fhistory-of-haskell%2Fhistory.pdf) — luqui, Aug 30 '17 at 23:28
Why do you feel that the `IO` part should be hidden from the programmer? I very much like to have `IO actions` tagged, so that I know I have to worry about execution order and about possible IO exceptions. I see no problem in coding your functions pure (the part you named `operateOn`) and tying in the IO actions as you did in your first example. If you want to, you can use applicative notation to make it feel less imperative: `putStrLn =<< (operateOn <$> getProgName <*> getArgs <*> getContents)`. — epsilonhalbe, Aug 30 '17 at 23:29
Where does `readFile` fit in this model? What about `getHTTP`? `openGUIWindow`? `runSQLQuery`? Where do you draw the line and stop accepting more arguments to `main` when somebody comes along with another idea for a useful impure thing that should be turned pure by becoming an argument? — Daniel Wagner, Aug 30 '17 at 23:37
See also: [I/O is not a monad](http://r6.ca/blog/20110520T220201Z.html). — Daniel Wagner, Aug 30 '17 at 23:39
A *lot* of things wouldn't fit into that pattern, such as any interactive command line program. — David Young, Aug 30 '17 at 23:44
I understand that this model does not address file IO as a general concept. Reading and writing files directly would still need to be done using Haskell's existing syntax. IO actions are still necessary for those cases. My question is why the additional syntax and complexity is necessary specifically for the `main` function. I feel like it already fits the pure functional model, the Haskell committee just chose not to use it as such. — Douglas Parker, Aug 30 '17 at 23:45
@DouglasParker You cannot "take a value out of" an IO action. That is an important aspect of how purity is maintained. So, it would be impossible to use any of those things in a `main` that isn't an IO action, such as the one you are suggesting. — David Young, Aug 30 '17 at 23:46
If I'm understanding correctly, then a program which did standard file IO would require a `main` that returned an IO action. As a result, even programs which do _not_ perform file IO still need to return an IO action. Ok, I think I get why stdout wouldn't work in this model. But couldn't you still do something like `main :: String -> [String] -> String -> IO a` and still pass inputs as direct arguments? — Douglas Parker, Aug 30 '17 at 23:53
`main` doesn't return an IO action; it *is* an IO action, one that you create by composing other IO actions together. Your last question assumes that such a function would obtain *all* of its standard input before the program begins executing; that wouldn't allow the input to change in response to anything the program does *while* executing. As an example, your program couldn't ask you for a number `n`, then ask you to enter `n` words. — chepner, Aug 31 '17 at 01:53
@DouglasParker Yes, they could have done that. It would be more inconvenient when you're *not* writing I/O programs of course. You also can't pass standard input contents as a string because the program might not want to read the whole string before it does anything, and because lazy I/O is terrible. — user253751, Aug 31 '17 at 06:05
Ok, so interactive programs require input and output to be interleaved in a manner that wouldn't work in a call-return model. I see how stdin and stdout would be too limited to be practical. Is there any particular reason `progName` and `args` are applied as IO operations then? I don't see any technical reason why those values could not have been arguments to `main` directly. — Douglas Parker, Aug 31 '17 at 17:48
...well, [some C implementations](https://stackoverflow.com/q/10321435) also allow `main()` to have an `envp` parameter - should `main` in Haskell also have a parameter for that purpose? If "*yes*" then, as Daniel Wagner has already noted: where does this stop? — atravers, Sep 27 '21 at 02:42
As for a technical reason: perhaps because the original type signature for `main` was `main :: Dialogue`, not `main :: String -> [String] -> Dialogue` (or for that matter, `main :: String -> [String] -> [String] -> Dialogue` - hey, that `envp`-style parameter *could* be useful :-). — atravers, Sep 27 '21 at 03:00
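The interleaving that chepner and David Young describe in the comments can be sketched as follows. This is only an illustration, and `summarize` is a made-up helper name. The number of later prompts depends on a value read in response to the first one, which is the kind of sequencing the comments argue does not fit a model where all input is handed to `main` up front.

```haskell
import Control.Monad (replicateM)
import System.IO (hFlush, stdout)

-- Pure part: format the collected words (hypothetical helper).
summarize :: [String] -> String
summarize ws = "You entered: " ++ unwords ws

main :: IO ()
main = do
  putStr "How many words? "
  hFlush stdout              -- make sure the prompt appears before reading
  n <- readLn :: IO Int      -- the rest of the dialogue depends on this value
  ws <- replicateM n $ do
    putStr "Word: "
    hFlush stdout
    getLine
  putStrLn (summarize ws)
```

The pure logic still sits in an ordinary function; only the ordering of prompts and reads is expressed in `IO`.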
