51

I have just recently started learning Haskell and I am having a lot of trouble trying to figure out how file reading works.

For example, I have a text file "test.txt" containing lines with numbers:

32 4
2 30
300 5

I want to read each line and then evaluate each word and add them.

Thus, I am trying to do something like this:

import System.IO
import Control.Monad

main = do
        let list = []
        handle <- openFile "test.txt" ReadMode
        contents <- hGetContents handle
        singlewords <- (words contents)
        list <- f singlewords
        print list
        hClose handle

f :: [String] -> [Int]
f = map read

I know this is completely wrong, but I don't know how to use the syntax correctly at all.

Any help will be greatly appreciated as well as links to good tutorials that have examples and explanation of code except this one which I have read fully.

pigrammer
DustBunny

3 Answers

88

Not a bad start! The only thing to remember is that the result of a pure function application is named with let, while the binding <- is reserved for the results of IO actions.

import System.IO  
import Control.Monad

main = do  
        let list = []
        handle <- openFile "test.txt" ReadMode
        contents <- hGetContents handle
        let singlewords = words contents
            list = f singlewords
        print list
        hClose handle   

f :: [String] -> [Int]
f = map read

This is the minimal change needed to get the thing to compile and run. Stylistically, I have a few comments:

  1. Binding list twice looks a bit shady. Note that this isn't mutating the value list -- it's instead shadowing the old definition.
  2. Inline pure function applications more often, rather than naming every intermediate value.
  3. When possible, using readFile is preferable to manually opening, reading, and closing a file.

Implementing these changes gives something like this:

main = do  
        contents <- readFile "test.txt"
        print . map readInt . words $ contents
-- alternately, main = print . map readInt . words =<< readFile "test.txt"

readInt :: String -> Int
readInt = read
Daniel Wagner
  • Oh wow, thank you :) But I didn't need to print the list; I actually want to keep it as a list, because I was going to add each line together and get the total. Thank you very much for your help! – DustBunny Oct 23 '11 at 17:25
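For the follow-up in the comment above (summing each line and then getting a total), a minimal sketch might look like the following. It assumes "test.txt" holds only whitespace-separated integers, as in the question; `lineSums` is a hypothetical helper name, and `read` will fail at runtime on any token that is not a number.

```haskell
-- Sum the numbers on each line; the result stays an ordinary list.
-- Assumption: every token is a valid integer (read is partial).
lineSums :: String -> [Int]
lineSums = map (sum . map read . words) . lines

main :: IO ()
main = do
    contents <- readFile "test.txt"
    let sums = lineSums contents   -- pure value, so let rather than <-
    print sums                     -- per-line sums
    print (sum sums)               -- total over all lines
```

With the three lines from the question, the per-line sums are 36, 32, and 305, and the total is 373. The list `sums` remains available for any further pure processing.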
15

Daniel Wagner's solution is a great one. Here is another swing at it so you can get some more ideas about efficient file handling.

{-#  LANGUAGE OverloadedStrings #-}
import System.IO
import qualified Data.ByteString.Lazy.Char8 as B
import Control.Applicative
import Data.List

sumNums :: B.ByteString -> Int
sumNums s = foldl' sumStrs 0 $ B.split ' ' s

sumStrs :: Int -> B.ByteString -> Int
sumStrs m i = m+int
              where Just(int,_) = B.readInt i

main = do 
  sums <- map sumNums <$> B.lines <$> B.readFile "testy"
  print sums

First, you'll see the OverloadedStrings pragma. This allows us to use ordinary quoted string literals where a ByteString is expected. We will be using lazy ByteStrings for processing the file for several reasons. First, it allows us to stream the file through the program rather than forcing it all into memory at once. Also, ByteStrings are generally faster and more memory-efficient than Strings.

Everything else is fairly straightforward. We readFile the file, split it into a lazy list of lines, and then map a summing function over each line. The <$> operators are just a shortcut that lets us operate on the value inside the IO functor; if this is too much, I apologize. I just mean that when you readFile you don't get back a ByteString, you get back a ByteString wrapped in IO: an IO ByteString. The <$> says, "Hey, I want to operate on the thing inside the IO and then wrap it back up."
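To make the role of <$> concrete, here is a small sketch showing that it does the same job as binding in a do block and then applying a pure function. The names `countLines`, `countWithDo`, and `countWithFmap` are made up for illustration, and the file name "test.txt" is taken from the question rather than from this answer.

```haskell
-- A pure function: no IO involved.
countLines :: String -> Int
countLines = length . lines

-- With do-notation: bind the file contents, then apply the pure function.
countWithDo :: FilePath -> IO Int
countWithDo path = do
    contents <- readFile path
    return (countLines contents)

-- With <$> (infix fmap): apply the pure function inside IO directly.
countWithFmap :: FilePath -> IO Int
countWithFmap path = countLines <$> readFile path

main :: IO ()
main = do
    n <- countWithDo "test.txt"
    m <- countWithFmap "test.txt"
    print (n == m)   -- both versions compute the same count
```

Both functions have the type FilePath -> IO Int; the <$> version simply skips naming the intermediate `contents`.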

B.split ' ' separates each line into numbers at every space character (we could also use B.words for this). The only other interesting part is that in sumStrs we use pattern matching to extract the first component of the pair inside the Just returned by the readInt function. Note that this pattern is partial: if readInt returns Nothing, the program fails at runtime.
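If the input might contain tokens that are not numbers, one possible variation is to skip them instead of pattern matching on Just directly. This is only a sketch, not the code from this answer: `parseInts` and `sumLine` are hypothetical names, and the file name "test.txt" comes from the question.

```haskell
import qualified Data.ByteString.Lazy.Char8 as B
import Data.Maybe (mapMaybe)

-- Keep the integer from every token that B.readInt can parse.
-- Tokens that do not start with an integer yield Nothing and are skipped.
-- Caveat: B.readInt accepts a leading number, so "12abc" is read as 12.
parseInts :: B.ByteString -> [Int]
parseInts = map fst . mapMaybe B.readInt . B.words

-- Sum the integers found on a single line.
sumLine :: B.ByteString -> Int
sumLine = sum . parseInts

main :: IO ()
main = do
    sums <- map sumLine . B.lines <$> B.readFile "test.txt"
    print sums
```

Whether silently skipping bad tokens is appropriate depends on the data; reporting them as errors may be the better choice in a real program.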

I hope this was helpful. Ask if you have any questions.

Erik Hinton
  • 3
    Thank you! Your way has a lot more syntax that I'm not familiar with, but I will use it for reference when I get a chance to expand my knowledge for Haskell. (just starting) :) – DustBunny Oct 23 '11 at 17:37
1

For all you non-functional programmers out there here is a treat

unsafePerformIO . readFile $ "file.txt"

Reads a file into a string

No IO String, just a normal fully loaded string ready for use. This might not be the right way, but it works, and there is no need to change your existing functions to suit IO String.

P.S. Don't forget the import:

import System.IO.Unsafe 
Kapytanhook
  • 3
    Thanks. Makes experimenting on GHCi much easier. – Daniel C. Sobral Aug 22 '13 at 19:33
  • 7
    "no need to change your existing functions to suit `IO String`" there's no such need anyway. Any pure function should be kept pure regardless, and if you use it with `IO` then you just lift it into that monad with `fmap` (or a do block with `x <- someIOAction`). – leftaroundabout Jul 28 '14 at 12:59
  • 22
    Yes, because the `unsafe` shouldn't ring any bells :) This is a bad idea. In GHCi you can use `s <- readFile "file.txt"` to get the contents in `s`, no need for `unsafe*` functions. – Mihai Maruseac Jul 28 '14 at 13:08
  • 8
    I think it is VERY BAD advice for people learning Haskell to introduce them to `unsafePerformIO`. I have worked with Haskell for more than a decade and have never seen a use of `unsafePerformIO` in an ordinary (i.e. not system-level) program. Beginners should first master the core of Haskell before venturing to `unsafe`. – user855443 May 05 '17 at 10:21
  • 5
    I agree, I never claimed this to be a good solution and I agree you should start with the core. When I wrote this I was still learning Haskell, I needed to get this done and this was a solution that worked for me. readfile returns IO and I wanted a properly loaded string. I should have done it right but I didn't. And now I'm helping some people out that should do it the right way but don't want to. But for anyone else that wants a few up-votes, just leave another comment reminding people how VERY BAD AND EVIL this is. – Kapytanhook May 11 '17 at 14:24
  • 1
    @MarcoFaustinelli If there is a better and safer solution why not use it instead? – Dharman Oct 02 '19 at 19:08
  • 1
    @Dharman - the words of Kapytanhook explain perfectly why. Anyhow, as you can see the haskell police found it suitable to remove my comment altogether. Words like "patronizing" and "sanctimonious" aren't allowed in Newspeak. – Marco Faustinelli Oct 03 '19 at 06:38
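The alternative suggested in the comments above (keep the existing function pure and bind the file contents inside IO) could be sketched like this. The function `sumWords` is a hypothetical stand-in for "your existing function", and "file.txt" is assumed to contain only whitespace-separated integers.

```haskell
-- An existing pure function: it takes a plain String and stays unchanged.
sumWords :: String -> Int
sumWords = sum . map read . words

main :: IO ()
main = do
    -- Bind the contents inside IO; s is an ordinary String from here on.
    s <- readFile "file.txt"
    print (sumWords s)
```

The same `s <- readFile "file.txt"` line also works at the GHCi prompt, as one of the comments points out.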