
I just started learning Haskell and got my first project working today. It's a small program that uses Network.HTTP.Conduit and Graphics.Rendering.Chart (haskell-chart) to plot the number of Google search results for a specific question with a changing number in it.

My problem is that simpleHttp from the http-conduit package returns its result inside a monad (I hope I understood the concept of monads right...), but I only want to use the ByteString inside of it, which contains the HTML code of the website. So until now I have used download = unsafePerformIO $ simpleHttp url so that I can use the result later without caring about the monad. I guess that's not the best way to do that.

So: is there a better solution, so that I don't have to carry the monad with me through the whole evaluation? Or would it be better to leave the result the way it is returned (inside the monad)?

Here's the full program; the line in question is in getResultCounter. If anything is coded poorly and could be done much better, please point that out too:

import System.IO.Unsafe
import Network.HTTP.Conduit (simpleHttp) 
import qualified Data.ByteString.Lazy.Char8 as L
import Graphics.Rendering.Chart.Easy
import Graphics.Rendering.Chart.Backend.Cairo

numchars :: [Char]
numchars = "1234567890"

isNum :: Char -> Bool
isNum = (\x -> x `elem` numchars) 

main = do
    putStrLn "Please input your Search (The first 'X' is going to be replaced): "
    search <- getLine
    putStrLn "X ranges from: "
    from <- getLine
    putStrLn "To: "
    to <- getLine
    putStrLn "In steps of (Only whole numbers are accepted):"
    step <- getLine
    putStrLn "Please have some patience..."
    let range = [read from,(read from + read step)..read to] :: [Int]
    let searches = map (replaceX search) range
    let res = map getResultCounter searches
    plotList search ([(zip range res)] :: [[(Int,Integer)]])
    putStrLn "Done."

-- Creates a plot from the given data
plotList name dat = toFile def (name++".png") $ do
    layout_title .= name
    plot (line "Results" dat)

-- Calls the Google-site and returns the number of results
getResultCounter :: String -> Integer
getResultCounter search = read $ filter isNum $ L.unpack parse :: Integer
    where url = "http://www.google.de/search?q=" ++ search
          download = unsafePerformIO $ simpleHttp url -- Not good
          parse = takeByteStringUntil "<"
                  $ dropByteStringUntil "id=\"resultStats\">" download

-- Drops a ByteString until the desired String is found
dropByteStringUntil :: String -> L.ByteString -> L.ByteString
dropByteStringUntil str cont = helper str cont 0
    where helper s bs n | (bs == L.empty) = L.empty
                        | (n >= length s) = bs
                        | ((s !! n) == L.head bs) = helper s (L.tail bs) (n+1)
                        | ((s !! n) /= L.head bs) = helper s (L.tail bs) 0

-- Takes a ByteString until the desired String is found
takeByteStringUntil :: String -> L.ByteString -> L.ByteString
takeByteStringUntil str cont = helper str cont 0
    where helper s bs n | bs == L.empty = bs
                        | n >= length s = L.empty
                        | s !! n == L.head bs = L.head bs `L.cons` 
                                                helper s (L.tail bs) (n + 1)
                        | s !! n /= L.head bs = L.head bs `L.cons` 
                                                helper s (L.tail bs) 0

-- Replaces the first 'X' in a string with the show value of the given value
replaceX :: (Show a) => String -> a -> String
replaceX str x | str == "" = ""
               | head str == 'X' = show x ++ tail str
               | otherwise = head str : replaceX (tail str) x
  • BTW, `Data.Char` offers `isDigit`, which is more convenient and efficient than defining your own that way. – dfeuer Jun 02 '15 at 03:40
  • Ah thanks, I didn't know that function exists. – Gunther Rocket Jun 02 '15 at 11:50
  • @GuntherRocket: even if it didn't exist, you could have just inlined it: ``getResultCounter search = read $ filter (`elem`"0123456789") $ L.unpack parse``. No need to define it on the top level! – leftaroundabout Jun 04 '15 at 21:12

1 Answer


This is a lie:

getResultCounter :: String -> Integer

The type signature above promises that the resulting integer depends only on the input string, which is not the case: Google can add or remove results between one call and the next, which changes the output.

Making the type more honest, we get

getResultCounter :: String -> IO Integer

This honestly admits it's going to interact with the external world. The code then is easily adapted to:

getResultCounter search = do
    let url = "http://www.google.de/search?q=" ++ search
    download <- simpleHttp url    -- perform IO here
    let parse = takeByteStringUntil "<" 
                      $ dropByteStringUntil "id=\"resultStats\">" download
    return (read $ filter isNum $ L.unpack parse :: Integer)

Above, I tried to preserve the original structure of the code.
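If you want to go one step further, the parsing can stay completely pure and only the download needs IO. The following is a minimal sketch under some assumptions, not a drop-in replacement: parseResultCount, breakAfter and getResultCounterWith are hypothetical names, the fetch argument stands in for an action such as simpleHttp, and Data.Char's isDigit (mentioned in the comments) replaces isNum. It also returns Maybe Integer so that a page without the marker does not crash read.

```haskell
import Data.Char (isDigit)
import qualified Data.ByteString.Lazy.Char8 as L

-- Pure part: collect the digits between the marker and the next '<'.
-- Returns Nothing when the marker is missing or no digits follow it.
parseResultCount :: L.ByteString -> Maybe Integer
parseResultCount page =
    case breakAfter marker page of
        Nothing   -> Nothing
        Just rest ->
            let digits = filter isDigit (L.unpack (L.takeWhile (/= '<') rest))
            in if null digits then Nothing else Just (read digits)
  where
    marker = L.pack "id=\"resultStats\">"

-- Drops everything up to and including the first occurrence of the marker.
breakAfter :: L.ByteString -> L.ByteString -> Maybe L.ByteString
breakAfter marker = go
  where
    go bs
        | marker `L.isPrefixOf` bs = Just (L.drop (L.length marker) bs)
        | L.null bs                = Nothing
        | otherwise                = go (L.tail bs)

-- Impure part: 'fetch' is a placeholder for an action such as simpleHttp.
getResultCounterWith :: (String -> IO L.ByteString) -> String -> IO (Maybe Integer)
getResultCounterWith fetch search =
    fmap parseResultCount (fetch ("http://www.google.de/search?q=" ++ search))
```

With http-conduit in scope, getResultCounterWith simpleHttp should have the same shape as the IO version above, and the pure parser can be tried in GHCi on any ByteString without touching the network.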

Now, in main we can no longer do

let res = map getResultCounter searches

but we can do

res <- mapM getResultCounter searches

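instead. No extra import is needed for this, since mapM is exported by the Prelude (Control.Monad exports it as well, along with relatives such as forM).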
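To see why map no longer fits, it may help to look at the types. In the sketch below, fakeCounter is a made-up stand-in for getResultCounter (it just returns the length of its argument), so the example runs without any network access.

```haskell
-- Hypothetical stand-in for getResultCounter: any String -> IO Integer works.
fakeCounter :: String -> IO Integer
fakeCounter search = return (fromIntegral (length search))

-- map only builds a list of actions; none of them has been run yet.
pending :: [IO Integer]
pending = map fakeCounter ["a", "bb", "ccc"]

-- mapM runs each action in order and collects the results in one IO action.
collected :: IO [Integer]
collected = mapM fakeCounter ["a", "bb", "ccc"]

-- sequence turns the list of pending actions into the same kind of action.
collected' :: IO [Integer]
collected' = sequence pending
```

In main, binding the result with res <- mapM getResultCounter searches gives a plain [Integer], so the rest of the code (zip range res and the plotting) can stay as it is.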

chi
  • Thanks! It works now. I had already tried something similar, but I missed mapM and thought I'd have to change all the functions that work with the ByteString. Anyway, now I understand monads a bit better. – Gunther Rocket Jun 02 '15 at 11:47