4

I am doing semantic analysis for an experimental language. I am using Alex and Happy to generate the lexer and the parser (actually I am using BNFC tool to generate Alex and Happy files). I wanted to get an error message with the line number and the column number whenever there is a semantic error, say, a type error.

It seems that I would have to store the line number information while building my symbol table or the AST. My problem would would be solved if I can somehow have access to the position information in the the rules sections of the Happy file.

Any suggestions in this regard would be highly appreciated.

I tried implementing the answer suggested below but unfortunately have not had any success with this. Lets consider for a very simple grammar :-

Expr -> Expr + Term
       | Term
Term -> Int

My lexer for this looks like below.

%wrapper "posn"

$digit = 0-9            -- digits
$alpha = [a-zA-Z]       -- alphabetic characters

tokens :-

  $white+               ;
  "--".*                ;
  $digit+               { \p s -> L {getPos = p , unPos = Tok_Int (read s) }}
  \+                    { \p s -> L {getPos = p , unPos = Tok_Plus} }


{
data L a = L{ getPos :: AlexPosn, unPos :: a } deriving (Eq,Show)

data Token =
      Tok_Plus 
    | Tok_Int Int 
    deriving (Eq,Show)


getToken :: IO [L Token]
getToken = do 
    args <- getArgs
    case length args == 0 of
        True  -> do 
               error $ "\n****************Error: Expecting file name as an argument.\n" 
        False -> do
            let fname  = args !! 0 
            conts <- readFile fname
            let tokens = alexScanTokens conts 
            return tokens 

}

My Yacc file is as under and this is where I am struggling. How to embed the position information in my syntax tree.

{
{-# OPTIONS_GHC -fno-warn-incomplete-patterns -fno-warn-overlapping-patterns #-}
module Parser where
import Lexer

}

%name pExpr Exp 
%name pTerm Term 

%tokentype {L Token}
%error { parseError }

%token
      int             { L { getPos = _,unPos = Tok_Int $$ } }
      '+'             { L { getPos = _,unPos = Tok_Plus } }

%%
Exp :: {L Expr} 
Exp  : Exp '+' Term           { L { getPos =  getPos $1 , unPos = EAdd (unPos $1) (unPos $3) } }
     | Term                   { $1 }

Term :: {L Expr}
Term : int                   { L {getPos =  getPos $1, unPos =  EInt (unPos $1) } } 

{

data Expr =  EAdd Expr Expr 
            | EInt Int 
            deriving (Eq,Show)


returnM :: a -> Err a
returnM = return

thenM :: Err a -> (a -> Err b) -> Err b
thenM = (>>=)


parseError :: [L Token] -> a
parseError _ = error "Parse error"

}

I get the following type errors when trying to compile the generated Haskell file.

Parser.hs:109:39:
    Couldn't match expected type `L a0' with actual type `Int'
    In the first argument of `getPos', namely `happy_var_1'
    In the `getPos' field of a record
    In the first argument of `HappyAbsSyn5', namely
      `(L {getPos = getPos happy_var_1,
           unPos = EInt (unPos happy_var_1)})'

Parser.hs:109:73:
    Couldn't match expected type `L Int' with actual type `Int'
    In the first argument of `unPos', namely `happy_var_1'
    In the first argument of `EInt', namely `(unPos happy_var_1)'
    In the `unPos' field of a record

Can you guys suggest me how to get this thing working?

user5803465
  • 343
  • 2
  • 7

1 Answers1

6

You can have access to the position information in a Happy rule, if they are available in your lexer output. This is exactly how e.g. GHC itself puts SrcLocs into its own internal representation of Haskell code.

Basically, you would use the posn Alex wrapper to inject position information into your token type:

data L a = L{ getPos :: AlexPosn, unPos :: a }

(so your Alex tokenizer would return L Token values); and then you combine the individual token positions in your Happy rule into a position for the nonterminal (so you could, for example, have a rule from Expr + Expr to L (combinedPosn [getPos $1, getPos $2, getPos $3] $ PlusExpr (unPos $1) (unPos $3).

Cactus
  • 27,075
  • 9
  • 69
  • 149