
I'm writing a compiler in Haskell, so we have a lot (or at least it seems like a lot to me) of data types and constructors, such as the following:

data DataType
    = Int | Float | Bool | Char | Range | Type
    | String Width
    | Record (Lexeme Identifier) (Seq Field) Width
    | Union  (Lexeme Identifier) (Seq Field) Width
    | Array   (Lexeme DataType) (Lexeme Expression) Width
    | UserDef (Lexeme Identifier)
    | Void | TypeError  -- For compiler use


data Statement
    -- Language
    = StNoop
    | StAssign (Lexeme Access) (Lexeme Expression)
    -- Definitions
    | StDeclaration      (Lexeme Declaration)
    | StDeclarationList  (DeclarationList Expression)
    | StStructDefinition (Lexeme DataType)
    -- Functions
    | StReturn        (Lexeme Expression)
    | StFunctionDef   (Lexeme Declaration) (Seq (Lexeme DataType))
    | StFunctionImp   (Lexeme Identifier)  (Seq (Lexeme Identifier)) StBlock
    | StProcedureCall (Lexeme Identifier)  (Seq (Lexeme Expression))
    -- I/O
    | StRead  (Seq (Lexeme Access))
    | StPrint (Seq (Lexeme Expression))
    -- Conditional
    | StIf   (Lexeme Expression) StBlock StBlock
    | StCase (Lexeme Expression) (Seq (Lexeme When))      StBlock
    -- Loops
    | StLoop     StBlock (Lexeme Expression) StBlock
    | StFor      (Lexeme Identifier) (Lexeme Expression)  StBlock
    | StBreak
    | StContinue

There are many more. You may have noticed that Lexeme a is repeated in many of the constructors.

Lexeme is the following data type:

type Position = (Int, Int)

data Lexeme a = Lex
    { lexInfo :: a
    , lexPosn :: Position
    }

It keeps the Position of each element in the program's source file, for reporting errors and warnings.

Is there an easier way to keep track of this Position information?

chamini2

2 Answers


I'm accustomed to seeing another constructor that can be used optionally to hold lexical information:

data Expression = ... all the old Exprs
                | ExprPos Position Expression

data Declaration = ... decls ...
                 | DeclPos Position Declaration

Now, in your Statement and other data types, instead of things like:

| StFor      (Lexeme Identifier) (Lexeme Expression)  StBlock

you have:

| StFor      Identifier Expression StBlock
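
As a rough sketch of how the optional position constructor might be used, here is a miniature, hypothetical Expression type. The constructors LitInt, Var and Add and the helpers exprPosn and stripPos are assumptions made up for illustration; only ExprPos follows the pattern described above.

```haskell
-- Hypothetical miniature AST; only ExprPos mirrors the pattern in this answer.
type Position = (Int, Int)

data Expression
    = LitInt Int
    | Var String
    | Add Expression Expression
    | ExprPos Position Expression  -- optional position annotation
    deriving (Show, Eq)

-- Position of the outermost annotation, if the node carries one.
exprPosn :: Expression -> Maybe Position
exprPosn (ExprPos p _) = Just p
exprPosn _             = Nothing

-- Remove every position annotation, e.g. before comparing two trees.
stripPos :: Expression -> Expression
stripPos (ExprPos _ e) = stripPos e
stripPos (Add l r)     = Add (stripPos l) (stripPos r)
stripPos e             = e
```

The parser can wrap whichever nodes it likes in ExprPos, and later passes can call something like stripPos once positions are no longer needed.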
Thomas M. DuBuisson
  • This is an interesting approach – chamini2 Sep 05 '14 at 23:04
  • +1 Yes, once you get past parsing, you don't necessarily need position information as fine-grained as what you have. With this setup, you can pick and choose when to store position info. – jpaugh Sep 06 '14 at 02:47

One could move "up" the Lexeme application:

type Access = Lexeme Access'
data Access' = ...
type Expression = Lexeme Expression'
data Expression' = ...
-- etc.
data Statement
    -- Language
    = StNoop
    | StAssign Access Expression
    -- Definitions
    | StDeclaration      Declaration
    | StDeclarationList  (DeclarationList Expression')  -- maybe you can also use Expression here?
    | StStructDefinition DataType
    ...

In this way you apply Lexeme once per type definition, instead of once per type use.
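
A small self-contained sketch of this layout might look as follows. The Lexeme type is copied from the question; the cut-down Expression' and Statement constructors, the choice of String for identifiers, and the helper stmtPosn are assumptions for illustration only.

```haskell
type Position = (Int, Int)

-- Lexeme as defined in the question.
data Lexeme a = Lex
    { lexInfo :: a
    , lexPosn :: Position
    } deriving (Show)

-- The wrapped synonyms are what the rest of the AST mentions.
-- Assumption: identifiers are plain strings.
type Identifier = Lexeme String
type Expression = Lexeme Expression'

-- Hypothetical, cut-down expression type.
data Expression'
    = LitInt Int
    | Variable Identifier
    | Add Expression Expression
    deriving (Show)

-- No explicit Lexeme applications are needed here any more.
data Statement
    = StNoop
    | StPrint Expression
    | StFor Identifier Expression [Statement]
    deriving (Show)

-- Position of the first positioned component of a statement, if any.
stmtPosn :: Statement -> Maybe Position
stmtPosn StNoop        = Nothing
stmtPosn (StPrint e)   = Just (lexPosn e)
stmtPosn (StFor i _ _) = Just (lexPosn i)
```

Every Expression still carries its position, but the Statement constructors no longer spell out Lexeme at each use.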

chi