I took the example below partially from SO and changed it to my needs. It almost fits, but what I want to do is that always the first string in the commaSep expr is parsed as identifier whilst all subsequent strings should be strings only.
Currently they are all parsed as Identifiers.
*Parser> parse expr "" "rd (isFib, test2, 100.1, ?BOOL)"
Right (FuncCall "rd" [Identifier "isFib",Identifier "test2",Number 100.1,Query "?BOOL"])
I have tried a number of solutions that in the end all would break down to parsing the whole input without using commaSep. Means I would have to ignore the structure and do something like
expr_parse = do
name <- resvd_cmd
char '('
skipMany space
worker <- ident
char ','
skipMany1 space
args <- commaSep expr --not fully worked this out yet
query <- theQuery
skipMany space
char ')'
return (name, worker, args, query)
that looks less optimal and very clunky to me. Is there any way to refactor expr
in the code below, achive what I need and keep it simple?
module Parser where
import Control.Monad (liftM)
import Text.Parsec
import Text.Parsec.String (Parser)
import Lexer
import AST
expr = ident <|> astring <|> number <|> theQuery <|> callOrIdent
astring = liftM String stringLiteral <?> "String"
number = liftM Number float <?> "Number"
ident = liftM Identifier identifier <?> "WorkerName"
questionm :: Parser Char
questionm = oneOf "?"
theQuery :: Parser AST
theQuery = do first <- questionm
rest <- many1 letter
let query = first:rest
return ( Query query )
resvd_cmd = do { reserved "rd"; return ("rd") }
<|> do { reserved "eval"; return ("eval") }
<|> do { reserved "read"; return ("read") }
<|> do { reserved "in"; return ("in") }
<|> do { reserved "out"; return ("out") }
<?> "LINDA-like Tuple"
callOrIdent = do
name <- resvd_cmd
liftM (FuncCall name)(parens $ commaSep expr) <|> return (Identifier name)
AST.hs
{-# LANGUAGE DeriveDataTypeable #-}
module AST where
import Data.Typeable
data AST
= Number Double
| Identifier String
| String String
| FuncCall String [AST]
| Query String
deriving (Show, Eq, Typeable)
Lexer.hs
module Lexer (
identifier, reserved, operator, reservedOp, charLiteral, stringLiteral,
natural, integer, float, naturalOrFloat, decimal, hexadecimal, octal,
symbol, lexeme, whiteSpace, parens, braces, angles, brackets, semi,
comma, colon, dot, semiSep, semiSep1, commaSep, commaSep1
)where
import Text.Parsec
import qualified Text.Parsec.Token as P
import Text.Parsec.Language (haskellStyle)
lexer = P.makeTokenParser ( haskellStyle
{P.reservedNames = ["rd", "in", "out", "eval", "take"]}
)
identifier = P.identifier lexer
reserved = P.reserved lexer
operator = P.operator lexer
reservedOp = P.reservedOp lexer
charLiteral = P.charLiteral lexer
stringLiteral = P.stringLiteral lexer
natural = P.natural lexer
integer = P.integer lexer
float = P.float lexer
naturalOrFloat = P.naturalOrFloat lexer
decimal = P.decimal lexer
hexadecimal = P.hexadecimal lexer
octal = P.octal lexer
symbol = P.symbol lexer
lexeme = P.lexeme lexer
whiteSpace = P.whiteSpace lexer
parens = P.parens lexer
braces = P.braces lexer
angles = P.angles lexer
brackets = P.brackets lexer
semi = P.semi lexer
comma = P.comma lexer
colon = P.colon lexer
dot = P.dot lexer
semiSep = P.semiSep lexer
semiSep1 = P.semiSep1 lexer
commaSep = P.commaSep lexer
commaSep1 = P.commaSep1 lexer