Questions tagged [ocamllex]

ocamllex is a lexer generator for OCaml.

ocamllex is the lexer generator for OCaml. It is modeled on lex, the classic Unix lexer generator (today most often encountered as its free reimplementation, flex).

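It is used to build the OCaml compiler itself and therefore ships with the official OCaml distribution. Semantic actions are ordinary OCaml code, and the generated lexers operate on the standard library's Lexing.lexbuf buffers, so they plug directly into parsers produced by ocamlyacc or Menhir.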
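
As an illustration of what an ocamllex specification looks like, here is a minimal sketch. The file name lexer.mll, the token type, and the helper tokens_of_string are all assumptions made up for this example; a real project would usually take its token type from the parser module instead.

```ocaml
(* lexer.mll (hypothetical file name); generate lexer.ml with:
     ocamllex lexer.mll *)
{
  (* Header: plain OCaml copied verbatim into the generated file.
     This token type is an assumption made for the sketch. *)
  type token =
    | INT of int
    | IDENT of string
    | PLUS
    | EOF
}

(* Named regular expressions *)
let digit = ['0'-'9']
let alpha = ['a'-'z' 'A'-'Z' '_']

(* Entry point: becomes a function  token : Lexing.lexbuf -> token *)
rule token = parse
  | [' ' '\t' '\n']+             { token lexbuf }  (* skip whitespace *)
  | digit+ as n                  { INT (int_of_string n) }
  | alpha (alpha | digit)* as id { IDENT id }
  | '+'                          { PLUS }
  | eof                          { EOF }
  | _ as c
      { failwith (Printf.sprintf "unexpected character %C" c) }

{
  (* Trailer: hypothetical helper that lexes a whole string,
     returning every token up to and including EOF. *)
  let tokens_of_string s =
    let lexbuf = Lexing.from_string s in
    let rec loop acc =
      match token lexbuf with
      | EOF -> List.rev (EOF :: acc)
      | t -> loop (t :: acc)
    in
    loop []
}
```

With these definitions, tokens_of_string "x + 42" should yield [IDENT "x"; PLUS; INT 42; EOF]. When several rules match, ocamllex picks the longest match and, among equally long matches, the rule listed first.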

For simple needs, the Genlex module of the standard library provides a ready-made generic lexer that can lex files with an OCaml-like syntax ((* ... *) for comments, the same escape sequences in strings, etc.).

94 questions
7
votes
1 answer

Using Ocamllex for lexing strings (The Tiger Compiler)

I'm trying to follow Appel's "Modern Compiler Implementation in ML" and am writing the lexer using Ocamllex. The specification asks for the lexer to return strings after translating escape sequences. The following code is an excerpt from the…
nimrodm
  • 23,081
  • 7
  • 58
  • 59
7
votes
1 answer

Return multiple tokens in ocamllex

Is there any way to return multiple tokens in OCamlLex? I'm trying to write a lexer and parser for an indentation-based language, and I would like my lexer to return multiple DEDENT tokens when it notices that the indentation level is less than it…
Joe Bloggs
  • 571
  • 3
  • 6
  • 14
7
votes
1 answer

Is it possible in ocamllex to define a rule that looks ahead at the next character without consuming it?

I'm using ocamllex to write a lexer for a scripting language but I am facing a conflict with my rule for comments. I want to allow my command arguments to be unquoted as long as they only contain alphanumeric characters and slashes "/". For…
hugomg
  • 68,213
  • 24
  • 160
  • 246
6
votes
1 answer

Lua long strings in fslex

I've been working on a Lua fslex lexer in my spare time, using the ocamllex manual as a reference. I hit a few snags while trying to tokenize long strings correctly. "Long strings" are delimited by '[' ('=')* '[' and ']' ('=')* ']' tokens; the…
raine
  • 817
  • 1
  • 15
  • 26
6
votes
1 answer

multiple error reporting with menhir: which token?

I am writing a small parser with Menhir + Ocamllex and I have two requirements I cannot seem to meet at the same time: (1) I would like to keep parsing after an error (to report more errors); (2) I would like to print the token at which the error…
orm
  • 2,835
  • 2
  • 22
  • 35
5
votes
1 answer

OCamllex matching beginning of line?

I am messing around writing a toy programming language in OCaml with ocamllex, and was trying to make the language sensitive to indentation changes, Python-style, but am having a problem matching the beginning of a line with ocamllex's regex rules. …
user35288
5
votes
2 answers

OCamlLex case-insensitive

Is there a way to have case-insensitive tokens in an ocamllex specification? I already tried to make a case-insensitive token this way: let token = parser ... | ['C''c']['A''a']['S''s']['E''e'] { CASE } ... but I'm searching for something…
5
votes
2 answers

How can I match strings using "match with" and regex in OCaml?

My OCaml .ml code looks like this: open Str let idregex = Str.regexp ['a'-'z' 'A'-'Z']+ ['a'-'z' 'A'-'Z' '0'-'9' '_']*; let evalT (x,y) = (match x with Str.regexp "Id(" (idregex as var) ")" -> (x,y) Why does the above code not work? How can…
P.C.
  • 651
  • 13
  • 30
4
votes
4 answers

OCaml lex: doesn't work at all, whatsoever

I am at the end of my rope here. I cannot get anything to work in ocamllex, and it is driving me nuts. This is my .mll file: { open Parser } rule next = parse | (['a'-'z'] ['a'-'z']*) as id { Identifier id } | '=' { EqualsSign } | ';' {…
marsolk
  • 913
  • 1
  • 8
  • 10
4
votes
2 answers

Ocamllex - What is the difference between characters (#)?

ocamllex has an operator, #, described as the difference between two characters or character sets. There is a notion here I don't understand: the difference between characters. What does the difference between characters mean? So if…
afk
  • 563
  • 2
  • 5
  • 12
3
votes
1 answer

Translation from Python to CIL (C Intermediate Language)

I have recently been working on static analysis of Python source code. Our group already has a static analyzer for CIL (C Intermediate Language), written in OCaml. We want to reuse this analyzer, so our ideal approach is to translate Python to…
Yao
  • 31
  • 2
3
votes
2 answers

Make a table containing tokens visible for both .mly and .mll by menhir

I would like to define a keyword_table which maps some strings to some tokens, and I would like to make this table visible for both parser.mly and lexer.mll. It seems that the table has to be defined in parser.mly, %{ open Utility (* where…
SoftTimur
  • 5,630
  • 38
  • 140
  • 292
3
votes
1 answer

Example of grammar (lex/yacc) for tree description

I want to parse a tree from a file that describes it (the tree is actually a taxonomy). I am looking for example grammars (ideally lex/yacc files) for describing trees. It would be better if the described trees were not…
3
votes
2 answers

Using ocamllex/ocamlyacc to parse part of a grammar

I've been using regexes to go through a pile of Verilog files and pull out certain statements. Currently, regexes are fine for this. However, I'm starting to get to the point where a real parser is going to be needed in order to deal with nested…
aneccodeal
  • 8,531
  • 7
  • 45
  • 74
2
votes
1 answer

Specifying ocamllex encoding

I'm currently developing a parser according to a specification, and I'm completely unable to find anywhere in the docs information about text encoding. It sounds weird to me that the docs of a lexing library wouldn't mention text encoding at all, so…
MrAnima
  • 555
  • 1
  • 4
  • 13